Expected Findings
- 500 responses include the full prompt the model received.
- Error payload exposes retrieved RAG chunks tagged with their tenant.
- Stack trace identifies internal model and routing endpoints.
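A captured 500 body can be checked for the findings above with a small script. The following is only a sketch: `findLeaks` is a hypothetical helper, the `prompt`, `context`, and `error` keys are taken from the handler shown in this finding, and the `tenant` field on each chunk and the `stack` key are assumptions about the payload shape.

```javascript
// Hypothetical checker: returns the findings present in a parsed 500 body.
function findLeaks(body) {
  const findings = [];

  // Finding 1: the rendered prompt is echoed back to the caller.
  if (typeof body.prompt === "string" && body.prompt.length > 0) {
    findings.push("prompt");
  }

  // Finding 2: retrieved chunks are returned with their tenant tag.
  // Assumption: each chunk stores its tenant in a `tenant` field.
  if (Array.isArray(body.context) &&
      body.context.some((c) => c && c.tenant !== undefined)) {
    findings.push("tenant-tagged chunks");
  }

  // Finding 3: Node-style stack frames, e.g. "    at fn (file.js:10:5)".
  const text = [body.error, body.stack]
    .filter((v) => typeof v === "string")
    .join("\n");
  if (/^\s+at .+:\d+:\d+\)?$/m.test(text)) {
    findings.push("stack trace");
  }

  return findings;
}
```

An empty result means none of the three indicators were seen in that body; it does not prove the endpoint is free of other leaks.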
When the assistant fails with a 500, the error body includes the rendered prompt and the retrieved context for debugging, which leaks the system prompt and any retrieved chunks to the caller.
CWE-209, CWE-200
return res.status(500).json({ error: e.message, prompt: renderedPrompt, context: chunks })
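One possible remediation, sketched under assumptions, keeps the debugging detail on the server and returns only a generic message plus a correlation ID. `buildSafeError` and the `errorId` field are hypothetical names introduced for illustration, and the `log` parameter stands in for whatever access-controlled logger the service actually uses.

```javascript
const { randomUUID } = require("node:crypto");

// Hypothetical helper: builds the client-facing 500 response and keeps
// diagnostic detail on the server side only.
function buildSafeError(e, renderedPrompt, chunks, log = console.error) {
  const errorId = randomUUID();

  // Full detail goes to the server log, keyed by errorId so that an
  // operator can look it up. Assumption: the log store is access-controlled.
  log({
    errorId,
    message: e.message,
    stack: e.stack,
    prompt: renderedPrompt,
    context: chunks,
  });

  // The caller sees only a generic message and the correlation ID.
  return { status: 500, body: { error: "Internal server error", errorId } };
}

// In an Express-style handler (res is assumed to expose status().json()):
//   const { status, body } = buildSafeError(e, renderedPrompt, chunks);
//   return res.status(status).json(body);
```

Replacing `e.message` with a fixed string matters here as well, since upstream error messages may themselves name internal model or routing endpoints.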