Why API errors should never leak a stack trace

Every endpoint has two outputs you design carefully, the success response and the validation error, and a third you usually do not design at all: what happens when something throws. That third output is where most APIs quietly hand attackers a head start, because the default behaviour of almost every web framework is to be helpful when it fails. Helpful to the developer, and equally helpful to anyone probing your service.

A stack trace is not a security breach by itself. It is something more useful to an attacker than that: a map. It turns blind probing into targeted work, and it is handed over for free, on a route that often is not even authenticated.

A stack trace is an architecture diagram you did not mean to publish — framework, versions, file paths, and the shape of your data, all in one response body.

What a stack trace actually gives away

Consider a single unhandled exception in a typical Python service. The response body an attacker receives might look like this:

HTTP 500 · response body
Traceback (most recent call last):
  File "/opt/app/api/orders.py", line 218, in get_order
    row = db.fetchone(query, (order_id,))
  File "/usr/local/lib/python3.11/site-packages/psycopg2/extras.py", line 144
psycopg2.errors.UndefinedColumn: column "totl" does not exist
LINE 1: SELECT id, totl, status FROM orders WHERE user_id = ...

Read it the way an attacker does. Before they have sent a second request, they now know: you run Python 3.11; you use PostgreSQL through psycopg2; your code lives under /opt/app and is organised by resource (api/orders.py); there is an orders table with at least id and status columns, plus a column the code misspells as totl (most likely total); and queries are scoped by user_id. That last detail is the interesting one: it tells them authorisation is enforced in the query, which is exactly the thing they will now try to bypass.

Every line is reconnaissance. Framework and language versions map to published CVEs. File paths reveal your deploy layout and let an attacker guess other routes. The query fragment leaks your schema. The frame names describe your control flow. None of it required a vulnerability — you printed it.
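To make that reading concrete, here is a minimal sketch of how little effort the extraction takes. It runs a handful of regular expressions over the example body above; the helper name extract_recon and the patterns are illustrative assumptions, not a real tool.

```python
import re

# The response body from the example above, as an attacker would receive it.
LEAKED_BODY = '''Traceback (most recent call last):
  File "/opt/app/api/orders.py", line 218, in get_order
    row = db.fetchone(query, (order_id,))
  File "/usr/local/lib/python3.11/site-packages/psycopg2/extras.py", line 144
psycopg2.errors.UndefinedColumn: column "totl" does not exist
LINE 1: SELECT id, totl, status FROM orders WHERE user_id = ...
'''


def extract_recon(body: str) -> dict:
    """Collect the facts a reader can lift from a leaked traceback (hypothetical helper)."""
    paths = re.findall(r'File "([^"]+)"', body)
    versions = re.findall(r"python(\d+\.\d+)", body)
    packages = re.findall(r"site-packages/([A-Za-z0-9_]+)/", body)
    tables = re.findall(r"\bFROM\s+([A-Za-z_][A-Za-z0-9_]*)", body)
    select = re.search(r"\bSELECT\s+(.+?)\s+FROM\b", body)
    columns = [c.strip() for c in select.group(1).split(",")] if select else []
    return {
        "python_versions": sorted(set(versions)),
        "packages": sorted(set(packages)),
        # Paths outside site-packages describe the application's own layout.
        "app_paths": [p for p in paths if "site-packages" not in p],
        "tables": sorted(set(tables)),
        "columns": columns,
    }


recon = extract_recon(LEAKED_BODY)
for key, value in recon.items():
    print(f"{key}: {value}")
```

Nothing here is clever, which is the point: one response body and five patterns recover the runtime, the driver, the deploy path, a table, and three column names.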

From traceback to exploit

The danger is not the single error; it is what the error enables. A leaked version number is matched against known advisories for that exact release. A file path like /opt/app/api/orders.py implies api/users.py, api/admin.py, and a dozen other endpoints worth probing. A leaked column name turns a blind SQL injection attempt into a targeted one, because the attacker no longer has to guess your schema — you told them.

This is why error messages that expose stack traces are called out under Security Misconfiguration in the OWASP API Security Top 10 even though, on its own, a 500 page does nothing. It is a force multiplier. It converts a slow, noisy, easily-detected campaign of guessing into a quiet, precise sequence of requests that look almost legitimate. The error response is the difference between an attacker working in the dark and an attacker working from your source tree.

Why frameworks leak by default

The single worst offender is debug mode. Flask’s interactive debugger, Django’s yellow error page, and their equivalents are built to expose everything — local variables at each frame, the full source context, sometimes an interactive console. They are superb in development and catastrophic in production. Debug mode left on is not a leak; it is an open door.

But turning debug off is not the end of it, and this is the part teams miss. With debug disabled you stop shipping the interactive debugger, yet the application code underneath often still leaks, because somewhere a handler does the convenient thing:

Python · the convenient mistake
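import traceback

from flask import Flask, jsonify

app = Flask(__name__)

# Looks harmless. Leaks everything.
@app.errorhandler(Exception)
def handle(e):
    return jsonify({
        "error": str(e),                  # raw exception message
        "type": type(e).__name__,         # exception class name
        "trace": traceback.format_exc()   # full traceback, paths included
    }), 500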

The intent is obvious and well-meaning — someone wanted useful errors during integration and never took it out. The result is that the response now carries the exception class, its message, and the entire traceback, with debug mode having nothing to do with it. The leak is your code, not the framework.

Two separate jobs. Disabling debug mode stops the framework from volunteering internals. Writing a disciplined error handler stops your own code from doing the same. You need both; doing one and assuming you are covered is the common failure.
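The first of those two jobs can be enforced at startup rather than left to memory. The sketch below refuses to boot when debug is switched on in production. FLASK_DEBUG is the variable Flask's CLI reads; APP_ENV, UnsafeConfigError, and assert_debug_off are hypothetical names chosen for this example, and treating a missing APP_ENV as production is a deliberate fail-safe assumption.

```python
import os


class UnsafeConfigError(RuntimeError):
    """Raised when production settings would expose debug output."""


def _truthy(value: str) -> bool:
    return value.strip().lower() in {"1", "true", "yes", "on"}


def assert_debug_off(env=None) -> None:
    """Refuse to start if debug is enabled while APP_ENV is 'production'.

    APP_ENV is a hypothetical variable for this sketch. An unset APP_ENV is
    treated as production so that a missing setting fails closed.
    """
    env = os.environ if env is None else env
    is_production = env.get("APP_ENV", "production").lower() == "production"
    debug_on = _truthy(env.get("FLASK_DEBUG", "0"))
    if is_production and debug_on:
        raise UnsafeConfigError("debug mode must be disabled in production")


# Call once, before the application object starts serving requests.
assert_debug_off({"APP_ENV": "development", "FLASK_DEBUG": "1"})  # allowed
```

This only covers the framework side. It does nothing about a handler that serialises the traceback itself, which is the second job.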

One handler, a generic body, a correlation ID

The fix is a single application-level handler with one rule: log everything on the server, return almost nothing to the client. The only thing that should cross the boundary is a short, opaque identifier that lets you find the full detail in your own logs later.

Python · the handler that says nothing useful
import logging
import uuid

from flask import Flask, jsonify, request
from werkzeug.exceptions import HTTPException

app = Flask(__name__)
log = logging.getLogger(__name__)

@app.errorhandler(Exception)
def handle_exception(e):
    # Let intentional HTTP errors (404, 405, redirects) pass through untouched
    if isinstance(e, HTTPException):
        return e

    # Anything else is genuinely unexpected: log it fully, return only an ID
    request_id = uuid.uuid4().hex[:12]
    # log.exception records the message plus the full traceback, server-side only
    log.exception("Unhandled exception [%s] on %s %s",
                  request_id, request.method, request.path)
    return jsonify({
        "error": "Internal server error",
        "request_id": request_id
    }), 500

The client receives a fixed string and a random twelve-character ID. That ID appears verbatim in the server log alongside the full traceback, the route, the method, and whatever request context you choose to record. When a user reports a problem, support asks for the ID and reads the complete story internally — without a single byte of that story ever having left the building. The attacker, meanwhile, gets a value that means nothing and reveals nothing.

The subtle mistake: do not swallow your own HTTP errors

Here is the part that is easy to get wrong, and it is the reason a catch-all handler can quietly make things worse. A handler registered against Exception catches everything, including the exceptions your framework raises on purpose. In Flask, a 404, a 405 Method Not Allowed, a 401 raised via abort(), and even a routing redirect are all HTTPException subclasses travelling through the same machinery as a real crash.

If your handler blindly converts every exception into a generic 500, it does not just hide internals — it breaks correct behaviour. A request for a route that genuinely does not exist stops returning a clean 404 and starts returning a 500. Clients and caches that rely on accurate status codes get confused. Your own monitoring fills with 500s that are not actually errors. You have made your error handling both less correct and noisier in the name of making it safer.

The single line that prevents it is the isinstance(e, HTTPException) guard above: detect the errors you raised deliberately, hand them straight back unchanged so they keep their intended status code and body, and reserve the opaque-500 treatment for the exceptions you did not see coming.

The distinction that matters. Errors you raised on purpose are part of your API contract — a 404 means “not here,” and the caller is meant to see it. Errors you did not raise on purpose are the ones to hide. A good handler can tell the two apart; a naive one treats them identically and gets both wrong.
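One variation on the isinstance guard is worth sketching. Returning an HTTPException unchanged gives the client Werkzeug's default HTML error page, which is safe but awkward for a JSON API. Flask prefers the most specific registered handler, so a second handler for HTTPException can render deliberate errors as JSON while the catch-all keeps the opaque 500. The route, logger name, and handler names below are hypothetical, and this is one possible arrangement rather than the only correct one.

```python
import logging
import uuid

from flask import Flask, abort, jsonify
from werkzeug.exceptions import HTTPException

app = Flask(__name__)
log = logging.getLogger("api.errors")


@app.errorhandler(HTTPException)
def handle_http_exception(e):
    # Deliberate errors keep their status code; only the body format changes.
    return jsonify({"error": e.name}), e.code


@app.errorhandler(Exception)
def handle_unexpected(e):
    # Everything else is a crash: full detail to the log, an ID to the client.
    request_id = uuid.uuid4().hex[:12]
    log.exception("Unhandled exception [%s]", request_id)
    return jsonify({"error": "Internal server error",
                    "request_id": request_id}), 500


@app.route("/v1/orders/<int:order_id>")
def get_order(order_id):  # hypothetical route for the demonstration
    if order_id == 0:
        abort(404)  # an error raised on purpose
    raise RuntimeError("column totl does not exist")  # a simulated crash


client = app.test_client()
missing = client.get("/v1/orders/0")
crashed = client.get("/v1/orders/1")
unknown = client.get("/v1/does-not-exist")
print(missing.status_code, missing.get_json())
print(crashed.status_code, crashed.get_json())
```

The deliberate 404 and the unknown route both stay 404s, and the crash becomes a 500 whose body carries no trace of the underlying message.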

Log everything inside, return nothing outside

The mental model is an asymmetry. Inside the trust boundary — your logs, your error tracker — you want maximum detail: the full traceback, the correlation ID, the route, the authenticated user where relevant, the parameters that triggered it. Across the boundary — in the HTTP response — you want the minimum that still lets a legitimate client behave correctly: the right status code, a generic message, and the ID.
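The asymmetry can be shown without any framework at all. The following sketch uses only the standard library: an in-memory stream stands in for the server log, and respond_to_unexpected is a hypothetical helper that writes the detailed record on one side and builds the minimal body on the other.

```python
import io
import logging
import uuid

log_stream = io.StringIO()  # stands in for the server-side log sink
log = logging.getLogger("api.errors.demo")
log.addHandler(logging.StreamHandler(log_stream))
log.setLevel(logging.ERROR)
log.propagate = False


def respond_to_unexpected(exc, method, path, user_id=None):
    """Log full detail inside the boundary; return only the minimum outside it."""
    request_id = uuid.uuid4().hex[:12]
    log.error(
        "Unhandled exception [%s] on %s %s user=%s",
        request_id, method, path, user_id,
        exc_info=exc,  # attaches the traceback to the log record
    )
    body = {"error": "Internal server error", "request_id": request_id}
    return 500, body


try:
    raise RuntimeError("column totl does not exist")  # simulated failure
except RuntimeError as exc:
    status, body = respond_to_unexpected(exc, "GET", "/v1/orders/42", user_id=7)

logged = log_stream.getvalue()
print(status, body)
```

The request ID is the only value present on both sides: it appears in the response body and in the log entry that holds the traceback.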

Two cautions on the logging side. Detail is not licence to record secrets — passwords, tokens, full card numbers, and decrypted user data do not belong in a log line even when an exception makes them convenient to capture, so scrub them at the logging layer. And the correlation ID must be random, not sequential: a counter tells an attacker how many errors you have served and invites enumeration, while a random value is useful only to whoever already holds it.
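Scrubbing at the logging layer could look something like the sketch below. It subclasses logging.Formatter and redacts the fully formatted output, so text inside the traceback is covered as well as the message. The patterns are deliberately crude assumptions (the digit pattern will also catch other long numbers), and ScrubbingFormatter is a hypothetical name; treat it as a starting point, not a complete data-loss control.

```python
import io
import logging
import re

# Hypothetical patterns; extend them to match the secrets your service handles.
_SECRET_KV = re.compile(r"(?i)\b(password|passwd|token|secret|api_key)=([^\s&,;]+)")
_CARD_NUMBER = re.compile(r"\b\d{13,19}\b")  # crude: any 13 to 19 digit run


class ScrubbingFormatter(logging.Formatter):
    """Redact secrets from the fully formatted record, traceback included."""

    def format(self, record):
        text = super().format(record)
        text = _SECRET_KV.sub(lambda m: f"{m.group(1)}=[REDACTED]", text)
        return _CARD_NUMBER.sub("[REDACTED-PAN]", text)


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(ScrubbingFormatter("%(levelname)s %(message)s"))
log = logging.getLogger("api.scrub.demo")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False

try:
    raise ValueError("login failed for password=hunter2")
except ValueError:
    log.exception("payment error card=4111111111111111 token=abc123")

output = stream.getvalue()
print(output)
```

Redacting after formatting is a design choice: a filter that only rewrites the message would miss secrets that surface in the exception text or in the traceback's source lines.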

Checking your own API

You can audit this from the outside in a few minutes. Provoke the three classes of failure and read the response bodies, not just the status codes.

Shell · what good and bad look like
# Unknown route — want a clean, generic 404
$ curl -s https://api.example.com/v1/does-not-exist
{"error": "Not found"}

# Forced server error — want an opaque 500 with only an ID
$ curl -s https://api.example.com/v1/orders/%00
{"error": "Internal server error", "request_id": "a1b2c3d4e5f6"}

# RED FLAG — never ship a body like this:
{"error": "column totl does not exist",
 "type": "UndefinedColumn",
 "trace": "Traceback (most recent call last)..."}

The red flags are concrete: the word Traceback anywhere in a body, a type or exception field, absolute file paths, framework or server version banners, and any fragment of SQL or a query. The signs of a healthy API are equally concrete — a short generic message, the correct status code for the situation, and at most a random request ID.
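Those red flags are mechanical enough to check in an automated test. The sketch below is a heuristic scanner over a response body; the pattern set, the field names, and the function find_red_flags are assumptions for illustration and will need tuning for your own stack.

```python
import json
import re

# Heuristic patterns for the red flags listed above; tune them for your stack.
RED_FLAGS = {
    "traceback": re.compile(r"Traceback \(most recent call last\)"),
    "file_path": re.compile(r"(?:/[\w.-]+){2,}\.py\b"),
    "sql_fragment": re.compile(
        r"\b(?:SELECT|INSERT|UPDATE|DELETE)\b.+\b(?:FROM|INTO|SET)\b", re.I),
    "version_banner": re.compile(
        r"\b(?:python|werkzeug|gunicorn|nginx)[/ ]?\d+\.\d+", re.I),
}
SUSPECT_FIELDS = {"type", "exception", "trace", "traceback", "stack"}


def find_red_flags(body: str) -> list:
    """Return the names of the red flags found in a response body."""
    found = [name for name, pattern in RED_FLAGS.items() if pattern.search(body)]
    try:
        parsed = json.loads(body)
    except ValueError:
        parsed = None  # not JSON; the pattern checks above still apply
    if isinstance(parsed, dict):
        found += [f"field:{key}" for key in sorted(SUSPECT_FIELDS.intersection(parsed))]
    return found


good = '{"error": "Internal server error", "request_id": "a1b2c3d4e5f6"}'
bad = ('{"error": "column totl does not exist", "type": "UndefinedColumn", '
       '"trace": "Traceback (most recent call last)..."}')

print(find_red_flags(good))
print(find_red_flags(bad))
```

Running a check like this against every error response in an integration suite turns the audit from a one-off exercise into a regression test.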

The table below is the quick reference for deciding what is allowed to cross the boundary.

Information	Return to client?	Where it belongs
Generic error message	Yes	Response
Correct HTTP status code	Yes	Response
Random correlation / request ID	Yes	Response and log
Stack trace / traceback	No	Server log only
Exception type and message	No	Server log only
File paths and line numbers	No	Server log only
Framework / server version banners	No	Suppress entirely
SQL or query fragments	No	Server log (sanitised)
Parameter / variable values	No	Server log (no secrets)

The takeaway

Treat the error path with the same care as the success path. A 500 is an API response, and it deserves to be designed rather than inherited from a default. The discipline is small and it is mostly one handler: catch the unexpected, log it in full where only you can see, and return an opaque identifier and nothing else.

And then test the part everyone forgets — that your handler still lets a 404 be a 404. The guard that distinguishes the errors you raised on purpose from the ones you did not is the difference between an error handler that protects you and one that silently corrupts your status codes while it does it.

How HexVault handles it. Our API returns a generic message and a request ID on unexpected failures, logs the full detail server-side against that ID, and passes intentional HTTP errors through untouched. The security page walks through the wider architecture, and our error responses are something you can probe yourself — you should never be able to make one tell you anything about how the service is built.