Most SaaS products treat transactional email as a solved problem. Pick a provider, drop in a library, call a function with the recipient and a template variable or two. Done.
In a zero-knowledge credential vault, it’s genuinely more complex — and the failure mode is particularly insidious: the email appears to send, the function returns successfully, but the recipient sees garbled content, a broken link, or nothing useful at all.
This post covers a complete audit of HexVault’s transactional email layer: 64 call sites, 26 distinct functions, 8 bugs that had been silently corrupting notifications ranging from password-share invites to breach alerts to MPA approvals.
The problem with email in a zero-knowledge system
In a conventional SaaS product, the server has full access to user data. You pull whatever you need from the database to populate a template: name, email address, account details, recent activity.
In a zero-knowledge vault, the server has access to metadata — membership records, timestamps, notification flags, org names, email addresses — but not to the contents of anything the user has actually stored. You can send a breach alert saying “some of your passwords are exposed”, but you can’t enumerate which ones from the server. You can notify a user they’ve been offboarded, but you can’t include a list of shared credentials they had access to, because the server doesn’t hold the key to decrypt them.
This shapes every email in the system. Breach alerts have to aggregate counts from the breach-check layer, which has server-side visibility into which credentials are flagged, separately from their contents. Share notifications have to work without the server knowing what password is being shared — it knows the name and ownership, not the value. Offboarding emails have to provide actionable guidance using only metadata the server can legitimately read.
The practical effect is that each email function tends to be a bespoke interface rather than a generic template call, and bespoke interfaces accumulate subtle argument mismatches over time.
The audit: 64 calls, 8 bugs
The audit was triggered by a user report that offboarding notifications weren’t arriving. Tracing that led to a broader question: how confident are we that the other 25 email functions are correct? The answer turned out to be “less confident than we should be”.
The approach was systematic: extract every function signature from email_service.py, then scan every call site in app.py and compare argument counts and types. Twenty-six distinct functions. Sixty-four call sites. Eight mismatches causing either silent failures or garbled output:
| Function | Bug type | Impact |
|---|---|---|
| send_invite_email | Missing arg | Invite button had no URL |
| send_share_link_email | Type mismatch | Raw datetime passed as expiry_hours int; template crash |
| send_contact_email | Wrong arg order | Visitor’s name used as destination email address |
| send_security_notification_email (wrapper) | Type mismatch | Dict passed as details string; template crash |
| send_security_notification_email (2FA) | Missing args | TypeError at call time; reminders never sent |
| send_emergency_access_invite | Type mismatch | wait_days int as invite_url; button showed 7 |
| send_breach_alert_email | Extra arg + type | Int count passed as list; len() on int threw TypeError |
| send_mpa_notification | Type mismatch | action_id int as approve_url; button showed raw integer |
Six of the eight bugs were silent: garbled email content or tracebacks caught by a broad except Exception, logged as a warning, and never surfaced. The other two caused TypeError at call time — slightly more visible, but only if you were watching the logs for that specific email type.
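The call-site comparison described above can be sketched with Python's ast module. This is a minimal illustration, not the audit script itself: SERVICE_SRC and APP_SRC are inline stand-ins for email_service.py and app.py, the two call sites are hypothetical, and the check only compares positional argument counts.

```python
import ast

# Stand-in for email_service.py; the signature is the one quoted in this post.
SERVICE_SRC = '''
def send_contact_email(to_email, from_name, from_email, subject, message, topic='general'):
    pass
'''

# Stand-in for app.py; both call sites are hypothetical.
APP_SRC = '''
send_contact_email(inbox, name, email, subject, message)
send_contact_email(name, email, subject)
'''

def positional_bounds(source):
    """Map each function name to (min, max) positional args it accepts."""
    bounds = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            params = node.args.posonlyargs + node.args.args
            required = len(params) - len(node.args.defaults)
            # A *args parameter removes the upper bound
            upper = None if node.args.vararg else len(params)
            bounds[node.name] = (required, upper)
    return bounds

def audit_calls(source, bounds):
    """Return (lineno, name, n_positional) for calls outside the bounds."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)):
            continue
        if node.func.id not in bounds:
            continue
        # Calls with keywords or *unpacking need a fuller check; skip them here
        if node.keywords or any(isinstance(a, ast.Starred) for a in node.args):
            continue
        low, high = bounds[node.func.id]
        n = len(node.args)
        if n < low or (high is not None and n > high):
            problems.append((node.lineno, node.func.id, n))
    return sorted(problems)

problems = audit_calls(APP_SRC, positional_bounds(SERVICE_SRC))
for lineno, name, n in problems:
    print(f"line {lineno}: {name}() called with {n} positional args")
```

A count check like this flags missing and extra arguments only. A call with the right number of arguments in the wrong order, or with the wrong types, passes it and needs type annotations or a human reviewer.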
Four categories of silent failure
Looking across all eight bugs, they fall into four distinct patterns, each requiring a different prevention approach.
1. Wrong argument order
The send_contact_email wrapper was the worst offender. The email_service.py signature is:
def send_contact_email(to_email, from_name, from_email, subject, message, topic='general'):
The local wrapper in app.py called it as:
# app.py: local wrapper (buggy)
send_contact_email(name, email, subject, message, organisation)
Five positional arguments, completely wrong mapping: name became to_email (the internal destination address), email became from_name (a display name), and subject was treated as an email address. Every contact form submission was attempting to send to the visitor’s name as if it were an email address.
The fix requires understanding the intent of the function. The email service sends a contact form to an internal inbox. The wrapper needs to resolve that inbox address from an environment variable and map the form fields correctly:
def send_contact_email(name, email, subject, message, organisation=''):
    to_email = os.environ.get('SUPPORT_EMAIL') or '[email protected]'
    topic = organisation or 'general'
    return _send_contact(to_email, name, email, subject, message, topic)
2. Type mismatches
The send_share_link_email bug is the clearest example. The email service expects expiry_hours as an integer, used directly in the template: “This link expires in {expiry_hours} hours.”
The call site was passing expires_at — a raw Python datetime object from the database. The email would render “This link expires in 2026-04-13 14:22:00 hours.” when it didn’t crash entirely.
The fix converts the datetime to hours-remaining at call time, with a floor of 1 and a default of 24 on failure. The database stores an absolute timestamp; the email template wants a human-readable duration. The wrapper is the right place to do that conversion — not the email service itself, which should remain generic.
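A minimal sketch of that conversion follows. The helper name hours_until is hypothetical, and two details are assumptions rather than facts about HexVault: naive timestamps are treated as UTC, and partial hours are rounded up.

```python
import math
from datetime import datetime, timedelta, timezone

def hours_until(expires_at, now=None, default=24):
    """Convert an absolute expiry timestamp to whole hours remaining.

    Returns `default` when expires_at is not a usable datetime, and never
    returns less than 1.
    """
    try:
        if expires_at.tzinfo is None:
            # Assumption: naive timestamps from the database are UTC
            expires_at = expires_at.replace(tzinfo=timezone.utc)
        now = now or datetime.now(timezone.utc)
        remaining = (expires_at - now).total_seconds() / 3600
        # Round partial hours up, then apply the floor of 1
        return max(1, math.ceil(remaining))
    except (AttributeError, TypeError):
        # None, strings, or naive/aware mismatches fall back to the default
        return default

now = datetime(2026, 4, 13, 12, 0, tzinfo=timezone.utc)
print(hours_until(datetime(2026, 4, 13, 14, 22), now=now))  # 2h22m remaining
print(hours_until(None))                                    # fallback path
```

The call site then passes the integer result as expiry_hours, so the email service never sees a datetime.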
The MPA notification and emergency access invite bugs follow the same pattern: an integer meaningful in the database context (an action ID, a wait time in days) was passed into a parameter the email template treats as a URL string. Both produced buttons with href="7" or href="42".
3. Missing arguments
The 2FA reminder notification called send_security_notification_email with four positional arguments when the function requires six. Python only lets a caller omit parameters that have defaults: given def f(a, b, c='', d='', e=''), the call f('x', 'y') is fine, but leaving out a parameter that has no default raises a TypeError before the function body runs. Because the error fired at call time, the reminders were never sent.
The send_invite_email wrapper omitted the invite_url parameter entirely — the fourth and final required argument. Every password-share invite had a button with an empty href. It looked correct in the email preview, but clicking it went nowhere.
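One way to test the shape of a call without sending anything is inspect.signature(...).bind, which applies the same argument-matching rules as a real call. The sketch below is illustrative: the six-parameter function is a hypothetical stand-in (the real signature in email_service.py may differ), and call_matches is a made-up helper name.

```python
import inspect

# Hypothetical stand-in with six required parameters; the real
# signature in email_service.py may differ.
def send_security_notification_email(to_email, username, event_type,
                                     details, ip_address, action_url):
    return f"sent to {to_email}"

def call_matches(func, *args, **kwargs):
    """Return (True, None) if the arguments fit func's signature,
    else (False, message). func itself is never called."""
    try:
        inspect.signature(func).bind(*args, **kwargs)
        return True, None
    except TypeError as exc:
        return False, str(exc)

# Four positional arguments against six required parameters
ok, error = call_matches(
    send_security_notification_email,
    "user@example.com", "alice", "2fa_reminder", "Enable 2FA",
)
print(ok, error)
```

A unit test per call site built on a check like this would fail in CI instead of in a user's inbox.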
4. The local wrapper trap
Several bugs existed inside local wrapper functions defined at module scope in app.py. These wrappers served legitimate purposes — database lookups, ID-to-URL resolution, added logging — but they created an abstraction layer where the interface diverged from the underlying function over time.
The send_security_notification wrapper accepted a user_id and a dict of event details, looked up the email address, then called the service. But the service’s details parameter is a string. The wrapper passed the dict directly:
# event_details is a dict, but the details param expects str
return send_security_notification_email(
    email, username, event_type,
    event_details or {}  # ← TypeError waiting to happen
)

The fix serialises the dict at the wrapper boundary, extracting the IP address as a dedicated field:
ip_str = str(event_details.get('ip', '') or '')
details_str = '; '.join(
    f"{k}: {v}" for k, v in event_details.items()
    if k not in ('ip', 'ip_address')
)
return send_security_notification_email(
    email, username, event_type, details_str, ip_str
)

The underlying pattern: a wrapper with a different signature than the function it wraps creates a gap that static analysis won’t catch and tests rarely cover. Six months after it’s written, someone updates the email service and doesn’t update the wrapper. Everything compiles. The only signal is the email recipient receiving something wrong.
How to prevent this class of bug
Four practical approaches, roughly in order of implementation effort:
- Type annotations on email service functions. Adding type annotations to email_service.py enables static analysis tools like mypy or Pyright to flag mismatches at development time. A function annotated as def send_share_link_email(..., expiry_hours: int = 24) will surface the datetime-as-int bug before it reaches production.
- Keyword-only arguments for critical parameters. Python’s * separator forces keyword-only arguments, making positional order mismatches impossible: def send_mpa_notification(to_email, username, *, action_description, approve_url). Passing an integer as approve_url positionally becomes a TypeError immediately.
- Systematic call-site auditing. The audit described here was done manually, but it can be automated: parse every function signature from the email service module, check every call site, compare argument counts. That takes under 50 lines of Python using the ast module. A CI step running this check would catch the entire class of bugs before production.
- Minimal wrapper policy. A code review rule: local wrappers around email service functions keep the same signature as the underlying function. Any additional logic (DB lookups, type conversion, URL construction) happens before the call, not inside the wrapper. If the signature diverges, treat it as a deliberate design decision requiring explicit review.
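The keyword-only approach can be seen in a few lines. This is a sketch under stated assumptions: the signature follows the one quoted above, while the function body, the example addresses, and the approval URL format are hypothetical.

```python
def send_mpa_notification(to_email: str, username: str, *,
                          action_description: str, approve_url: str) -> str:
    # Stand-in body: the real function would render and send a template
    return f'<a href="{approve_url}">{action_description}</a>'

action_id = 42

# Positional call in the style of the original bug: rejected outright
try:
    send_mpa_notification("admin@example.com", "alice", "Rotate key", action_id)
    positional_error = None
except TypeError as exc:
    positional_error = exc

# Keyword call: the URL has to be named at the call site
html = send_mpa_notification(
    "admin@example.com", "alice",
    action_description="Rotate key",
    # Hypothetical URL scheme, for illustration only
    approve_url=f"https://vault.example.com/mpa/{action_id}/approve",
)
print(type(positional_error).__name__)
print(html)
```

Keyword-only parameters stop positional mix-ups, but approve_url=action_id would still run. The str annotation is what lets mypy or Pyright reject that call, which is why the two measures work best together.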
What zero-knowledge changes about email
The bugs described above can happen in any SaaS product with a non-trivial email layer. But zero-knowledge products have two properties that make them more likely to accumulate this kind of debt:
The email service can’t pull template variables from the vault. Every piece of context in an email has to be explicitly passed from the call site. There’s no “just fetch the user’s recent activity” template interpolation — the server can’t decrypt recent activity. This pushes more logic into call sites and more parameters into every function signature, creating more places for argument order and type errors to appear.
Email is often the only out-of-band communication channel. If a user is offboarded while they’re not logged in, or a breach alarm fires at 3am, email is the primary signal that something happened. Broken emails in a zero-knowledge product don’t just mean a poor user experience — they mean security-relevant events that nobody was notified about.
The emergency access invite bug illustrates this well: the email was sent, the function returned success, but the recipient’s email showed a button labelled “View Emergency Access” with href="7". The feature looked operational from the inside. From the outside, a user who genuinely needed emergency vault access would click a broken link and get nothing.
If you’re building a zero-knowledge product, budget time for a systematic email audit early — before users depend on breach alerts, offboarding notifications, and MPA approvals. Retrofitting type annotations and call-site verification is cheaper than debugging garbled security emails after they’ve been silently broken for months.