Technical note

Designing a Review Queue for Revenue-Risk Support Messages

Support inboxes hide business risk in plain sight.

A customer reporting that "the copy button is broken" is not raising the same kind of problem as a paying customer saying "we paid the invoice but our account is locked." Both are support messages, but only the second suggests a possible billing or entitlement failure.

I built Revenue Risk Inbox to test a specific product question: what would a support queue look like if it separated low-priority messages from cases that might affect payment, access, renewal confidence or churn?

[Figure: Revenue Risk Inbox showing analyzed support messages]

The problem

A support inbox usually flattens different kinds of work into the same row pattern: subject, customer, company, received time.

That works until the inbox contains a mix of failed payments, paid accounts locked out of the product, invoice disputes, cancellation threats, product bugs and low-risk feedback.

Those messages shouldn't be reviewed with the same urgency or be automated blindly.

The main point wasn't whether AI can summarize the message. It was whether the interface can help a reviewer see which messages carry revenue risk, why the AI thinks so and where human review is still required.

Can the interface make revenue-risk support messages easier to detect without hiding the decision from the reviewer?

The workflow

Revenue Risk Inbox models a small support-ops workflow.

A reviewer loads a batch of support messages, runs analysis, then scans the inbox by urgency, revenue risk, category and triage status. Opening a message shows the original customer request along with the AI summary, evidence and recommended next step.

The AI contract

The app sends support requests to a server-side API route and expects one structured analysis per message.

// Shape of one support request sent to the API route.
type SupportRequest = {
  id: string;
  customerName: string;
  companyName: string;
  receivedAt: string;
  subject: string;
  body: string;
};

The AI response is validated with a Zod schema. Each analysis must include:

// Shape of one analysis returned for a support request.
type MessageAnalysis = {
  messageId: string; // must echo the id of the request it describes
  urgency: "low" | "medium" | "high";
  revenueRisk: "low" | "medium" | "high";
  summary: string;
  recommendedAction: string;
  evidence: string[]; // snippets quoted from the customer message
  confidence: number;
  needsHumanReview: boolean;
  category:
    | "failed_payment"
    | "invoice_issue"
    | "plan_access"
    | "cancellation_risk"
    | "refund_request"
    | "enterprise_sales"
    | "product_bug"
    | "account_issue"
    | "product_feedback"
    | "other";
};

For v0, I kept the human-review rule simple. A message requires human review when the case involves high revenue risk, payment/access mismatch, refund policy, cancellation risk or low confidence. The AI can recommend a next step but it doesn't resolve billing, access or refund issues.
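The rule above can also be written as a deterministic check in application code, so it stays auditable even when the model sets `needsHumanReview` itself. The following is a minimal sketch, not the app's actual implementation. The function name `requiresHumanReview`, the 0.7 confidence cutoff, the assumption that confidence is a score between 0 and 1, and the mapping of "payment/access mismatch", "refund policy" and "cancellation risk" onto the three categories in `ALWAYS_REVIEW` are all assumptions made for illustration.

```typescript
type RiskLevel = "low" | "medium" | "high";

type Category =
  | "failed_payment"
  | "invoice_issue"
  | "plan_access"
  | "cancellation_risk"
  | "refund_request"
  | "enterprise_sales"
  | "product_bug"
  | "account_issue"
  | "product_feedback"
  | "other";

// Only the fields the rule reads.
interface ReviewInput {
  revenueRisk: RiskLevel;
  category: Category;
  confidence: number;
}

// Assumption: confidence is a score in [0, 1]; 0.7 is a placeholder cutoff.
const LOW_CONFIDENCE_CUTOFF = 0.7;

// Assumption: these categories stand in for payment/access mismatch,
// refund policy and cancellation risk.
const ALWAYS_REVIEW: readonly Category[] = [
  "plan_access",
  "refund_request",
  "cancellation_risk",
];

function requiresHumanReview(input: ReviewInput): boolean {
  return (
    input.revenueRisk === "high" ||
    ALWAYS_REVIEW.indexOf(input.category) !== -1 ||
    input.confidence < LOW_CONFIDENCE_CUTOFF
  );
}

// Confident, low-risk feedback can be auto-triaged.
console.log(
  requiresHumanReview({ revenueRisk: "low", category: "product_feedback", confidence: 0.93 })
); // false

// A refund request is reviewed even when the model is confident.
console.log(
  requiresHumanReview({ revenueRisk: "medium", category: "refund_request", confidence: 0.95 })
); // true
```

One possible use is to combine this result with the model's own flag using a logical OR, so that neither side can silently downgrade a case to auto-triage.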

The subtle bug: valid output can still be wrong

Structured-output validation checks the shape of the response, but shape alone isn't enough here.

Zod can tell me that the response contains a messageId, urgency, revenueRisk, summary and recommendedAction. It can't tell me whether the right analysis was attached to the right customer message.

This matters because the model returns an array. Even if every item in that array is valid, the relationship can still be wrong: the order can change, an ID can be duplicated, one message can be missing or an analysis can refer to a message that was never sent.

In a revenue-risk workflow, this is not a harmless bug. A failed-payment analysis could be attached to a product-feedback message. A paid-but-locked-out customer could be treated as low priority. The dangerous part is that the UI can still look fine.

This is why the app doesn't rely on array order. Each analysis must return the original messageId and the app maps analyses back to support requests by ID. The matching logic rejects duplicate IDs, missing analyses, count mismatches and unknown message IDs.
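The ID-based matching described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the app's code: the function name `matchAnalyses`, the `MatchResult` type and the error strings are invented here, and the two interfaces are trimmed to the fields the matching step needs.

```typescript
interface SupportRequest {
  id: string;
  subject: string;
}

interface MessageAnalysis {
  messageId: string;
  summary: string;
}

interface MatchedPair {
  request: SupportRequest;
  analysis: MessageAnalysis;
}

type MatchResult =
  | { kind: "matched"; pairs: MatchedPair[] }
  | { kind: "rejected"; errors: string[] };

function matchAnalyses(
  requests: SupportRequest[],
  analyses: MessageAnalysis[]
): MatchResult {
  const errors: string[] = [];

  if (analyses.length !== requests.length) {
    errors.push(`count mismatch: sent ${requests.length}, received ${analyses.length}`);
  }

  // Null-prototype objects avoid collisions with inherited keys.
  const sentIds: Record<string, true> = Object.create(null);
  for (const request of requests) {
    sentIds[request.id] = true;
  }

  const byId: Record<string, MessageAnalysis> = Object.create(null);
  for (const analysis of analyses) {
    const id = analysis.messageId;
    if (!(id in sentIds)) {
      errors.push(`unknown message ID: ${id}`);
    } else if (id in byId) {
      errors.push(`duplicate analysis for message ID: ${id}`);
    } else {
      byId[id] = analysis;
    }
  }

  for (const request of requests) {
    if (!(request.id in byId)) {
      errors.push(`missing analysis for message ID: ${request.id}`);
    }
  }

  if (errors.length > 0) {
    return { kind: "rejected", errors };
  }
  // Pairs follow the order of the sent requests, not the model's array order.
  return {
    kind: "matched",
    pairs: requests.map(request => ({ request, analysis: byId[request.id] })),
  };
}

const sent: SupportRequest[] = [
  { id: "msg-1", subject: "Renewal payment failed" },
  { id: "msg-2", subject: "Remember selected date range" },
];

// Analyses arrive in a different order and are still attached by ID.
const reordered = matchAnalyses(sent, [
  { messageId: "msg-2", summary: "Feature request" },
  { messageId: "msg-1", summary: "Failed renewal payment" },
]);
console.log(reordered.kind); // "matched"

// A duplicated ID leaves msg-2 uncovered, so the whole batch is rejected.
const duplicated = matchAnalyses(sent, [
  { messageId: "msg-1", summary: "Failed renewal payment" },
  { messageId: "msg-1", summary: "Feature request" },
]);
console.log(duplicated.kind); // "rejected"
```

Rejecting the whole batch is one design choice among several. A more lenient variant could keep the pairs that matched cleanly and leave the rest as not analyzed, at the cost of a more complicated partial state.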

Making the AI output inspectable

The details side panel compares the original customer message with the AI review.

[Figure: Support request detail panel showing a paid-but-locked customer message that needs review]

The panel is where the reviewer checks the AI output against the original message. It shows the original customer request next to the AI summary, risk labels, evidence snippets and recommended next step. For cases marked as needing review, it also shows review actions.

Evidence is what keeps the label reviewable. In the paid-but-locked-out case, the reviewer can see the snippets that caused the escalation: "we paid invoice INV-2048", "the account is past due" and "our users are locked out." The reviewer doesn't have to trust the label blindly. They can compare the AI output against the customer's words before deciding what to do next.

The triage model

This is the original artifact for the project: the working revenue-risk triage matrix.

| Message type | Revenue risk | Urgency | Human review? | Why |
| --- | --- | --- | --- | --- |
| Failed renewal payment | High | High | Yes | Payment collection and account continuity are at risk. |
| Paid account locked out | High | High | Yes | The customer has paid but cannot use the product. |
| Invoice amount dispute | Medium | Medium | Sometimes | Billing confusion can delay payment or create support escalation. |
| Cancellation threat | High | High | Yes | The customer is explicitly signaling churn risk. |
| Refund request | High | Medium | Yes | The response can affect revenue and policy consistency. |
| Enterprise pricing request | Medium | Medium | No | Commercially important, but usually a routing case. |
| Product bug affecting paid work | Medium | Medium | Sometimes | Product impact may affect renewal confidence. |
| Login issue unrelated to billing | Low | High | No | Urgent for the user, but not necessarily revenue-risk. |
| Product feedback | Low | Low | No | Useful signal, but not an immediate support-risk case. |
| Mixed payment/access confusion | High | High | Yes | Billing and entitlement state may be inconsistent. |

The matrix is simple on purpose. A more complex version could include contract value, customer tier, renewal date, account owner or payment history.

For v0, that would be too much. The goal was to prove the review pattern, not build a production support platform.

The test cases

The demo uses 10 realistic support messages.

| Case | Input signal | Expected AI behavior | Human review concern |
| --- | --- | --- | --- |
| Failed payment | Card renewal failed, customer says card is valid | High urgency, high revenue risk | Do not assume the card is invalid. |
| Paid but locked out | Bank transfer paid, account still past due | High urgency, high revenue risk | Verify payment before restoring access. |
| Invoice dispute | Invoice amount changed after plan change | Medium urgency, medium revenue risk | Avoid promising a refund before review. |
| Cancellation threat | Customer says they may cancel if issue continues | High urgency, high revenue risk | Escalate before churn. |
| Refund request | Accidental annual renewal | Medium urgency, high revenue risk | Apply refund policy consistently. |
| Enterprise pricing | 300-user expansion, SSO, audit logs | Medium urgency, medium revenue risk | Route to sales quickly. |
| Product bug | Export button broken after filtering | Medium urgency, low/medium revenue risk | Determine whether paid workflow is blocked. |
| Login issue | Password reset completed, still cannot log in | High urgency, low revenue risk | Do not confuse account access with billing access. |
| Product feedback | Dashboard should remember selected date range | Low urgency, low revenue risk | Auto-triage as feedback. |
| Mixed payment/access confusion | Payment, access and plan state are unclear | High urgency, high revenue risk | Needs human review because multiple categories overlap. |

Designing the review workflow

The main interface is built for comparison.

A support reviewer doesn't only need to understand one message. They need to decide which messages require attention first. That makes the queue itself part of the product: urgency, revenue risk, category, triage status and received time need to be visible side by side.

This is why the main screen uses a table. In this workflow, density isn't a flaw. It is what lets the reviewer scan for the cases that might affect payment, access, renewal confidence or churn.

I also kept urgency separate from revenue risk because they can diverge. A login issue can be urgent without being a direct revenue-risk case. An enterprise pricing request may be commercially important without being an emergency. If both signals collapse into one generic priority label, the reviewer loses useful information.

The low-risk cases matter for the same reason. A triage system that marks everything as important is useless; it just creates more pressure without added benefit.
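To make the divergence between urgency and revenue risk concrete, here is a small sketch that sorts the same rows by each signal in turn. The `QueueRow` type and `sortQueue` function are hypothetical names used only for this example, and sorting is an assumption about how a reviewer might scan the table. The three rows take their labels from the triage matrix.

```typescript
type RiskLevel = "low" | "medium" | "high";

interface QueueRow {
  subject: string;
  urgency: RiskLevel;
  revenueRisk: RiskLevel;
}

const RANK: Record<RiskLevel, number> = { low: 0, medium: 1, high: 2 };

// Sort descending by one signal, using the other only to break ties.
// slice() copies the array so the caller's row order is left untouched.
function sortQueue(rows: QueueRow[], primary: "urgency" | "revenueRisk"): QueueRow[] {
  const secondary = primary === "urgency" ? "revenueRisk" : "urgency";
  return rows.slice().sort(
    (a, b) =>
      RANK[b[primary]] - RANK[a[primary]] ||
      RANK[b[secondary]] - RANK[a[secondary]]
  );
}

const queueRows: QueueRow[] = [
  { subject: "Login issue unrelated to billing", urgency: "high", revenueRisk: "low" },
  { subject: "Enterprise pricing request", urgency: "medium", revenueRisk: "medium" },
  { subject: "Product feedback", urgency: "low", revenueRisk: "low" },
];

// By urgency, the login issue comes first.
console.log(sortQueue(queueRows, "urgency").map(row => row.subject));

// By revenue risk, the enterprise pricing request comes first.
console.log(sortQueue(queueRows, "revenueRisk").map(row => row.subject));
```

A single merged priority score would have to pick one of these two orderings and hide the other, which is the information loss the separate columns avoid.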

[Figure: Support request detail panel showing a low-risk product feedback message]

The interface isn't trying to make the AI output feel authoritative. It is trying to make the output comparable, inspectable and easy to challenge.

States and failure cases

The UI accounts for:

  • not-analyzed messages
  • active analysis
  • partially analyzed batches
  • auto-triaged messages
  • messages that need review

The validation layer accounts for:

  • malformed AI output
  • missing analyses
  • duplicated message IDs
  • count mismatches
  • unknown message IDs

[Figure: Revenue Risk Inbox showing partially analyzed messages and not-analyzed rows]

That matters because a review queue shouldn't assume the model response is complete, correctly ordered or safe to trust.
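One way to model the UI states listed above is to derive each row's state from whatever validated analyses have arrived so far, instead of storing a status per row. The sketch below assumes that approach. The names `RowState`, `rowState` and `isPartiallyAnalyzed`, and the idea of tracking in-flight IDs in a separate list, are illustrative assumptions.

```typescript
type RowState = "not_analyzed" | "analyzing" | "auto_triaged" | "needs_review";

interface AnalysisFlag {
  messageId: string;
  needsHumanReview: boolean;
}

// Derive the state shown for one row from what has come back so far.
function rowState(
  messageId: string,
  analyses: readonly AnalysisFlag[],
  inFlightIds: readonly string[]
): RowState {
  const found = analyses.filter(a => a.messageId === messageId)[0];
  if (found !== undefined) {
    return found.needsHumanReview ? "needs_review" : "auto_triaged";
  }
  return inFlightIds.indexOf(messageId) !== -1 ? "analyzing" : "not_analyzed";
}

// A batch is partially analyzed when some, but not all, rows have an analysis.
function isPartiallyAnalyzed(
  messageIds: readonly string[],
  analyses: readonly AnalysisFlag[]
): boolean {
  const analyzed = messageIds.filter(
    id => rowState(id, analyses, []) !== "not_analyzed"
  ).length;
  return analyzed > 0 && analyzed < messageIds.length;
}

const validated: AnalysisFlag[] = [
  { messageId: "msg-1", needsHumanReview: true },
  { messageId: "msg-2", needsHumanReview: false },
];
const batchIds = ["msg-1", "msg-2", "msg-3", "msg-4"];

console.log(batchIds.map(id => rowState(id, validated, ["msg-3"])));
// ["needs_review", "auto_triaged", "analyzing", "not_analyzed"]

console.log(isPartiallyAnalyzed(batchIds, validated)); // true
```

Because a row without a validated analysis can only be "analyzing" or "not analyzed", a rejected or incomplete model response cannot make a row look triaged.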

Where this leaves the project

This version doesn't include authentication, persistence, helpdesk integrations, billing mutations, assignment flows, audit logs, analytics or real customer data.

That was intentional: those features would make the demo larger, but they wouldn't change the core interaction I wanted to test.

The next layer would be evaluation: expected vs. actual urgency, revenue risk and category; ambiguous cases; prompt regressions; and reviewer overrides.

That would move the project from "can the model produce plausible triage?" to "how do we know this triage behavior is reliable enough to put in front of a reviewer?"
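A first step toward that evaluation layer could be a plain comparison of expected and actual labels per test case. The sketch below is an assumption about what such a check might look like, not part of the project. `EvalCase`, `Mismatch` and `findMismatches` are hypothetical names, the expected labels come from the test-case table, and the `actual` values are invented purely to show a mismatch.

```typescript
type RiskLevel = "low" | "medium" | "high";

interface Labels {
  urgency: RiskLevel;
  revenueRisk: RiskLevel;
  category: string;
}

interface EvalCase {
  caseName: string;
  expected: Labels;
  actual: Labels;
}

type LabelField = "urgency" | "revenueRisk" | "category";
const LABEL_FIELDS: LabelField[] = ["urgency", "revenueRisk", "category"];

interface Mismatch {
  caseName: string;
  field: LabelField;
  expected: string;
  actual: string;
}

// Collect every field where the model's label differs from the expected one.
function findMismatches(cases: EvalCase[]): Mismatch[] {
  const mismatches: Mismatch[] = [];
  for (const evalCase of cases) {
    for (const field of LABEL_FIELDS) {
      const expected = evalCase.expected[field];
      const actual = evalCase.actual[field];
      if (expected !== actual) {
        mismatches.push({ caseName: evalCase.caseName, field, expected, actual });
      }
    }
  }
  return mismatches;
}

const mismatchReport = findMismatches([
  {
    caseName: "Refund request",
    expected: { urgency: "medium", revenueRisk: "high", category: "refund_request" },
    // Hypothetical model output, invented for illustration.
    actual: { urgency: "medium", revenueRisk: "medium", category: "refund_request" },
  },
  {
    caseName: "Product feedback",
    expected: { urgency: "low", revenueRisk: "low", category: "product_feedback" },
    // Hypothetical model output, invented for illustration.
    actual: { urgency: "low", revenueRisk: "low", category: "product_feedback" },
  },
]);

// One mismatch: the refund request's revenue risk was under-rated.
console.log(mismatchReport);
```

Running the same cases after each prompt change and comparing the reports would surface prompt regressions. Ambiguous cases and reviewer overrides would need a richer record than an exact-match comparison.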

About the author

A senior software engineer working on complex product systems, with deep experience in frontend architecture, billing workflows, internal tools and AI-assisted interfaces.