Looking for 1 invoice-classification workflow with an explicit expected label

I’ve already tested 3 materially different workflow lines in a narrow Phase 1 setup.

What I have so far:

  • voucher validation
  • invoice extraction / strict schema validation
  • support email classification / routing

Those runs let me observe boundary behavior, but next I want one case that gets closer to testing semantic correctness.

So I’m looking specifically for 1 invoice-document classification workflow where the expected label is already known upfront.

A strong fit would be something like:

  • Invoice
  • PurchaseOrder
  • CreditMemo

What I want to test next is the separation between:

  • boundary safety = did the output stay within the allowed class set?
  • semantic correctness = did it match the expected label for this specific case?
  • downstream risk = what breaks if the label is wrong?
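
To make the three-way split concrete, here is a minimal sketch of how I think about it. Only the three labels come from this post. The names `check_label`, `LabelCheck`, and `risk_notes` are hypothetical, and the risk text is a placeholder the workflow owner would supply; this is not how any particular system implements it.

```python
from dataclasses import dataclass
from typing import Optional

# Label set taken from the post; everything else is an illustrative assumption.
ALLOWED_LABELS = frozenset({"Invoice", "PurchaseOrder", "CreditMemo"})


@dataclass(frozen=True)
class LabelCheck:
    boundary_safe: bool            # output stayed inside the allowed class set
    semantically_correct: bool     # output matched the expected label for this case
    downstream_risk: Optional[str]  # owner-supplied note on what breaks, if any


def check_label(predicted, expected, risk_notes=None, allowed=ALLOWED_LABELS):
    """Score one prediction on the three separate questions."""
    risk_notes = risk_notes or {}
    boundary_safe = predicted in allowed
    # A label outside the allowed set can never count as correct.
    semantically_correct = boundary_safe and predicted == expected
    # Risk is looked up only for wrong labels, keyed by (expected, predicted).
    risk = None if semantically_correct else risk_notes.get((expected, predicted))
    return LabelCheck(boundary_safe, semantically_correct, risk)


# Hypothetical risk note; the real text would come from the workflow owner.
notes = {("PurchaseOrder", "Invoice"): "placeholder: owner describes what breaks"}
result = check_label("Invoice", "PurchaseOrder", notes)
```

In this example the prediction is boundary-safe (it is a valid class) but not semantically correct, which is exactly the gap the earlier workflows could not show.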

A short outline is enough to start.
I do not need a full production payload right away.

What helps:

  • rough payload shape
  • target schema
  • label set
  • one simple example with an explicit expected label
  • optionally one mixed / ambiguous example
  • a short note on the business rule for ambiguous handling
  • polling or webhook preference
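
If it helps, the list above could be filled in as something like the sketch below. Every field name here is my own assumption, not a required format, and the example values are placeholders rather than real data.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CaseOutline:
    """Hypothetical container for the items listed above."""
    payload_shape: str                  # rough description is enough
    target_schema: dict                 # e.g. field name -> type name
    label_set: list
    example_text: str                   # one simple example
    expected_label: str                 # the explicit expected label for it
    ambiguous_example_text: Optional[str] = None
    ambiguity_rule: Optional[str] = None  # business rule for ambiguous handling
    delivery: str = "polling"           # or "webhook"

    def problems(self) -> list:
        """Return a list of gaps in the outline; an empty list means it is usable."""
        issues = []
        if self.expected_label not in self.label_set:
            issues.append("expected_label is not in label_set")
        if self.delivery not in ("polling", "webhook"):
            issues.append("delivery must be 'polling' or 'webhook'")
        if self.ambiguous_example_text and not self.ambiguity_rule:
            issues.append("ambiguous example given without a business rule")
        return issues


outline = CaseOutline(
    payload_shape="placeholder: single document as text",
    target_schema={"label": "string"},
    label_set=["Invoice", "PurchaseOrder", "CreditMemo"],
    example_text="placeholder document text",
    expected_label="CreditMemo",
)
```

A plain-text reply covering the same points works just as well; the structure only shows which pieces I would look for.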

What I can return:

  • whether the run ended in succeeded or failed_safe
  • a short reason if relevant
  • a receipt reference
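
As a rough sketch of what comes back, assuming a simple record shape: the two status values are the ones named above, while `RunResult`, `receipt_ref`, and `summarize` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# The two outcomes named in the post.
STATUSES = ("succeeded", "failed_safe")


@dataclass(frozen=True)
class RunResult:
    status: str
    receipt_ref: str
    reason: Optional[str] = None  # short reason, only if relevant

    def __post_init__(self):
        # Reject anything other than the two known outcomes.
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")


def summarize(result: RunResult) -> str:
    """One-line summary: status, receipt reference, and the reason if present."""
    line = f"{result.status} (receipt: {result.receipt_ref})"
    return f"{line}: {result.reason}" if result.reason else line
```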

This is still narrow Phase 1 work, not broad onboarding.

Public kit:

If you have a case like this, reply here or DM me.

Good thinking on the boundary-vs-semantic split; most people skip that. The one thing I'd be curious about is how badly it fails on edge cases like purchase orders that look like invoices. If you test with real docs, let me know what the accuracy patterns are.