The hypothesis
Mid-market freight forwarders reconcile carrier invoices manually, spending 2-3 weeks of every month comparing quoted rates to actual charges line by line. A typical 150-person forwarder processes 2,400 carrier invoices monthly across ocean, air and truck modes. The back-office team spends 280 hours a month on reconciliation and still misses rate variances that erode margins by 3-8% per shipment.
We hypothesised an agent could parse carrier invoices, match them to original quotes in the TMS, calculate variances and flag only legitimate exceptions - reducing reconciliation time from 280 hours to under 20 hours whilst eliminating the false positives that plague current automation attempts.
What we tested
We built a four-component agent system to handle the full quote-to-actual variance workflow. The Invoice Parser extracts line items, charges and references from carrier invoices in PDF and EDI formats. The Quote Matcher retrieves the original rate quote from the TMS using container numbers, booking references and service codes. The Variance Calculator compares quoted vs actual rates, applying business rules for acceptable tolerances (fuel surcharges, demurrage, currency fluctuations). The Exception Classifier determines which variances require human review versus automatic posting to accounts payable.
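As a rough illustration of the Quote Matcher step, the sketch below looks up a quote by booking reference and service code after normalising carrier-specific service labels. It is a minimal sketch under stated assumptions, not the agent's actual logic: the `Quote` fields, the `SERVICE_SYNONYMS` table and the function names are all hypothetical, and a static synonym table stands in for whatever matching the agent really performs.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional

# Hypothetical synonym table: maps carrier-specific labels to one canonical code.
# Only the CY/CY and Port-to-Port pairing comes from the write-up.
SERVICE_SYNONYMS = {
    "cy/cy": "PORT_TO_PORT",
    "port-to-port": "PORT_TO_PORT",
}


def canonical_service(label: str) -> str:
    """Normalise case and whitespace, then map known synonyms to one code."""
    key = " ".join(label.lower().split())
    return SERVICE_SYNONYMS.get(key, key.upper())


@dataclass(frozen=True)
class Quote:
    booking_ref: str  # hypothetical TMS field names
    service: str
    rate: Decimal


def find_quote(quotes: list[Quote], booking_ref: str, service_label: str) -> Optional[Quote]:
    """Return the first quote whose reference and canonical service both match."""
    wanted = canonical_service(service_label)
    for quote in quotes:
        if quote.booking_ref == booking_ref and canonical_service(quote.service) == wanted:
            return quote
    return None  # no match: the invoice would be flagged as a missing quote
```

With this shape, an invoice labelled "CY/CY" matches a quote stored as "Port-to-Port" for the same booking reference, while an unknown booking reference returns `None` and can be routed to the missing-quote queue.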
We tested against 2,400 real carrier invoices from a 4-week period: 1,680 ocean invoices (major carriers), 480 air waybill invoices (integrators and GSAs) and 240 truck invoices (regional carriers). The dataset included 387 legitimate variances requiring investigation and 2,013 invoices that should auto-match to quotes within tolerance.
The architecture
   TMS INTEGRATION ZONE                AGENT PIPELINE
┌─────────────────────────┐
│ Transport Management    │       ┌───────────────────────┐
│ System                  │       │ [1] Invoice Parser    │
│                         │       │                       │
│ • Rate quotes           │◄──────┤ • PDF extraction      │
│ • Booking references    │       │ • EDI processing      │
│ • Service codes         │       │ • Line item capture   │
│ • Container tracking    │       └───────────┬───────────┘
└─────────────────────────┘                   │
                                              ▼
┌─────────────────────────┐       ┌───────────────────────┐
│ Accounts Payable        │       │ [2] Quote Matcher     │
│                         │       │                       │
│ • Invoice posting       │◄──────┤ • Reference lookup    │
│ • GL allocation         │       │ • Service matching    │
│ • Payment processing    │       │ • Rate retrieval      │
└─────────────────────────┘       └───────────┬───────────┘
                                              ▼
┌─────────────────────────┐       ┌───────────────────────┐
│ Exception Queues        │       │ [3] Variance Calc     │
│                         │       │                       │
│ • Rate variance alerts  │◄──────┤ • Tolerance rules     │
│ • Missing quote flags   │       │ • Currency conversion │
│ • Service mismatch      │       │ • Surcharge logic     │
└─────────────────────────┘       └───────────┬───────────┘
                                              ▼
                                  ┌───────────────────────┐
                                  │ [4] Exception Class   │
                                  │                       │
                                  │ • Variance severity   │
                                  │ • Review priority     │
                                  │ • Auto-post rules     │
                                  └───────────┬───────────┘
                                              ▼
                                      MATCH / EXCEPTION
                                         /         \
                              AUTO-POST           HUMAN REVIEW
                           (to AP system)       (exception queue)
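The flow in the diagram can be sketched as a single function that wires the four stages together and returns one of the two outcomes. This is an illustrative skeleton only: the stage callables, their signatures and the `"auto_post"` / `"human_review"` labels are assumptions made for the example, not the system's real interfaces.

```python
from typing import Callable, Optional


def process_invoice(
    raw_invoice: str,
    parse: Callable[[str], dict],                 # [1] Invoice Parser (hypothetical signature)
    match_quote: Callable[[dict], Optional[dict]],  # [2] Quote Matcher
    calc_variances: Callable[[dict, dict], list],   # [3] Variance Calculator
    needs_review: Callable[[list], bool],           # [4] Exception Classifier
) -> str:
    """Run one invoice through the four stages and return its routing decision."""
    invoice = parse(raw_invoice)
    quote = match_quote(invoice)
    if quote is None:
        # No quote found: nothing to compare against, so a person has to look.
        return "human_review"
    variances = calc_variances(invoice, quote)
    return "human_review" if needs_review(variances) else "auto_post"
```

Passing the stages in as callables keeps the routing logic testable with stubs before any parser or TMS connection exists.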
What worked
The agent processed all 2,400 carrier invoices, flagging the variances that mattered whilst suppressing the false positives that lead finance teams to ignore automation alerts.
- Invoice parser handled 47 different carrier formats with 98% extraction accuracy
- Quote matcher resolved booking references across ocean, air and truck modes in 2.3 seconds average
- Variance calculator correctly applied fuel surcharge tolerances, avoiding 156 false variance flags
- Exception classifier prioritised variances of $50K+ for same-day review and batched $500-$5K variances weekly
- Zero legitimate variances were missed - all 387 actual discrepancies were flagged for review
- TMS integration posted 2,013 clean invoices directly to GL without manual intervention
What didn't work
- Regional truck carriers with handwritten PODs required manual data entry for 23 invoices
- Unavailable ECB rates stalled currency conversion, causing 4-hour processing delays
- Multi-leg shipments with split billing across carriers needed manual allocation rules
- Historical quote amendments not reflected in the TMS caused 12 false variance flags initially
What we learned
- Tolerance rules are make-or-break for variance detection. The difference between useful automation and alert fatigue lies in understanding business rules: fuel surcharges fluctuate daily, demurrage has grace periods, and currency conversion timing affects rates. Generic variance thresholds create noise.
- Quote matching requires semantic understanding, not just reference matching. Carriers use different service codes for identical services. The agent learned that "CY/CY" and "Port-to-Port" represent the same service level, preventing 89 false mismatches.
- Exception prioritisation transforms finance team efficiency. Flagging every variance equally overwhelms review capacity. Prioritising by dollar impact and customer SLA meant critical variances were resolved same-day whilst minor discrepancies were batched weekly.
- TMS integration is the value multiplier. The agent logic is sophisticated but the real ROI comes from posting clean invoices directly to accounts payable and routing exceptions to the right review queues. Without TMS connectivity, this becomes an expensive reporting tool rather than operational automation.
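The tolerance and prioritisation lessons above can be sketched together: check each charge line against a per-charge-type tolerance, then bucket whatever falls outside it by dollar size. This is a minimal sketch, not the production rules. The tolerance percentages are invented for illustration, the names are hypothetical, and only the $50K+ and $500-$5K bands come from the results; amounts outside those bands are returned as `"unspecified"` rather than guessed.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical tolerances per charge type, as a fraction of the quoted amount.
# Real values would come from the forwarder's business rules.
TOLERANCES = {
    "base_freight": Decimal("0.00"),
    "fuel_surcharge": Decimal("0.05"),
    "demurrage": Decimal("0.10"),
}
DEFAULT_TOLERANCE = Decimal("0.00")

# Review bands reported in the results; nothing is assumed between them.
SAME_DAY_MIN = Decimal("50000")
WEEKLY_MIN = Decimal("500")
WEEKLY_MAX = Decimal("5000")


@dataclass(frozen=True)
class ChargeLine:
    charge_type: str
    quoted: Decimal    # amount from the TMS quote
    invoiced: Decimal  # amount on the carrier invoice


def variance(line: ChargeLine) -> Decimal:
    """Signed variance: positive means the carrier billed more than quoted."""
    return line.invoiced - line.quoted


def within_tolerance(line: ChargeLine) -> bool:
    """True when the absolute variance is within the allowed share of the quote."""
    rate = TOLERANCES.get(line.charge_type, DEFAULT_TOLERANCE)
    return abs(variance(line)) <= rate * abs(line.quoted)


def review_bucket(variance_usd: Decimal) -> str:
    """Map an out-of-tolerance variance to a review queue by dollar size."""
    amount = abs(variance_usd)
    if amount >= SAME_DAY_MIN:
        return "same_day"
    if WEEKLY_MIN <= amount <= WEEKLY_MAX:
        return "weekly_batch"
    return "unspecified"  # band not described in the write-up


def triage(lines: list[ChargeLine]) -> list[tuple[ChargeLine, str]]:
    """Keep only out-of-tolerance lines, each paired with its review bucket."""
    return [(ln, review_bucket(variance(ln))) for ln in lines if not within_tolerance(ln)]
```

Under these assumed rates, a fuel surcharge quoted at 200 and invoiced at 208 passes silently, while the same line invoiced at 215 is flagged. Keeping tolerances in a per-charge-type table is what separates this from a single generic threshold.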
Experiment status
The variance detection logic and TMS integration are production-ready. The main gap is handling edge cases: handwritten documentation from smaller carriers, multi-currency invoices with complex conversion rules, and historical quote amendments that aren't reflected in the source TMS. These represent 8% of total volume but require custom handling rules.
Potential for our clients
This experiment applies to freight forwarders, NVOCCs and 3PLs processing 800+ carrier invoices monthly with dedicated reconciliation staff. The typical engagement scope involves mapping existing TMS quote structures, defining variance tolerance rules, and building exception routing workflows. Most implementations fit within a Workflow Sprint to establish the core matching logic and AP integration.