This is one entry from Wavesteam Technology's AI solution library. For handwritten order recognition, we benchmarked general OCR APIs, multimodal large models, and a domain-specific OCR model, then delivered a hybrid production pipeline. On messy handwriting, folded paper, cross-line corrections, and overlapping fields, field-level accuracy improved from 68.4% to 96.1%.
The customer is a fast-moving consumer goods distributor serving hundreds of retail outlets. Orders arrive as handwritten paper forms and must be entered into ERP for fulfillment and reconciliation. The actual paper conditions are far messier than a clean demo sample.
Different sales reps write with different styles, pressure, angles, cross-line notes, corrections, signatures, and overwritten fields. Traditional OCR often fails at the full-row level.
Quantity, unit price, product name, and remarks are often mixed together. Product shorthand must be normalized against business vocabulary and SKU context.
The team had to process 1,200–1,800 orders per day within a narrow window. Two operators were working overtime and still produced avoidable entry errors.
A wrong quantity, price, or customer name affects fulfillment, reconciliation, and month-end settlement. The impact is not inconvenience; it is real financial exposure.
The sample below shows an anonymized handwritten order from a real workflow. We place the original image, visual recognition overlay, and final structured JSON side by side so stakeholders can see what the AI actually does.


Creases shift field positions, but all 12 rows are correctly aligned.
Crossed-out prices and handwritten replacements are interpreted with an audit trail.
Remarks overlapping the price column are reassigned through structured post-processing.
For production AI, we do not choose a model first. We benchmark practical options against real samples, then design a hybrid pipeline where each model handles the part it is best suited for.
The delivered system uses a domain OCR primary path, a multimodal semantic correction path, and cloud OCR fallback. Most routine orders finish in under one second. Low-confidence fields are routed to the multimodal model, and poor-quality samples or service failures fall back automatically with manual-review flags.
This structure balances accuracy, cost, latency, and controllability. The engineering principle is simple: use software architecture to turn model uncertainty into business certainty.
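The routing logic described above can be illustrated with a minimal sketch. It is not the delivered implementation: the names `FieldResult`, `route_field`, `semantic_correct`, and `cloud_fallback`, and the threshold of 0.85, are hypothetical, and flagging a field whose confidence stays low after correction is an assumption on our part.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical threshold; a real value would be tuned on held-out samples.
CONFIDENCE_THRESHOLD = 0.85


@dataclass
class FieldResult:
    name: str
    value: str
    confidence: float
    source: str = "domain_ocr"   # which inference path produced the value
    needs_review: bool = False   # manual-review flag


def route_field(
    field: FieldResult,
    semantic_correct: Callable[[FieldResult], FieldResult],
    cloud_fallback: Callable[[FieldResult], FieldResult],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> FieldResult:
    """Pick an inference path for one field based on its confidence."""
    if field.confidence >= threshold:
        return field  # routine case: keep the primary OCR result

    try:
        corrected = semantic_correct(field)
    except Exception:
        # Service failure: fall back automatically and flag for review.
        fallback = cloud_fallback(field)
        fallback.needs_review = True
        return fallback

    # Assumption: a field that is still uncertain goes to a human.
    if corrected.confidence < threshold:
        corrected.needs_review = True
    return corrected
```

Passing the two secondary paths in as callables keeps the router independent of any particular model or vendor, so each path can be swapped or stubbed in tests.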
We treat AI as a pipeline, not a black box. Every step has a defined responsibility, input, output, and fallback strategy.
Images enter through mobile, scanners, or forms. Distortion, shadow, white balance, and layout are normalized before recognition.
A layout-aware detector identifies headers, rows, columns, and field roles before recognition.
GS-OCR-Hand v2 is fine-tuned on real handwritten samples. Low-confidence fields are routed forward for semantic review.
For corrections, cross-line notes, and context-heavy fields, a multimodal model reads against SKU dictionaries and historical order context.
The output is normalized with unit conversion, price-range checks, customer matching, total checks, and audit logs.
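As an illustration of the post-processing step, the sketch below shows unit conversion, a price-range check, and a total check on a recognized order. The unit table, price ranges, SKU code, and the names `OrderLine`, `CheckReport`, and `check_order` are all hypothetical; the real rules would come from the customer's master data.

```python
from dataclasses import dataclass, field
from decimal import Decimal

# Hypothetical pack sizes: how many base units one written unit represents.
UNIT_TO_BASE = {"bottle": 1, "pack": 6, "case": 24}

# Hypothetical allowed price range per base unit, keyed by SKU.
PRICE_RANGE = {"SKU-001": (Decimal("2.00"), Decimal("4.00"))}


@dataclass
class OrderLine:
    sku: str
    quantity: int
    unit: str
    unit_price: Decimal  # price per base unit


@dataclass
class CheckReport:
    base_quantities: list = field(default_factory=list)
    issues: list = field(default_factory=list)


def check_order(lines, written_total: Decimal) -> CheckReport:
    """Convert units, check prices against ranges, and verify the total."""
    report = CheckReport()
    computed = Decimal("0")
    for i, line in enumerate(lines, start=1):
        factor = UNIT_TO_BASE.get(line.unit)
        if factor is None:
            report.issues.append(f"row {i}: unknown unit {line.unit!r}")
            report.base_quantities.append(None)
            continue
        base_qty = line.quantity * factor
        report.base_quantities.append(base_qty)

        low, high = PRICE_RANGE.get(line.sku, (None, None))
        if low is not None and not (low <= line.unit_price <= high):
            report.issues.append(
                f"row {i}: price {line.unit_price} outside {low}-{high}"
            )
        computed += line.unit_price * base_qty

    if computed != written_total:
        report.issues.append(
            f"total mismatch: computed {computed}, written {written_total}"
        )
    return report
```

Using `Decimal` rather than floats avoids rounding drift in the total check. In this sketch every issue is recorded rather than raised, so a reviewer would see all problems on a form at once.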
Five business channels with idempotency, rate limits, and data masking.
Three inference paths plus a confidence-aware router.
Connects OCR output to ERP, reconciliation, and review workflows.
Online corrections flow back into datasets so the model improves monthly.
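Idempotent intake, mentioned for the business channels above, can be sketched as follows. This is one possible approach, not necessarily the one used in the delivered system: it assumes the idempotency key is a SHA-256 hash of the uploaded image bytes, and `IntakeStore` is a hypothetical in-memory stand-in for a persistent store.

```python
import hashlib


def idempotency_key(image_bytes: bytes) -> str:
    """Derive a stable key from the uploaded content (assumed scheme)."""
    return hashlib.sha256(image_bytes).hexdigest()


class IntakeStore:
    """In-memory stand-in for a persistent job store (hypothetical)."""

    def __init__(self):
        self._jobs = {}

    def submit(self, image_bytes: bytes):
        """Return (job_id, created). A repeated upload reuses the first job."""
        key = idempotency_key(image_bytes)
        if key in self._jobs:
            return self._jobs[key], False
        job_id = f"job-{len(self._jobs) + 1}"
        self._jobs[key] = job_id
        return job_id, True
```

A content hash only catches byte-identical retries, such as a mobile client resending after a timeout. A rescan of the same paper form produces different bytes, so catching that case would need business-level matching on fields such as customer and date.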
We deliver AI systems as accepted, measurable engineering projects. Before launch, the work is milestone-based; after launch, the data loop keeps improving the model.
Walk through the real order flow with business and IT stakeholders.
Collect 4,600 real forms and build the first training and evaluation sets.
Run cloud OCR, multimodal reading, and custom OCR on the same samples.
Build confidence routing, semantic correction, fallback, and pressure tests.
Run AI and human entry in parallel at one warehouse for reconciliation.
Roll out to six warehouses and sign off against KPI targets.
Online errors flow back into the sample store for incremental improvement.
Instead of vague claims, we use same-sample before-and-after metrics and customer feedback to show whether the system solved the real problem.
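For readers who want to reproduce a same-sample comparison, the sketch below shows one way to compute field-level accuracy. The exact-match definition is an assumption, since the scoring rule behind the reported figures is not stated here, and `field_accuracy` is a hypothetical helper.

```python
def field_accuracy(predicted, truth):
    """Share of ground-truth fields whose predicted value matches exactly.

    `predicted` and `truth` are parallel lists of dicts, one dict per form,
    mapping field name to value. A missing predicted field counts as wrong.
    """
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must cover the same forms")
    total = 0
    correct = 0
    for pred_form, true_form in zip(predicted, truth):
        for name, expected in true_form.items():
            total += 1
            if pred_form.get(name) == expected:
                correct += 1
    return correct / total if total else 0.0
```

Running the same function over the old and new system outputs, against the same labeled forms, gives a like-for-like before-and-after figure.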
“Month-end reconciliation used to be our biggest headache. Since this OCR system went live, orders from six warehouses are basically scanned, structured in seconds, and written into ERP. More importantly, the models and data stay in our private cloud.”
IT Director · FMCG distributor in South China
The hybrid OCR + multimodal correction + fallback pattern can be reused for forms, tickets, handwritten logs, and business documents where structure matters.
Stock replenishment forms can go directly into inventory systems.
Sales teams can scan slips into monthly ledgers.
Robust handling for outdoor stains, folds, and handwriting.
Sensitive units and dosage fields can be checked against dictionaries.
Signatures and remarks can be separated from operational fields.
Supports local deployment and end-to-end audit trails.
We are a software engineering and AI implementation team. Over the past three years, we have moved more than 20 AI scenarios from promising demo to stable production operation.

We unpack the workflow with you, judge whether AI is worth using and which approach makes the most sense, then come back within 5 business days with a practical initial plan and estimate.