DocumentAI: Intelligent Document Processing for Epicor P21 ERP

Multi-Document Support

POs, invoices, sales orders, and quotes across diverse vendor formats

Template-Free AI Extraction

An agentic AI system interprets fields, context, and structure without fixed templates

Self-Improving System

Every processed document strengthens the platform's accuracy and automation

The Challenge: Manual Processing at Scale

Because the organization runs on the Epicor Prophet 21 (P21) ERP, every vendor purchase order, invoice, and trade document had to be accurately entered into P21 to trigger fulfillment, shipping, and billing. The order entry team manually opened PDFs, identified fields across dozens of unique vendor layouts, mentally translated vendor terminology into P21 field requirements, and re-entered each value into ERP screens, document after document, every day.

Format Diversity Across Vendors

Identical business concepts appeared under different labels—“Ship To,” “Delivery Address,” or “Consignee.” Part numbers, quantities, and pricing were presented inconsistently, sometimes in clearly defined columns, other times embedded in free-text descriptions. Traditional template-based OCR proved unviable since each vendor required a dedicated template that frequently broke when formats changed.

Error-Prone Manual Entry

Manual transcription introduced errors at every stage—incorrect part numbers, misread quantities, wrong unit-of-measure interpretations, and incorrect ship-to selections. These errors propagated downstream into fulfillment and shipping, leading to mis-picks, incorrect shipments, delivery errors, and costly rework.

Throughput Bottleneck

Each document required 10–20 minutes of skilled labor to process. During peak periods, backlogs grew rapidly and delayed fulfillment. Capacity scaled strictly with staffing—no mechanism existed for automation, efficiency gains, or differentiation between routine and exception transactions.

Knowledge Concentration Risk

Experienced staff accumulated deep institutional knowledge about vendor-specific formats, terminology quirks, and edge cases. This knowledge existed only in individuals' heads. Staff absences or turnover led to slower processing and higher error rates, while new hires required extensive ramp-up time.

The Solution: Four Architectural Pillars

DocumentAI was designed around four core principles, emphasizing practical automation over theoretical completeness—maximizing value for high-volume, repetitive scenarios while providing intelligent assistance for exceptions and edge cases.

1. AI-First Extraction (No Templates)

Each PDF is processed using database-configurable extraction prompts, allowing the agentic AI system to interpret document layout, field labels, tables, and line items contextually, without fixed zones or vendor-specific templates. Extraction prompts and vendor-specific hints are fully configurable through the admin interface, enabling rapid support for new formats without code changes.
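As a rough illustration of prompt assembly from stored configuration, the sketch below combines a base prompt with optional vendor hints. The dictionaries stand in for database tables, and every name here (PromptConfig, PROMPTS, VENDOR_HINTS, build_extraction_prompt, the "ACME" vendor) is a hypothetical chosen for the example, not the platform's actual schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PromptConfig:
    """One row of a hypothetical prompt-configuration table."""
    document_type: str
    base_prompt: str


# Stand-ins for database rows that would be edited through an admin interface.
PROMPTS: Dict[str, PromptConfig] = {
    "purchase_order": PromptConfig(
        "purchase_order",
        "Extract header fields and line items from this purchase order as JSON.",
    ),
}
VENDOR_HINTS: Dict[str, List[str]] = {
    "ACME": ["Part numbers appear inside the description column."],
}


def build_extraction_prompt(document_type: str, vendor: Optional[str] = None) -> str:
    """Return the base prompt, with vendor hints appended when any are stored."""
    config = PROMPTS[document_type]
    hints = VENDOR_HINTS.get(vendor, []) if vendor else []
    if not hints:
        return config.base_prompt
    hint_lines = "\n".join(f"- {hint}" for hint in hints)
    return f"{config.base_prompt}\n\nVendor-specific hints:\n{hint_lines}"
```

Because the prompt text and hints are data rather than code, supporting a difficult vendor in this sketch means adding a row to VENDOR_HINTS, not changing build_extraction_prompt.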

2. Two-Tier Field Mapping

Tier 1 captures raw vendor fields as extracted (e.g., “Ship To,” “PO #,” “Item No.”). Tier 2 defines the P21 field schema with ERP field requirements, data types, and validation rules. A dedicated Field Mappings table bridges these tiers, storing customer-specific mapping rules keyed by Bill-To customer.
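The two tiers and the bridging table can be pictured with a minimal sketch like the one below. The P21 field names, customer IDs, and parsing rules are assumptions made up for illustration; the real P21 schema and Field Mappings table are not described in that detail here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class P21Field:
    """Tier 2: a target ERP field with a requirement flag and a parser."""
    name: str
    required: bool
    parse: Callable[[str], object]


# Hypothetical Tier 2 schema (field names are illustrative, not real P21 columns).
P21_SCHEMA: Dict[str, P21Field] = {
    "ship_to_name": P21Field("ship_to_name", True, str.strip),
    "customer_po_no": P21Field("customer_po_no", True, str.strip),
    "qty_ordered": P21Field("qty_ordered", False, lambda s: int(s.replace(",", ""))),
}

# Hypothetical Field Mappings rows: (Bill-To customer, raw label) -> P21 field.
FIELD_MAPPINGS: Dict[Tuple[str, str], str] = {
    ("CUST-001", "Consignee"): "ship_to_name",
    ("CUST-001", "PO #"): "customer_po_no",
    ("CUST-001", "Qty"): "qty_ordered",
}


def map_to_p21(bill_to: str, raw_fields: Dict[str, str]):
    """Translate Tier 1 raw fields into Tier 2 values for one Bill-To customer.

    Returns (mapped values, raw labels with no mapping, required fields missing).
    """
    mapped: Dict[str, object] = {}
    unmapped: List[str] = []
    for label, value in raw_fields.items():
        target = FIELD_MAPPINGS.get((bill_to, label))
        if target is None:
            unmapped.append(label)
            continue
        mapped[target] = P21_SCHEMA[target].parse(value)
    missing = [f.name for f in P21_SCHEMA.values() if f.required and f.name not in mapped]
    return mapped, unmapped, missing
```

In this sketch the extraction side never needs to know P21 field names, and the schema side never needs to know vendor labels; only the mapping rows connect them.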

3. Customer-Aware Learning

Returning customers benefit from fully automated field mappings with no human intervention. New customers trigger a smart suggestion engine that uses string similarity algorithms and usage frequency analysis to propose likely P21 mappings with confidence scores. Every human confirmation or correction expands the learning base.
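One way to combine string similarity with usage frequency is sketched below. It uses Python's difflib.SequenceMatcher as the similarity measure and a fixed 70/30 weighting; both are assumptions for illustration, since the specific algorithm and weights used by the suggestion engine are not stated.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple

# (raw label, P21 field, times confirmed); hypothetical history from other customers.
History = List[Tuple[str, str, int]]


def suggest_mappings(
    raw_label: str,
    history: History,
    top_n: int = 3,
    similarity_weight: float = 0.7,  # assumed weighting, not from the source
) -> List[Tuple[str, float]]:
    """Rank candidate P21 fields for an unseen raw label, best first."""
    total_usage = sum(count for _, _, count in history) or 1
    scores: Dict[str, float] = {}
    for known_label, p21_field, count in history:
        similarity = SequenceMatcher(None, raw_label.lower(), known_label.lower()).ratio()
        frequency = count / total_usage
        score = similarity_weight * similarity + (1 - similarity_weight) * frequency
        # Keep the best-scoring historical label for each candidate field.
        scores[p21_field] = max(scores.get(p21_field, 0.0), score)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(name, round(score, 3)) for name, score in ranked[:top_n]]
```

Each confirmation or correction would add or increment a history row, so later calls see a larger and better-weighted pool of candidates.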

4. Database-Driven Configuration

All system behavior—extraction prompts, vendor hints, P21 field schemas, lookup rules, and mapping definitions—is driven by database-stored configuration. Business users adjust system behavior through the admin interface without developer involvement, enabling rapid adaptation to evolving needs.

Implementation: End-to-End Pipeline

Built on the Microsoft technology stack and cloud services such as Azure Durable Functions and Azure Blob Storage, DocumentAI follows a modular architecture with clear responsibilities and well-defined interfaces between layers.

Multi-Channel Document Intake

A configurable intake system ingests documents from Google Drive (OAuth 2.0 with automatic token refresh), Azure Blob Storage (blob-triggered processing), and email. Each source is independently configurable through the admin UI, with per-source statistics for document volumes, success rates, error counts, and last poll timestamps. Post-ingestion, documents are archived or removed to prevent duplicate processing.
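The archive-after-ingestion step can be illustrated with a local folder standing in for a remote source. This is only a sketch under that assumption: the function name ingest_folder and the inbox/archive layout are hypothetical, and a real Google Drive or Blob Storage source would use that service's own move or delete operation.

```python
import shutil
from pathlib import Path
from typing import Callable


def ingest_folder(inbox: Path, archive: Path, process: Callable[[Path], None]) -> int:
    """Process every PDF in `inbox`, then move it to `archive`.

    Moving the file out of the polled location is what keeps the next poll
    from seeing, and re-processing, the same document.
    """
    archive.mkdir(parents=True, exist_ok=True)
    handled = 0
    for pdf in sorted(inbox.glob("*.pdf")):
        process(pdf)
        shutil.move(str(pdf), str(archive / pdf.name))
        handled += 1
    return handled
```

The returned count is the kind of figure that could feed per-source volume statistics; a second poll of the same folder returns zero because nothing is left to pick up.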

Azure Durable Functions Processing Engine

The pipeline is built on Azure Durable Functions for long-running, fault-tolerant, and scalable workflows. A blob trigger initiates a durable orchestration that coordinates five key activities: AI extraction, field mapping, P21 validation, confidence-based status determination, and automatic P21 order creation. A dedicated queue and scheduler provide ordered execution, state tracking, and visibility into queue depth and processing times.
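The ordering of the five activities can be sketched in plain Python as shown below. This deliberately does not use the Azure Durable Functions API; it only shows the sequence and the confidence gate. The 0.90 threshold, the status names, and the function signatures are assumptions for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

AUTO_APPROVE_THRESHOLD = 0.90  # hypothetical cut-off, not taken from the source


def determine_status(confidence: float, unresolved: List[str]) -> str:
    """Route to review if anything is unresolved or confidence is too low."""
    if unresolved or confidence < AUTO_APPROVE_THRESHOLD:
        return "NeedsReview"
    return "AutoApproved"


def run_pipeline(
    blob_name: str,
    extract: Callable[[str], Tuple[Dict[str, str], float]],
    map_fields: Callable[[Dict[str, str]], Dict[str, str]],
    validate: Callable[[Dict[str, str]], List[str]],
    create_order: Callable[[Dict[str, str]], str],
) -> dict:
    """Run the five activities in order, recording which ones executed."""
    steps: List[str] = []
    raw_fields, confidence = extract(blob_name)
    steps.append("ai_extraction")
    mapped = map_fields(raw_fields)
    steps.append("field_mapping")
    unresolved = validate(mapped)
    steps.append("p21_validation")
    status = determine_status(confidence, unresolved)
    steps.append("status_determination")
    order_id: Optional[str] = None
    if status == "AutoApproved":
        # Only high-confidence, fully resolved documents create an order.
        order_id = create_order(mapped)
        steps.append("order_creation")
    return {"status": status, "order_id": order_id, "steps": steps}
```

In a durable orchestration each of these calls would be a separately scheduled activity whose result is checkpointed, so a failure partway through resumes from the last completed step rather than starting over.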

AI Vision Extraction Service

The core intelligence layer handles PDF-to-image conversion, prompt retrieval, vendor hint injection, interaction with the agentic AI system, structured response parsing, and detailed logging. It supports complex multi-page documents with line items spanning page breaks and handles nested tables, merged cells, and multi-column layouts using AI-based contextual understanding rather than positional rules.

P21 Validation & Lookup Engine

Configurable rules stored in the P21LookupConfig table define entity resolution logic for customers, ship-to locations, and items. The engine supports cascading dependencies, dynamic filtering, configurable execution order, and multiple-match handling. Single matches are applied automatically; multiple matches are presented for user selection; unresolved mandatory fields trigger human review with clear field-level indicators.
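The match-handling and cascading behavior can be sketched as follows. The RULES list stands in for rows of the P21LookupConfig table, but its columns, the sample customer and ship-to data, and the substring matching are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical reference data standing in for P21 entities.
TABLES: Dict[str, List[dict]] = {
    "customer": [
        {"id": "C1", "name": "Acme Supply"},
        {"id": "C2", "name": "Acme Tools"},
    ],
    "ship_to": [
        {"id": "S1", "customer_id": "C1", "name": "Dock 4"},
        {"id": "S2", "customer_id": "C2", "name": "Dock 4"},
    ],
}

# Hypothetical lookup rules; "order" sets execution order, "depends_on" cascades.
RULES: List[dict] = [
    {"order": 2, "entity": "ship_to", "mandatory": True, "depends_on": "customer"},
    {"order": 1, "entity": "customer", "mandatory": True, "depends_on": None},
]

Result = Tuple[str, Optional[dict]]


def classify(matches: List[dict], mandatory: bool) -> Result:
    """One match applies automatically; several need a user; none may need review."""
    if len(matches) == 1:
        return "auto_applied", matches[0]
    if len(matches) > 1:
        return "needs_selection", None
    return ("needs_review" if mandatory else "skipped"), None


def run_lookups(extracted: Dict[str, str]) -> Dict[str, Result]:
    results: Dict[str, Result] = {}
    for rule in sorted(RULES, key=lambda r: r["order"]):
        entity, parent = rule["entity"], rule["depends_on"]
        rows = TABLES[entity]
        if parent:
            parent_row = results.get(parent, ("needs_review", None))[1]
            if parent_row is None:
                # Parent unresolved: the dependent lookup cannot be filtered yet.
                results[entity] = classify([], rule["mandatory"])
                continue
            rows = [r for r in rows if r[f"{parent}_id"] == parent_row["id"]]
        wanted = extracted.get(entity, "").lower()
        matches = [r for r in rows if wanted and wanted in r["name"].lower()]
        results[entity] = classify(matches, rule["mandatory"])
    return results
```

Here "Dock 4" is ambiguous on its own, but once the customer resolves to a single record the ship-to lookup is filtered to that customer and resolves automatically.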

Conversational AI Assistant Interface

Instead of traditional form-based data entry, users interact through a chat-style interface. A side-by-side layout displays the original PDF alongside an interactive conversation where users review extracted fields, accept or modify mapping suggestions, trigger P21 validation, and create orders. The assistant recognizes intents like “show the order” or “what fields are missing” and responds with context-aware information.
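How the assistant recognizes intents is not specified, so the sketch below substitutes the simplest possible technique, keyword patterns, purely to show the shape of intent routing. The intent names, patterns, and response wording are all hypothetical; a production assistant would more plausibly delegate this to the AI model.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical keyword patterns; a stand-in for model-based intent recognition.
INTENT_PATTERNS: List[Tuple[str, "re.Pattern[str]"]] = [
    ("show_order", re.compile(r"\bshow\b.*\border\b")),
    ("missing_fields", re.compile(r"\bfields?\b.*\bmissing\b|\bmissing\b.*\bfields?\b")),
]


def detect_intent(message: str) -> str:
    """Return the first intent whose pattern matches, else 'unknown'."""
    text = message.lower()
    for intent, pattern in INTENT_PATTERNS:
        if pattern.search(text):
            return intent
    return "unknown"


def respond(message: str, order: Dict[str, Optional[str]]) -> str:
    """Answer from the current order state; None marks a field not yet filled."""
    intent = detect_intent(message)
    if intent == "show_order":
        return ", ".join(f"{key}={value}" for key, value in order.items() if value is not None)
    if intent == "missing_fields":
        missing = [key for key, value in order.items() if value is None]
        return ("Missing: " + ", ".join(missing)) if missing else "No fields are missing."
    return "Sorry, I did not understand that."
```

The point of the sketch is that responses are computed from the document's current state, which is what makes the replies context-aware rather than canned.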

Results: Measurable Operational Impact

Zero Template Dependency

New vendor formats are handled on first encounter. Format changes require no reconfiguration. Extraction accuracy improves over time through prompt refinement managed entirely via the admin UI.

Dramatic Reduction in Manual Entry

Returning customers—representing the majority of transaction volume—benefit from fully automated mappings. Staff focus shifts from routine transcription to exception handling.

Accelerated Processing Cycle

Documents begin processing within minutes rather than hours or days. High-confidence documents progress from intake to P21 order creation automatically, even during peak periods.

Improved Data Accuracy

AI-driven extraction combined with P21 validation catches entity resolution issues before order creation, preventing downstream correction cycles and reducing rework.

Compounding System Accuracy

Each processed document strengthens the cross-customer learning model. Mapping suggestions improve as historical patterns grow, resulting in fewer human corrections over time.

Business User Empowerment

All key configurations are managed through the admin UI. Business users respond to new vendors, format changes, and evolving requirements without code deployments or developer involvement.

Lessons Learned

Decouple Extraction from ERP Integration Early

The two-tier mapping architecture proved to be one of the most consequential design decisions. By separating how vendors format documents from how the ERP expects data, the system avoids tight coupling that would make every format change a development task. New document types were added by defining new schema entries and mapping rules—without changes to the extraction layer. This architecture also positions the platform for potential integration with ERP systems beyond P21.

Let AI Handle Format Variation; Let Humans Handle Exceptions

Attempting to codify every vendor format into rules, templates, or positional logic is a fundamentally losing strategy. AI-powered extraction handles format diversity naturally using the same contextual understanding a human reader applies. The confidence-based routing system ensures human expertise is applied precisely where it adds the most value—not on every document, but only on uncertain ones.

Database-Driven Configuration is Non-Negotiable

Systems that embed AI prompts, mapping rules, or configuration in source code create an unsustainable dependency on developers for routine adjustments. When a vendor's documents proved difficult, an operations team member added vendor-specific hints through the UI and saw immediate improvement—without filing a dev ticket, waiting for code changes, or scheduling a deployment.

Invest in Observability from Day One

The queue monitor, processing history dashboard, extraction logs, and per-source intake statistics were developed alongside the core pipeline, not as afterthoughts. Observability was critical for diagnosing issues during initial deployment, and it continues to build stakeholder confidence by making the system transparent and to enable continuous improvement through trend monitoring.

Design for Learning, Not Just Processing

A static system delivers the same level of automation on day one as on day one thousand. A learning system delivers increasing value over time—every human correction improves future suggestions, every approved mapping becomes reusable, and every new customer enriches the global learning pool. The operational cost per document decreases as the system accumulates knowledge, creating compounding efficiency gains.

The Bottom Line

DocumentAI transformed a labor-intensive, error-prone manual workflow into a largely automated, self-improving system. The platform is fully operational today, processing documents across multiple intake channels with a growing library of customer-specific learned mappings.

The organization has significantly reduced manual effort, improved data accuracy, and positioned itself to scale document processing capacity without proportional increases in staffing—delivering measurable operational and business impact.

For industrial distributors running on Epicor P21 (or any ERP platform), the lesson is clear: AI-powered document processing is no longer experimental. With the right architecture (template-free extraction, decoupled mapping, customer-aware learning, and database-driven configuration), organizations can eliminate manual document entry as an operational bottleneck and turn document processing into a competitive advantage.
