Manual data entry is a “hidden tax” on every UK small business.
And right now, the stakes for this tax have never been higher.
Many businesses turn to automated extraction tools to solve this. But not all tools are created equal. While many systems claim 99% accuracy, they often fail on the “edge cases” that actually matter for your ledger.
Here are six lesser-known limitations of invoice data extraction – and how to navigate them.
1. The "Multi-Page Anchor" Problem
Most basic extraction tools work on a “per-image” basis. They see one file and assume it is one document.
However, professional service invoices (like those from solicitors or contractors) often span three or four pages. If a system doesn’t have “Contextual Anchoring,” it treats each page as a separate invoice.
The Result: You end up with three or four separate entries in your software for the same transaction. This creates “phantom expenses” that can lead to an overstatement of costs.
In 2025, research indicated that 41.4% of finance teams deal with up to 10 duplicate invoices every month due to processing errors.
Without a smart assistant that recognises page continuity, the audit trail breaks before it even begins.
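To make page continuity concrete, here is a minimal sketch of grouping scanned pages back into one document. It assumes each page has already been OCR'd into a dict with hypothetical "invoice_no" and "text" keys, and that the invoice prints a “Page 2 of 3” style footer. Real invoices vary widely, so treat this as an illustration rather than how any particular product works.

```python
import re
from collections import defaultdict

# Assumed footer format such as "Page 2 of 3"; many invoices use other wording.
PAGE_MARKER = re.compile(r"page\s+(\d+)\s+of\s+(\d+)", re.IGNORECASE)


def group_pages(pages):
    """Group OCR'd pages into documents keyed by invoice number.

    `pages` is a list of dicts with hypothetical keys "invoice_no" and "text".
    Pages without a recognisable footer are treated as page 1.
    """
    documents = defaultdict(list)
    for page in pages:
        match = PAGE_MARKER.search(page["text"])
        page_no = int(match.group(1)) if match else 1
        documents[page["invoice_no"]].append((page_no, page["text"]))
    # Restore reading order inside each document.
    return {
        invoice_no: [text for _, text in sorted(parts)]
        for invoice_no, parts in documents.items()
    }


scanned = [
    {"invoice_no": "INV-100", "text": "Fees continued. Page 2 of 3"},
    {"invoice_no": "INV-100", "text": "Solicitor fees. Page 1 of 3"},
    {"invoice_no": "INV-100", "text": "Total due. Page 3 of 3"},
    {"invoice_no": "INV-200", "text": "Single page invoice"},
]
grouped = group_pages(scanned)
print(len(grouped), "documents from", len(scanned), "pages")
```

Three of the four scanned pages collapse into a single entry for INV-100, so the ledger would receive one expense instead of three.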
2. The Mixed-Rate VAT Nightmare
UK VAT is not a single rate. A single invoice from a wholesaler might contain items at:
- 20% (Standard Rate)
- 5% (Reduced Rate)
- 0% (Zero-Rated)
Basic Optical Character Recognition (OCR) pipelines are “mathematically blind.” They typically look for the largest number on the page and label it as the “Total.”
The Limitation: If the system cannot break down the tax components line by line, the resulting record cannot show which rate applies to which amount, and the “Digital Link” required by HMRC is technically non-compliant. To meet MTD requirements, the record must show the tax breakdown correctly.
A “Smart Assistant” doesn’t just read the total; it performs a real-time mathematical check. It confirms that:
Net + VAT = Gross
If the maths doesn’t add up because of mixed rates, a basic tool will simply guess. A professional system will flag it for review.
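As a rough sketch of that check, the snippet below recomputes VAT per line from its rate and compares it with the figures printed on the invoice. The function names, the line format, and the 2p tolerance are assumptions made for illustration. Rounding conventions differ between suppliers, so a real system would need to make the tolerance configurable.

```python
from decimal import Decimal, ROUND_HALF_UP

PENNY = Decimal("0.01")


def vat_for(net, rate_percent):
    """VAT on one line, rounded to the nearest penny."""
    return (net * rate_percent / Decimal("100")).quantize(PENNY, rounding=ROUND_HALF_UP)


def check_invoice(lines, stated_vat, stated_gross, tolerance=Decimal("0.02")):
    """Return a list of problems; an empty list means the figures agree.

    `lines` is a list of (net, rate_percent) Decimal pairs.
    `tolerance` is an assumed allowance for supplier rounding differences.
    """
    net_total = sum((net for net, _ in lines), Decimal("0"))
    vat_total = sum((vat_for(net, rate) for net, rate in lines), Decimal("0"))
    problems = []
    if abs(vat_total - stated_vat) > tolerance:
        problems.append("VAT does not match line-level rates")
    if abs(net_total + stated_vat - stated_gross) > tolerance:
        problems.append("Net + VAT does not equal Gross")
    return problems


mixed_lines = [
    (Decimal("100.00"), Decimal("20")),  # standard rate
    (Decimal("40.00"), Decimal("5")),    # reduced rate
    (Decimal("25.00"), Decimal("0")),    # zero-rated
]
# Correct figures: VAT 22.00, gross 187.00.
print(check_invoice(mixed_lines, Decimal("22.00"), Decimal("187.00")))
# A tool that applied a flat 20% to everything would report VAT 33.00.
print(check_invoice(mixed_lines, Decimal("33.00"), Decimal("198.00")))
```

Note that the second case passes the simple Net + VAT = Gross test and is only caught by the line-level comparison, which is why the per-rate breakdown matters.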
3. The "Handwritten Nuance" (Field-Side Notes)
In industries like construction or hospitality, invoices rarely stay pristine. A site manager might write “Project Alpha” or “Paid via Petty Cash” across the top of a receipt.
Traditional extraction engines often treat handwriting as “noise” or “interference.” They either ignore it entirely or, worse, try to turn the handwriting into gibberish text that clutters the supplier name field.
However, those handwritten notes are often the most important piece of data for your accountant.
They provide the categorisation logic needed for correct nominal coding. If your extraction tool can’t distinguish between printed text and handwritten context, you are losing the “why” behind the spend.
4. Vendor Alias Complexity (Trading vs Legal Names)
A common point of failure is “Supplier Fragmentation.”
A business might receive an invoice from “The Red Lion,” but the bank payment is made to “JD Wetherspoon PLC.” Or an invoice says “Amazon,” but the VAT number belongs to “Amazon EU Sarl.”
The Limitation: Basic tools create a new supplier in your accounting software every time they see a slight variation in the name. This results in a “dirty” ledger with five different accounts for the same vendor.
Data from the ACCA suggests that streamlining these processes can boost labour productivity by 3% in finance-heavy sectors. A smart system uses the VAT Registration Number as the “Source of Truth,” cross-referencing it with HMRC data to ensure the vendor is mapped to the correct legal entity every time.
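The idea of keying suppliers on the VAT number rather than the printed name can be sketched as follows. The class and function names are hypothetical, the VAT number is a made-up placeholder, and the sketch deliberately leaves out the HMRC cross-reference step, which would need a real lookup service.

```python
import re


def normalise_vat_number(raw):
    """Strip spaces, punctuation and a leading 'GB' so variants compare equal."""
    cleaned = re.sub(r"[^A-Za-z0-9]", "", raw).upper()
    return cleaned[2:] if cleaned.startswith("GB") else cleaned


class SupplierRegistry:
    """Maps every name variant to one supplier record per VAT number."""

    def __init__(self):
        self._by_vat = {}

    def resolve(self, name, vat_number):
        key = normalise_vat_number(vat_number)
        # The first name seen becomes the primary name; later ones are aliases.
        supplier = self._by_vat.setdefault(
            key, {"primary_name": name, "aliases": set()}
        )
        supplier["aliases"].add(name)
        return supplier


registry = SupplierRegistry()
first = registry.resolve("The Red Lion", "GB 123 4567 89")   # placeholder number
second = registry.resolve("Red Lion Pub", "123456789")
print(first is second, sorted(second["aliases"]))
```

Both invoices resolve to the same record, so the ledger keeps one supplier account with two known aliases instead of two accounts.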
5. Line Item "Hallucinations" (Quantity vs Unit Price)
Extracting the “Header” data (Total, Date, Supplier) is relatively easy. Extracting “Line Items” is where most systems fail.
Consider an invoice that lists:
5x Widget A @ £10.00
A basic extraction tool might confuse the “5” (Quantity) with the “10.00” (Unit Price) or even the “Total” (£50.00).
In complex invoices with dozens of lines, these “hallucinations” create a reconciliation nightmare.
The Risk: For businesses using three-way matching (Invoice vs Purchase Order vs Goods Receipt), even a single-digit error in line item extraction stops the entire payment workflow.
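A simple arithmetic guard catches many of these line-item misreads before they reach the matching stage. The sketch below assumes each extracted line arrives as a (quantity, unit price, line total) tuple of Decimal values; that structure and the function name are illustrative only.

```python
from decimal import Decimal


def find_suspect_lines(lines):
    """Return the indexes of lines where quantity x unit price != line total.

    `lines` is a list of (quantity, unit_price, line_total) Decimal tuples.
    """
    return [
        index
        for index, (quantity, unit_price, line_total) in enumerate(lines)
        if quantity * unit_price != line_total
    ]


extracted = [
    # Read correctly: 5 x 10.00 = 50.00
    (Decimal("5"), Decimal("10.00"), Decimal("50.00")),
    # Line total misread into the unit price column: 5 x 50.00 != 50.00
    (Decimal("5"), Decimal("50.00"), Decimal("50.00")),
]
print(find_suspect_lines(extracted))
```

One caveat: if quantity and unit price are swapped outright, the product is unchanged and this check passes, so a swap needs an extra signal such as column position or the matching purchase order.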
6. The "Compliance Link" Fragility
Under the April 2026 MTD rules, the “Digital Link” must be unbroken.
If you extract data into a CSV file, manually edit that file in Excel to fix an error, and then upload it to your accounting software, you have broken the digital link.
Most extraction tools require this “manual middle step” because their initial accuracy isn’t high enough to trust.
True automation requires a system that allows you to verify and “fix” the data inside the secure environment before it is pushed to the ledger.
This ensures that the transition from the physical receipt to the digital record is 100% traceable for an HMRC inspector.
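One way to picture in-system correction is a record that keeps every change alongside the data instead of relying on an edited spreadsheet. This is a minimal sketch with hypothetical class and field names; it illustrates the traceability idea only and is not a statement of what HMRC accepts as a compliant digital link.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ExtractedRecord:
    """Extracted invoice fields plus an append-only correction history."""

    source_file: str
    fields: dict
    history: list = field(default_factory=list)

    def correct(self, name, new_value, user):
        """Change a field and log who changed what, and when."""
        self.history.append(
            {
                "field": name,
                "old": self.fields.get(name),
                "new": new_value,
                "user": user,
                "at": datetime.now(timezone.utc).isoformat(),
            }
        )
        self.fields[name] = new_value


record = ExtractedRecord("receipt-001.jpg", {"total": "15.00"})
record.correct("total", "51.00", user="alice")
print(record.fields["total"], len(record.history))
```

Because the original value, the new value, and the editor are stored with the record, the path from source image to ledger entry can be replayed without a manual file in the middle.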
The Solution: Moving from "Tools" to "Assistants"
The limitations listed above are why “basic” scanning apps are being replaced by Intelligent Bookkeeping Assistants such as EazyCapture.
EazyCapture isn’t just a scanner. It is a category-leading assistant designed to bridge the gap between “what’s on the paper” and “what’s in the ledger.”
- Verified Identity: Just as our Video Caller ID ensures you know exactly who is calling before you pick up, EazyCapture’s vendor verification ensures you know exactly which supplier you are paying.
- Contextual Intelligence: It handles mixed VAT rates, multi-page documents, and handwritten notes with ease.
- HMRC Ready: Built for the 2026 MTD mandate, EazyCapture maintains a perfect digital link from the moment you snap the photo to the moment it hits your software.
Stop paying the “manual labour tax.”
Join the thousands of UK businesses and landlords using the smartest assistant on the market.
Try EazyCapture today.



