Actual Capture Paper Receipts

Overview

Actual Capture transforms photos of paper receipts into clean, structured purchase data in real time.

Using machine learning and OCR models, the system extracts:

  • Merchant details
  • Itemized products
  • Subtotals, tax, and totals
  • Payment method
  • Date, time, and transaction metadata

The result is reliable, production-ready data that can power rewards, loyalty, validation, analytics, and monetization use cases.

Flexible Integration Options

Receipt capture is available through:

  • BlinkReceipt SDK — on-device processing for iOS and Android
  • Scan API — server-side processing for any platform

Both options use the same underlying extraction models and return results in real time. Whether you prefer client-side capture or server-side ingestion, the output structure and quality signals remain consistent.

How It Works

Receipt processing follows a multi-stage pipeline, whether the image originates from the SDK camera or the API:

1. Image capture

Receipts are captured via the SDK’s built-in camera interface, imported from the device photo gallery, or submitted to the API as an image file or publicly accessible URL.

2. Extraction engine

Language-specific OCR models (built internally for each supported language) extract raw text from the receipt image.

3. Data structuring

Extracted text is organized into:

  • Trip-level fields: merchant, date, total, tax, payment method
  • Basket-level fields: product descriptions, quantities, prices, discounts, unit of measurement
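As an illustration of the trip-level and basket-level split, here is a minimal Python sketch. The class and attribute names are hypothetical and chosen to mirror the field list above; they are not the actual response schema. It also assumes that price is a per-unit amount and that discount applies to the whole line.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BasketItem:
    """One line item on the receipt (basket-level fields). Names are hypothetical."""
    description: str
    quantity: float
    price: float                      # assumed to be the per-unit price
    discount: float = 0.0             # assumed to apply to the whole line
    unit_of_measurement: Optional[str] = None


@dataclass
class Trip:
    """Receipt-wide values (trip-level fields) plus the basket. Names are hypothetical."""
    merchant: str
    date: str                         # ISO 8601 date string, e.g. "2024-05-01"
    total: float
    tax: float
    payment_method: Optional[str] = None
    basket: List[BasketItem] = field(default_factory=list)

    def basket_subtotal(self) -> float:
        """Sum of line amounts after discounts, before tax."""
        return round(sum(i.quantity * i.price - i.discount for i in self.basket), 2)


trip = Trip(
    merchant="Example Market",
    date="2024-05-01",
    total=7.49,
    tax=0.50,
    payment_method="card",
    basket=[
        BasketItem("Milk 1L", 2, 1.99),
        BasketItem("Bread", 1, 3.51, discount=0.50),
    ],
)
```

Keeping the basket as a list inside the trip makes simple consistency checks possible, for example comparing basket_subtotal() plus tax against the extracted total.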

4. Merchant identification

The system uses machine learning models, phone number and address lookups, string matching, tax identification numbers, and other signals to identify the merchant. In the U.S., results are also cross-referenced against a database of more than 250,000 store locations.
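To make the idea of combining signals concrete, the following Python sketch tries an exact phone-number lookup first and falls back to fuzzy string matching on the merchant name. It is only an illustration of two of the signals listed above, not the production logic. The lookup tables, function names, and the 0.8 similarity cutoff are all assumptions.

```python
import difflib
import re
from typing import Dict, List, Optional

# Hypothetical reference data standing in for a real store database.
PHONE_TO_MERCHANT: Dict[str, str] = {"5551234567": "Example Market"}
KNOWN_MERCHANTS: List[str] = ["Example Market", "Sample Pharmacy", "Demo Hardware"]


def normalize_phone(raw: str) -> str:
    """Keep digits only so '(555) 123-4567' and '555.123.4567' compare equal."""
    return re.sub(r"\D", "", raw)


def identify_merchant(header_text: str, phone: Optional[str] = None) -> Optional[str]:
    """Try an exact phone lookup first, then fall back to fuzzy name matching."""
    if phone:
        hit = PHONE_TO_MERCHANT.get(normalize_phone(phone))
        if hit:
            return hit
    # Compare in lowercase so OCR casing differences do not affect the score.
    by_lower = {name.lower(): name for name in KNOWN_MERCHANTS}
    matches = difflib.get_close_matches(
        header_text.lower(), list(by_lower), n=1, cutoff=0.8  # assumed cutoff
    )
    return by_lower[matches[0]] if matches else None
```

Checking the most specific signal first keeps the result stable when OCR garbles the merchant name but reads the phone number correctly.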

5. Quality and fraud checks

Dedicated detection models assess:

  • Whether the image is a valid receipt
  • Whether it is blurry
  • Whether it was captured from a digital screen
  • Whether the receipt appears fraudulent or has been previously submitted

6. Enrichment

Actual Enrichment matches extracted product data against a unified product catalog to return standardized UPCs, brand names, categories, and product attributes. Enrichment is included by default for all receipts processed in the U.S. See Understanding Actual Enrichment for details.

Integration Options: SDK and API

Two integration paths are available. Both run the same extraction models and return results in real time.

                | BlinkReceipt SDK | Scan API
Platforms       | iOS, Android | Any platform (REST API)
Processing      | On-device | Server-side
Camera UI       | Built-in camera interface with real-time feedback | Not included; you provide the image
Long receipts   | Multi-frame capture with guided alignment | Multi-frame submission via sequential API calls
Best for        | Consumer apps, loyalty platforms, promotions | Backend systems, web apps, batch processing, high-volume workflows

There are no differences in commercial terms whether you use the SDK, API, or both. Both are updated monthly.

Receipt Quality and Validation

The system returns properties for assessing image quality and receipt validity. These can drive user-facing feedback or backend business rules.

Property        | What it tells you
is_receipt      | ML model trained on physical till-slip receipts determines whether the image is a valid receipt
is_blurry       | Image is too blurry for reliable extraction
is_screen       | Image appears to be a photo of a digital screen (e.g., computer-generated receipt)
is_fraudulent   | Receipt appears manipulated or counterfeit (U.S. only, select banners)
is_duplicate    | Receipt matches a previously submitted transaction (2-week window, scoped to your instance)
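One way these properties could drive user-facing feedback is sketched below in Python. The property names come from the table above. The assumption that they arrive as plain booleans in a dictionary, the function name, and the message wording are all illustrative.

```python
from typing import List, Mapping, Optional, Tuple

# Checked in order; the first rule that matches determines the message.
# Each entry is (property name, value that indicates a problem, message).
RETAKE_RULES: List[Tuple[str, bool, str]] = [
    ("is_receipt", False, "We could not find a receipt in this photo."),
    ("is_screen", True, "Please photograph the paper receipt, not a screen."),
    ("is_blurry", True, "The photo is blurry. Hold the phone steady and try again."),
]


def retake_prompt(flags: Mapping[str, bool]) -> Optional[str]:
    """Return a message asking the user to retake the photo, or None if no rule fires."""
    for name, problem_value, message in RETAKE_RULES:
        # A missing property yields None, which matches neither True nor False.
        if flags.get(name) == problem_value:
            return message
    return None
```

For example, retake_prompt({"is_receipt": True, "is_blurry": True, "is_screen": False}) returns the blur message, while a response with no problem flags returns None and can proceed to backend rules.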

What counts as a receipt: The system is trained on till-slip receipts from retail merchants. Larger documents such as A4-sized invoices or hotel bills are not supported.

Fraud detection: Fraudulent receipts fall into two categories: manipulated (altered content) and counterfeit (fabricated). This signal is designed as one input into a broader fraud strategy. We recommend layering it with user-level rules such as spend thresholds and submission frequency.
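The layering described above can be sketched as a small rule function in Python. Only is_fraudulent comes from the response. The function name, the other inputs, and both threshold values are hypothetical and would be tuned per program.

```python
from typing import List


def review_reasons(
    is_fraudulent: bool,
    receipt_total: float,
    receipts_last_24h: int,
    max_total: float = 500.0,  # hypothetical spend threshold
    max_per_day: int = 10,     # hypothetical submission-frequency limit
) -> List[str]:
    """Return every rule that fired; an empty list means no review is needed."""
    reasons: List[str] = []
    if is_fraudulent:
        reasons.append("fraud_signal")
    if receipt_total > max_total:
        reasons.append("spend_threshold")
    if receipts_last_24h >= max_per_day:
        reasons.append("submission_frequency")
    return reasons
```

Returning all reasons rather than a single yes/no lets a backend decide how to weigh them, for instance rejecting outright on the fraud signal but only queueing for manual review on a frequency hit.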

Duplicate detection: Uses proprietary algorithms that take into account trip and basket data to determine whether a receipt has been seen before.
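The actual duplicate detection is proprietary and not described here. Purely to illustrate the general idea of matching on trip and basket data within a time window, the following Python sketch uses a simple fingerprint hash instead. Every name in it is hypothetical, and it assumes the 2-week window is measured between submission times.

```python
import hashlib
from datetime import datetime, timedelta
from typing import Dict, Iterable, Tuple

WINDOW = timedelta(days=14)  # mirrors the 2-week window; how it is measured is an assumption


def fingerprint(
    merchant: str, date: str, total: float, items: Iterable[Tuple[str, float]]
) -> str:
    """Hash normalized trip fields plus sorted basket lines into one stable key."""
    parts = [merchant.strip().lower(), date, f"{total:.2f}"]
    # Normalize before sorting so item order and casing do not change the hash.
    parts += sorted(f"{desc.strip().lower()}:{price:.2f}" for desc, price in items)
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


class DuplicateChecker:
    """In-memory store of fingerprints; one instance per customer scope."""

    def __init__(self) -> None:
        self._seen: Dict[str, datetime] = {}

    def is_duplicate(self, fp: str, submitted_at: datetime) -> bool:
        previous = self._seen.get(fp)
        # Always record the latest submission time for this fingerprint.
        self._seen[fp] = submitted_at
        return previous is not None and submitted_at - previous <= WINDOW
```

An exact hash like this misses near-duplicates, such as the same receipt photographed twice with slightly different OCR output, which is one reason a production system would need something more tolerant.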

Supported Markets and Languages

Markets: United States, Canada, Mexico, Brazil, United Kingdom, Germany, Spain, Poland, Sweden, Australia, France, Belgium.

Languages: Custom OCR models are built for English, Spanish, French, German, Polish, Portuguese, and Swedish. Only Latin characters are currently supported.

Launching in a new market:

  • Non-English-speaking markets typically take 60–90 days to launch. This includes building a product catalog for that market, creating a ground-truth regression set, and thoroughly testing the text processing logic.
  • Unsupported English-speaking markets take less than 30 days.

Merchant detection is generic rather than configured per merchant, which allows it to scale across markets.

Accuracy and Performance

Processing speed: Both the SDK and the API return results in real time, ranging from under a second to 4–5 seconds depending on network conditions.

Confidence properties are included in every response. See Understanding Confidence & Accuracy for interpretation guidance.

Privacy and Security

  • SDK processing runs entirely on-device.
  • API processing runs on AWS. U.S. data is stored in Virginia; European data is stored in the EU in compliance with GDPR.
  • Transport security: HTTPS with TLS, client-side certificate pinning, and symmetric encryption of credentials.
