Processing Identity Documents in Laravel

If you work in any kind of business that requires you to verify a user's identity, you might have noticed that most solutions can be quite pricey. In this blog post I'll go over the solution we came up with, which makes this process completely free (depending on your configuration) and integrates it into your own application.

I'll also go over how some parts of it work, how you can use it to handle document processing for your own use cases, and a cost comparison.

If you'd rather skip the explanation and go straight to the implementation, take a look at the package here; its readme should be sufficient to get you started.

A Short Introduction to Identity Documents

What helps us most in processing documents is that most countries in the world issue Machine Readable Travel Documents (MRTDs) that follow standards set by the ICAO. If you wish to read more about the specifications you can do so here, but what matters most for us is that every MRTD contains a Machine Readable Zone (MRZ). The format of the MRZ depends on the document's size, of which there are five. What all MRZ formats have in common is that they are designed to provide exactly what we need: a part of the document that is easy for machines to read (only the letters A-Z, the digits 0-9 and the filler character <) and that contains the most important information about the document. A few things that the MRZ will include are:

  • Document holder's name
  • Document holder's nationality
  • Document holder's date of birth
  • Document holder's sex
  • Document holder's personal number (TD3 only)
  • Document number
  • Issuing country
  • Document expiry date
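To make the list above concrete, here is a minimal sketch of how these fields can be cut out of a two-line passport (TD3) MRZ by column position. The function name and array keys are hypothetical and not part of the package; the column positions follow the ICAO 9303 TD3 layout, and the sample data is the specimen passport from that specification.

```php
<?php

// Hypothetical helper, not part of the package: split a TD3 (passport) MRZ
// into named fields using the fixed column positions from ICAO Doc 9303.
function parseTd3Fields(string $line1, string $line2): array
{
    // Names fill the rest of line 1 as SURNAME<<GIVEN<NAMES, padded with '<'.
    $nameParts = explode('<<', rtrim(substr($line1, 5), '<'), 2);
    [$surname, $givenNames] = array_pad($nameParts, 2, '');

    return [
        'document_type'   => rtrim(substr($line1, 0, 2), '<'),
        'issuing_country' => substr($line1, 2, 3),
        'surname'         => str_replace('<', ' ', $surname),
        'given_names'     => str_replace('<', ' ', $givenNames),
        'document_number' => rtrim(substr($line2, 0, 9), '<'),
        'nationality'     => substr($line2, 10, 3),
        'date_of_birth'   => substr($line2, 13, 6), // YYMMDD
        'sex'             => substr($line2, 20, 1),
        'expiry_date'     => substr($line2, 21, 6), // YYMMDD
        'personal_number' => rtrim(substr($line2, 28, 14), '<'),
    ];
}

$fields = parseTd3Fields(
    str_pad('P<UTOERIKSSON<<ANNA<MARIA', 44, '<'),
    'L898902C36UTO7408122F1204159ZE184226B<<<<<10'
);

echo $fields['surname'], ', ', $fields['given_names'], PHP_EOL; // ERIKSSON, ANNA MARIA
```

The other formats use different line lengths and column positions, so a real parser needs one such layout per format.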

Another important part of the MRZ is the existence of check digits. These digits exist to mathematically verify the information read from the MRZ. You can read more about how the check digits work here (page 20), but for us it's just important to know that they exist and that we can verify data parsed from the MRZ.
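For the curious, the check digit calculation itself is small. The sketch below follows the ICAO 9303 rule: digits keep their value, A-Z map to 10-35, the filler < counts as 0, each value is multiplied by a repeating weight of 7, 3, 1, and the sum modulo 10 is the check digit. The function name is made up for this example and is not the package's API.

```php
<?php

// Illustrative helper (hypothetical name): compute the ICAO 9303 check digit
// for a string of MRZ characters.
function mrzCheckDigit(string $field): int
{
    $weights = [7, 3, 1];
    $sum = 0;

    foreach (str_split($field) as $i => $char) {
        if (ctype_digit($char)) {
            $value = (int) $char;                 // 0-9 keep their value
        } elseif (ctype_upper($char)) {
            $value = ord($char) - ord('A') + 10;  // A=10 ... Z=35
        } else {
            $value = 0;                           // filler '<' counts as 0
        }

        $sum += $value * $weights[$i % 3];
    }

    return $sum % 10;
}

// Values from the ICAO specimen passport:
echo mrzCheckDigit('L898902C3'), PHP_EOL; // 6 (document number)
echo mrzCheckDigit('740812'), PHP_EOL;    // 2 (date of birth)
echo mrzCheckDigit('120415'), PHP_EOL;    // 9 (expiry date)
```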

Finding and Parsing the MRZ

I'll start off by saying that there are multiple ways to approach the issue of extracting information from an image reliably. However, my main focus was to make this solution both easy to use and flexible. This rules out any machine vision that would specifically detect the MRZ through AI. Instead, I decided to create an algorithm that can detect the MRZ in text. This means that you can use any OCR API or solution that you prefer (more about that later), and simply let the package do the rest of the work.

So then how can we detect an MRZ in text? Let's start with what we know about MRZs. First of all, every MRZ starts with a document key. This key can be P, I, A, C or V and indicates the type of document: P is a passport, V is a visa, and the remaining keys indicate an official travel document (usually an identity card). Finding one of these keys gives us a starting point for finding the complete MRZ.
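As a rough illustration of this first step, the sketch below collects the offset of every character that could be a document key. The function name is hypothetical, and stripping all whitespace from the OCR text first is an assumption made here so that later offsets refer to one continuous string; the package may normalize its input differently.

```php
<?php

// Hypothetical sketch: return the offset of every character in the (already
// whitespace-free) text that could be a document key: P, I, A, C or V.
function findDocumentKeyCandidates(string $text): array
{
    preg_match_all('/[PIACV]/', $text, $matches, PREG_OFFSET_CAPTURE);

    // Each match is [character, offset]; keep only the offsets.
    return array_column($matches[0], 1);
}

// Assumption: drop whitespace so offsets are relative to one continuous string.
$text = preg_replace('/\s+/', '', "PASSPORT\nP<UTO");

echo implode(', ', findDocumentKeyCandidates($text)), PHP_EOL; // 0, 1, 4, 8
```

Most candidates are false positives (here only offset 8 is a real key), which is why the next steps filter them using the check digits.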

Once we've found a number of possible document keys in the complete text, we can iterate over them to see whether they are indeed the start of an MRZ. We do this using the next important thing we know about every MRZ: check digits. For every document (and therefore MRZ) format we know the positions of these digits relative to the document key. For example, a passport (TD3 format) has check digits at positions 53, 63, 71, 86 and 87. To shorten our list of possible document keys quickly, we first check whether the characters in those positions are actually digits (or the letter O, a common OCR error for 0). If they are all digits, the document key passes this first rudimentary test and progresses to a more thorough and robust test.
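A minimal sketch of this quick filter for TD3 could look as follows. The positions 53, 63, 71, 86 and 87 match the ICAO layout when counted from zero with the two 44-character lines joined into one string, so that is how they are read here; the constant and function names are made up for the example.

```php
<?php

// Zero-based check digit offsets for TD3, relative to the document key
// (assumes the two MRZ lines are joined without a line break).
const TD3_CHECK_DIGIT_OFFSETS = [53, 63, 71, 86, 87];

// Hypothetical helper: cheap first test for a candidate document key.
function looksLikeTd3Start(string $text, int $keyOffset): bool
{
    foreach (TD3_CHECK_DIGIT_OFFSETS as $offset) {
        // Fall back to '' when the text is too short to hold a full MRZ.
        $char = $text[$keyOffset + $offset] ?? '';

        // Accept digits, plus 'O', which OCR often produces instead of '0'.
        if (!ctype_digit($char) && $char !== 'O') {
            return false;
        }
    }

    return true;
}

$mrz = str_pad('P<UTOERIKSSON<<ANNA<MARIA', 44, '<')
    . 'L898902C36UTO7408122F1204159ZE184226B<<<<<10';

var_dump(looksLikeTd3Start($mrz, 0));       // bool(true)
var_dump(looksLikeTd3Start('PASSPORT', 0)); // bool(false)
```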

The next thing we'll do is the actual mathematical test for each check digit. We once again know the check digit's position, but we also know the position of the substring that this digit covers. If we take the characters at these positions relative to the key, we can use the check digit algorithm to verify the check digit. If all check digits pass, we can confidently assume we have found the correct document key. After this, all that's left is to take a substring of our full OCR text using the found position and the known MRZ length (which varies per format).
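Put together, the thorough test and the final extraction can be sketched like this for TD3. This is an illustration under stated assumptions, not the package's code: the names are hypothetical, the text is assumed to be free of whitespace, and the ranges are the ICAO 9303 TD3 fields expressed as zero-based offsets from the document key.

```php
<?php

const TD3_LENGTH = 88; // two lines of 44 characters

// Each entry: the [start, length] ranges that are checked, followed by the
// offset of the check digit that covers them.
const TD3_CHECKS = [
    [[[44, 9]], 53],                     // document number
    [[[57, 6]], 63],                     // date of birth
    [[[65, 6]], 71],                     // expiry date
    [[[72, 14]], 86],                    // personal number
    [[[44, 10], [57, 7], [65, 22]], 87], // composite over line 2
];

// Hypothetical helper: return the MRZ starting at $keyOffset when every
// check digit verifies, or null otherwise.
function extractTd3(string $text, int $keyOffset): ?string
{
    $mrz = (string) substr($text, $keyOffset, TD3_LENGTH);
    if (strlen($mrz) < TD3_LENGTH) {
        return null;
    }

    foreach (TD3_CHECKS as [$ranges, $digitOffset]) {
        $data = '';
        foreach ($ranges as [$start, $length]) {
            $data .= substr($mrz, $start, $length);
        }

        // ICAO check digit: weights 7, 3, 1; A-Z = 10-35; '<' = 0.
        $sum = 0;
        foreach (str_split($data) as $i => $char) {
            $value = ctype_digit($char)
                ? (int) $char
                : (ctype_upper($char) ? ord($char) - 55 : 0);
            $sum += $value * [7, 3, 1][$i % 3];
        }

        // Treat an OCR'd 'O' in a check digit position as '0'.
        $found = str_replace('O', '0', $mrz[$digitOffset]);
        if ((string) ($sum % 10) !== $found) {
            return null;
        }
    }

    return $mrz;
}

$ocrText = "REPUBLIC OF UTOPIA\n"
    . str_pad('P<UTOERIKSSON<<ANNA<MARIA', 44, '<') . "\n"
    . "L898902C36UTO7408122F1204159ZE184226B<<<<<10\n";
$flat = preg_replace('/\s+/', '', $ocrText);

// The real document key sits right after the 16 characters of the header.
var_dump(extractTd3($flat, 16) !== null); // bool(true)
```

The other formats would need their own length and check table, which is why the layout is kept as data rather than hard-coded in the loop.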

Choosing Your OCR Service

Having an MRZ-finding algorithm that works on any text gives you the huge advantage of being able to use pretty much any OCR service you wish. This means that you can use an external API like Google Vision, or run your own OCR solution through something like Tesseract. With this flexibility in mind, the package includes a service class for the Google Vision API, but also commands to create your own service class. I've personally tested this with Tesseract running locally on my PC, and although the results were not as accurate as the Google API (due to my lack of knowledge about Tesseract), it did show me how easily I could set up an alternative service.

More information on how to set up your own custom service can be found in the readme, but to demonstrate how easy it can be, let's go through building a Tesseract service together.

Building a Tesseract OCR Service

The first thing we need is obviously our main package, which we'll install using:

$ composer require werk365/identitydocuments

After that, let's create a new OCR service:

$ php artisan id:service Tesseract OCR

This will create a new Tesseract class in your App\Services namespace, which will look like this:

<?php

namespace App\Services;

use Intervention\Image\Image;
use Werk365\IdentityDocuments\Interfaces\OCR;
use Werk365\IdentityDocuments\Responses\OcrResponse;

class Tesseract implements OCR
{
    public function ocr(Image $image): OcrResponse
    {
        // TODO: Add OCR and return text
        return new OcrResponse('Response text');
    }
}

The next step is to use Tesseract in our PHP class. To do that, we'll use the excellent thiagoalessio/tesseract_ocr package:

$ composer require thiagoalessio/tesseract_ocr

Using this package, we can get an OCR result easily. In the following example we first save the image to the local temp folder, then parse the image using Tesseract, and finally return the result:

<?php

namespace App\Services;

use Intervention\Image\Image;
use thiagoalessio\TesseractOCR\TesseractOCR;
use Werk365\IdentityDocuments\Interfaces\OCR;
use Werk365\IdentityDocuments\Responses\OcrResponse;

class Tesseract implements OCR
{
    public function ocr(Image $image): OcrResponse
    {
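        // Store the image in the system temp folder. sys_get_temp_dir() usually
        // returns no trailing separator, so add one before the random file name.
        $imagePath = sys_get_temp_dir().DIRECTORY_SEPARATOR.bin2hex(random_bytes(16)).'.jpg';
        $image->save($imagePath);

        try {
            // Use Tesseract to create the text response
            $response = (new TesseractOCR($imagePath))->run();
        } finally {
            // Remove the temporary copy of the identity document
            unlink($imagePath);
        }

        // Return the new response
        return new OcrResponse($response);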
    }
}

After doing this, you'll be able to load this service class and use it in the package as described in the readme. Although this is just an example and has not been thoroughly tested yet, using Tesseract has two benefits: you can keep all documents within your own ecosystem (which might be important due to regulations), and it is free to use.

Cost Comparison

As cost was one of the motivators for me to write this package, I feel like it's good to do a quick comparison. All prices are excluding server cost as that depends on your own infrastructure.

In this list I'll just go over some of the APIs and services you can use with this package, and not over other external identity document verification services, as their prices and functionality differ greatly. It goes without saying, though, that running your own solution will be much cheaper in most cases.

OCR Service           | Price
Tesseract             | Free
Google Vision         | Free for the first 1,000, after that starting from $1.50 per 1,000
Amazon Textract       | Free for the first 1,000, after that starting from $1.50 per 1,000 (depends on region)
Azure Computer Vision | Free for the first 5,000 (rate limited), after that starting from $1 per 1,000

If you're currently using or researching another solution, you'll see that this can make a big difference. Note that although the package supports an option that merges the front and back images of a document into one (to save cost), by default the OCR is done in two separate calls.

Getting Started

If you just want to get started using this package and you have no preference for a particular API, the Google Vision API will be the easiest, as the package comes pre-configured for it. This way you can also use the face detection functionality supported by the built-in Google service (although you can of course create a custom service for this too, in the same way as for the OCR service).

Installing the package and getting it running should be quite painless, and a walkthrough can be found in the readme of the package: