Technologies Involved:
OCR
Area Of Work: Machine Learning
Project Description

A US-based automation solutions provider with expertise in cross-platform scripting engaged Oodles to streamline its document recognition and screen automation workflows. The client required an OCR-enabled system that could handle browser-based UI interactions and desktop automation through visual input, integrated with headless execution environments.

Scope Of Work

The client engaged Oodles to build a robust Java-based framework that could recognize on-screen data using OCR and perform GUI automation across platforms. The project involved automating form interactions, enabling screen scraping, and integrating cross-platform execution support using Sikuli, Tesseract, XVFB, and related technologies.

Our Solution

To address the client’s goals, a comprehensive automation suite was developed using a combination of OCR, computer vision, and screen automation tools.

Key highlights include:

  • Visual UI Recognition & Automation: Implemented using Sikuli to simulate user interactions with screen elements. 
  • OCR for Document & Screen Extraction: Integrated Tesseract OCR to extract structured data from identity documents. 
  • Preprocessing for Accuracy: Used OpenCV to crop, deskew, and enhance scanned document images before processing. 
  • Cross-Platform Execution Support: Enabled headless automation on Linux using XVFB, with support for Windows environments as well.
  • Modular Codebase in Java: Built with Spring Framework, offering high maintainability. 
  • Data Management: Utilized MongoDB to store and retrieve OCR results in a structured, scalable format.
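
To illustrate the headless execution highlight above, here is a minimal Java sketch of how a screen-automation run might be pointed at an XVFB virtual display on Linux. It is an illustration under stated assumptions, not the code delivered to the client: the class name HeadlessRun, the display number 99, the screen size, and the jar name automation-suite.jar are all hypothetical. The sketch only builds the commands and prints them, so it runs on machines where XVFB is not installed.

```java
import java.util.List;

// Hypothetical helper: builds (but does not start) the processes for a headless run.
class HeadlessRun {

    // Command line for a virtual X server on the given display number, e.g. ":99".
    static ProcessBuilder xvfb(int display, int width, int height) {
        return new ProcessBuilder(
                "Xvfb", ":" + display,
                "-screen", "0", width + "x" + height + "x24"); // screen 0, 24-bit colour
    }

    // Prepares an arbitrary command so that it renders into that virtual display.
    static ProcessBuilder onDisplay(int display, List<String> command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.environment().put("DISPLAY", ":" + display);
        return pb;
    }

    public static void main(String[] args) {
        // Print the commands instead of starting them, so the sketch runs anywhere.
        System.out.println(String.join(" ", xvfb(99, 1280, 1024).command()));
        // "automation-suite.jar" is an assumed name for the packaged automation suite.
        ProcessBuilder suite = onDisplay(99, List.of("java", "-jar", "automation-suite.jar"));
        System.out.println("DISPLAY=" + suite.environment().get("DISPLAY")
                + " " + String.join(" ", suite.command()));
    }
}
```

In a real run, the XVFB process would be started first (for example with start() on the first builder) and stopped once the automation process exits. Visual tools such as Sikuli need a display to capture and click on, which is why the DISPLAY variable is set for the child process.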