Technologies Involved:
APIs
AWS MediaConvert
Web Apps
Project Description

Ohiogastro OCR is the leading provider of advanced gastrointestinal (GI) procedures in Central Ohio, with state-of-the-art facilities and a team of highly skilled specialists. The practice performs more GI procedures each year than any other practice in the area. From routine screenings to complex procedures, its team uses the latest techniques and technologies to deliver accurate diagnoses and effective treatments, with a focus on the best possible outcomes for patients' gastrointestinal health.
 

Scope Of Work

Our scope of work for the Ohiogastro OCR project is to develop a solution that uses OCR (Optical Character Recognition) to extract text from medical test reports and convert it into HL7 format. The goal is to streamline the digitizing and organizing of medical data, specifically test reports, so that the data is more accessible and interoperable. The system must extract text from medical reports accurately, transform it into HL7 format, and integrate seamlessly with existing healthcare systems. Automating this process improves efficiency, reduces manual data entry errors, and improves the overall healthcare management experience for Ohiogastro OCR and its patients.
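To make the target format concrete, here is a minimal sketch of how fields extracted by OCR might be assembled into an HL7 v2 result message (ORU^R01). It is an illustration under stated assumptions, not the project's actual mapping: the function names, the application and facility names in the MSH segment, the version 2.5, and the input dictionary keys are all hypothetical, and a real interface would follow the receiving system's own HL7 profile.

```python
from datetime import datetime
from typing import Dict, List

# HL7 v2 escape sequences for the default delimiter characters.
_ESCAPES = {"\\": "\\E\\", "|": "\\F\\", "^": "\\S\\", "&": "\\T\\", "~": "\\R\\"}


def esc(value: str) -> str:
    """Escape delimiter characters so OCR text cannot break a field."""
    return "".join(_ESCAPES.get(ch, ch) for ch in value)


def build_oru_message(
    patient: Dict[str, str],
    results: List[Dict[str, str]],
    control_id: str,
    timestamp: datetime,
) -> str:
    """Build a small ORU^R01 message from extracted fields.

    The sender/receiver names and the dictionary keys are assumptions
    made for this sketch.
    """
    ts = timestamp.strftime("%Y%m%d%H%M%S")
    segments = [
        # MSH: header with delimiters, sender, receiver, type, and version.
        f"MSH|^~\\&|OCR_APP|CLINIC|EHR|CLINIC|{ts}||ORU^R01|{esc(control_id)}|P|2.5",
        # PID: patient identifier (PID-3) and name (PID-5).
        f"PID|1||{esc(patient['mrn'])}||{esc(patient['family'])}^{esc(patient['given'])}",
        # OBR: the report that the observations below belong to.
        "OBR|1|||REPORT^Test report",
    ]
    for index, result in enumerate(results, start=1):
        # OBX: one observation per extracted value, sent as a string (ST)
        # with result status F (final) in OBX-11.
        segments.append(
            f"OBX|{index}|ST|{esc(result['code'])}^{esc(result['name'])}"
            f"||{esc(result['value'])}|{esc(result['units'])}|||||F"
        )
    # HL7 v2 segments are terminated by a carriage return.
    return "\r".join(segments) + "\r"


message = build_oru_message(
    {"mrn": "12345", "family": "DOE", "given": "JANE"},
    [{"code": "HGB", "name": "Hemoglobin", "value": "13.5", "units": "g/dL"}],
    control_id="MSG0001",
    timestamp=datetime(2024, 1, 2, 3, 4, 5),
)
print(message.replace("\r", "\n"))
```

Escaping matters here because OCR output can contain characters such as "|" or "^" that would otherwise be read as HL7 delimiters.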
 

Our Solution

We delivered a comprehensive OCR solution that extracts text from medical test reports and converts it into HL7 format. Our solution for the Ohiogastro OCR project includes the following services and technologies:

  • Project and Database Setup to ensure a solid foundation for the development and implementation of the OCR solution.
  • API Authentication using bearer tokens to ensure authorized access to sensitive medical data.
  • Data Gathering from the fax server, ensuring accurate and reliable data retrieval.
  • PDF to Image Conversion, turning the PDF files that contain medical test reports into image files (JPG) for further processing and analysis.
  • Image Preprocessing to remove noise and enhance the clarity of the text, improving the accuracy of OCR results.
  • YOLO Model to detect the regions of each document that contain tables, enabling targeted text extraction with the pytesseract library.
  • OCR (Optical Character Recognition) to extract the text content.
  • Dataset Preparation and Annotation for training a custom Named Entity Recognition (NER) model.
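
The preprocessing and targeted OCR steps above can be sketched roughly as follows. This is a minimal illustration, not the delivered implementation: it assumes OpenCV (opencv-python) and pytesseract are installed, that the page has already been converted to a JPG, and that the table detector returns boxes as (x1, y1, x2, y2) pixel coordinates. The function names, the median-blur and Otsu-threshold choices, and the Tesseract page segmentation mode are all assumptions.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def clamp_box(box: Sequence[float], width: int, height: int) -> Box:
    """Round a detector box to integers and clip it to the image bounds."""
    x1, y1, x2, y2 = (int(round(v)) for v in box)
    x1, x2 = max(0, min(x1, width)), max(0, min(x2, width))
    y1, y2 = max(0, min(y1, height)), max(0, min(y2, height))
    return (x1, y1, x2, y2)


def preprocess(image):
    """Grayscale, denoise, and binarize a page image (BGR NumPy array)."""
    import cv2  # third-party; imported here so clamp_box works without it

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # A small median filter removes speckle noise typical of faxed pages.
    denoised = cv2.medianBlur(gray, 3)
    # Otsu's method picks the black/white threshold automatically.
    _, binary = cv2.threshold(
        denoised, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU
    )
    return binary


def ocr_table_regions(image_path: str, boxes: Sequence[Sequence[float]]) -> List[str]:
    """Run OCR only on the regions the table detector flagged."""
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    clean = preprocess(image)
    height, width = clean.shape[:2]

    texts = []
    for box in boxes:
        x1, y1, x2, y2 = clamp_box(box, width, height)
        if x2 <= x1 or y2 <= y1:
            continue  # skip boxes that collapse after clipping
        crop = clean[y1:y2, x1:x2]
        # --psm 6 tells Tesseract to treat the crop as one uniform text block.
        texts.append(pytesseract.image_to_string(crop, config="--psm 6"))
    return texts


# Boxes from a detector can spill past the page edge, so clip them first.
print(clamp_box((-5.2, 10.4, 900.0, 49.6), width=800, height=600))
```

A call such as ocr_table_regions("page_1.jpg", detected_boxes) would return one string per table region, where both the file name and detected_boxes are placeholders. Restricting OCR to detected regions keeps table text separate from surrounding headers and free text, which can simplify the later entity extraction.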

Related Projects