Objective | Provide OCR abilities in Bahmni
Due date |
Key outcomes | Phase 1. Outcome A: Ability to scan Covid RT-PCR test results into Bahmni via OCR.
Status |
Collaborators | KCDH/IIT + Thoughtworks
Slack | #bahmni-ocr
Code Repo |
Issue List |
Problem Statement
Provide OCR abilities in Bahmni
...
Lab Reports (hand-written)
Prescriptions (printed / hand-written)
Consultation Notes (printed / hand-written)
Discharge Summary (printed / hand-written)
Payment Receipts / Insurance Claim Documents
Current POC Status (OCR for Covid Lab Reports)
Code: https://github.com/document-analysis-tools/ocr-ner-extractor
Mark the regions to extract from lab reports, using the open-source Label Studio.
Extract the text from those regions (OCR of printed text) using Tesseract models.
Use NLP libraries such as MedCAT and spaCy to extract "meaning" from the text (for example, identifying the patient ID, name, or a clinical term).
Receive a JSON representation of the original lab report, with the appropriate data elements extracted and identified.
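The region-to-JSON part of the steps above can be sketched as follows. This is a minimal illustration, not the code in the linked repository. It assumes regions are exported from Label Studio as rectangle results, which store x, y, width and height as percentages of the image size. The label names, the `ocr` callable and the function names `to_pixel_box` and `extract_report` are hypothetical. The OCR step is passed in as a function so that Tesseract (or a stub) can be plugged in, and the NLP step is left out.

```python
import json
import re
from typing import Callable, Dict, List, Tuple

# (left, top, right, bottom) in pixels
Box = Tuple[int, int, int, int]


def to_pixel_box(result: dict) -> Tuple[str, Box]:
    """Convert one Label Studio rectangle result to (label, pixel box).

    Label Studio stores x, y, width and height as percentages of the
    original image size, so they are scaled back to pixels here.
    """
    value = result["value"]
    img_w = result["original_width"]
    img_h = result["original_height"]
    left = round(value["x"] / 100 * img_w)
    top = round(value["y"] / 100 * img_h)
    right = round((value["x"] + value["width"]) / 100 * img_w)
    bottom = round((value["y"] + value["height"]) / 100 * img_h)
    # Assumption: each region carries exactly one label.
    return value["rectanglelabels"][0], (left, top, right, bottom)


def extract_report(results: List[dict], ocr: Callable[[Box], str]) -> Dict[str, str]:
    """Run OCR on every marked region and collect the text per label.

    `ocr` is a hypothetical callable that returns the text found inside
    a pixel box of the scanned report.
    """
    fields: Dict[str, str] = {}
    for result in results:
        label, box = to_pixel_box(result)
        raw_text = ocr(box)
        # Collapse line breaks and repeated spaces left by OCR.
        fields[label] = re.sub(r"\s+", " ", raw_text).strip()
    return fields


def to_json(fields: Dict[str, str]) -> str:
    """Serialize the extracted fields as a JSON document."""
    return json.dumps(fields, indent=2, sort_keys=True)
```

With Pillow and pytesseract installed, `ocr` could be something like `lambda box: pytesseract.image_to_string(image.crop(box))`, where `image` is the scanned report opened with `PIL.Image.open`. The MedCAT/spaCy step would then run on the text values before they are serialized.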
Reference materials
Digital Scanned Documents for Bahmni - Initial Proposal by KCDH: (Presentation Link)
IIT/KCDH: https://rnd.iitb.ac.in/research-glimpse/adaptive-framework-end-end-corrections-indic-ocr
Sample Lab Reports
(4 sample files attached)