Bahmni reporting considerations
Current issues with reporting from the transactional database
- [OpenMRS] Too normalised: reports need many joins, performant queries are hard to write, and tuning the tables for reporting may not be ideal for an online (transactional) system
- [OpenMRS] Non-aggregate in nature
- [OpenMRS] Hard to convert row-based (map-like) data into column-based reports
- [OpenMRS] Not suited for analysis and ad-hoc querying
- Cannot join data across systems
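The row-to-column problem can be illustrated with a small sketch. The `obs_rows` table below is a hypothetical, much simplified stand-in for a row-based observation table (it is not the actual OpenMRS schema), and SQLite is used only so that the example is self-contained.

```python
import sqlite3

# Hypothetical, simplified stand-in for a row-based observation table.
# This is not the real OpenMRS schema; the names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obs_rows (patient_id INTEGER, concept TEXT, value REAL)")
conn.executemany(
    "INSERT INTO obs_rows VALUES (?, ?, ?)",
    [
        (1, "height", 170.0),
        (1, "weight", 65.0),
        (2, "height", 158.0),
        (2, "weight", 52.5),
    ],
)

# Each report column needs its own conditional aggregate, so the query
# grows with every concept that has to appear in the report.
pivot_sql = """
    SELECT patient_id,
           MAX(CASE WHEN concept = 'height' THEN value END) AS height,
           MAX(CASE WHEN concept = 'weight' THEN value END) AS weight
    FROM obs_rows
    GROUP BY patient_id
    ORDER BY patient_id
"""
pivoted = conn.execute(pivot_sql).fetchall()
print(pivoted)
```

Every additional report column means another `CASE` expression (or another self-join), which is part of why such reports are awkward to write directly against the transactional tables.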
Tool Options
Kibana (with ES)
. Scales horizontally to multiple servers (not necessarily a must-have, because most hospitals’ data can fit on a single hard disk)
+ Simple to use
- Still in its infancy feature-wise; for example, it doesn’t support nested objects or lists. Probably a skunkworks project (my assessment). The project is not very active: https://github.com/elasticsearch/kibana
Jasper
. JasperServer is not very pretty, but embedded Jasper provides a lot more flexibility
+ Very mature and has all features we need for reporting
. Supports multiple databases (we don’t really have a need for it though)
- Analysis is available only in the paid version.
- Filtering based on a condition has to be done via MDX
Saiku (analytics-lab)
+ Very active project
+ Best open source analytics solution for a relational database (star schema)
. Not meant for reporting
+ Provides stats
DHIS 2
- Too generic for us.
- Not designed for our kind of use case
(let me know if you want to know more details)
Star Schema
Dimensions (not a complete list)
Location
Date (+ Week, Month, Year)
Provider
Facts (also called measures; not a complete list)
Patients Diagnosed
Drug Orders Given
Accessions
Lab Results
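A minimal sketch of how these dimensions and one fact could be laid out as a star schema. All table and column names (`dim_location`, `dim_date`, `dim_provider`, `fact_diagnosis`) and the sample rows are assumptions made for illustration, not an agreed Bahmni design; SQLite is used only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table per dimension, plus one fact table that references them.
# All names here are hypothetical.
conn.executescript("""
    CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_date (
        date_id INTEGER PRIMARY KEY, day TEXT,
        week INTEGER, month INTEGER, year INTEGER
    );
    CREATE TABLE dim_provider (provider_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_diagnosis (
        patient_id  INTEGER,
        diagnosis   TEXT,
        location_id INTEGER REFERENCES dim_location(location_id),
        date_id     INTEGER REFERENCES dim_date(date_id),
        provider_id INTEGER REFERENCES dim_provider(provider_id)
    );
""")

conn.executemany("INSERT INTO dim_location VALUES (?, ?)", [(1, "OPD"), (2, "Ward")])
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?, ?, ?)",
    [(20140701, "2014-07-01", 27, 7, 2014), (20140815, "2014-08-15", 33, 8, 2014)],
)
conn.executemany(
    "INSERT INTO dim_provider VALUES (?, ?)", [(1, "Provider A"), (2, "Provider B")]
)
conn.executemany(
    "INSERT INTO fact_diagnosis VALUES (?, ?, ?, ?, ?)",
    [
        (101, "TB", 1, 20140701, 1),
        (102, "TB", 1, 20140701, 2),
        (103, "Malaria", 2, 20140815, 1),
    ],
)

# "Patients Diagnosed" per location and month: one join per dimension used.
report = conn.execute("""
    SELECT l.name, d.year, d.month, COUNT(*) AS patients_diagnosed
    FROM fact_diagnosis f
    JOIN dim_location l ON l.location_id = f.location_id
    JOIN dim_date d ON d.date_id = f.date_id
    GROUP BY l.name, d.year, d.month
    ORDER BY l.name, d.year, d.month
""").fetchall()
print(report)
```

The fact table keeps one row per event, and every dimension is a single join away, which keeps report queries short compared with the normalised transactional tables.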
Reference
- Simple introduction => http://ciobriefings.com/Publications/WhitePapers/DesigningtheStarSchemaDatabase/tabid/101/Default.aspx
- http://www.amazon.com/Data-Warehouse-Toolkit-Definitive-Dimensional/dp/1118530802/ref=sr_1_1?s=books&ie=UTF8&qid=1405491662&sr=1-1 (I have this book, if you want to borrow)
- Jasper editions comparison https://www.jaspersoft.com/editions
Aggregated database or de-normalized patient level database?
(applies only if we go with relational database)
- Aggregated database wouldn’t allow one to drill down to a patient
- A non-aggregated star schema with patient information is large (but even with estimates on the high side, supporting 10 million patients should not be a problem on a commodity server)
- An aggregated database doesn't allow unions, intersections etc. because patient-level information is missing. For example, it cannot answer how many patients were diagnosed with both TB and HIV.
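The TB and HIV example can be made concrete with a small sketch. The `patient_diagnosis` table and its rows are made up for illustration; the point is only that per-diagnosis totals cannot be combined into an intersection, while patient-level rows can.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical patient-level table: one row per patient per diagnosis.
conn.execute("CREATE TABLE patient_diagnosis (patient_id INTEGER, diagnosis TEXT)")
conn.executemany(
    "INSERT INTO patient_diagnosis VALUES (?, ?)",
    [(1, "TB"), (1, "HIV"), (2, "TB"), (3, "HIV"), (4, "TB"), (4, "HIV")],
)

# What an aggregated database would keep: only a count per diagnosis.
aggregated = dict(
    conn.execute(
        "SELECT diagnosis, COUNT(DISTINCT patient_id) "
        "FROM patient_diagnosis GROUP BY diagnosis"
    ).fetchall()
)

# With patient-level rows, the intersection is a direct set operation.
both = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT patient_id FROM patient_diagnosis WHERE diagnosis = 'TB'
        INTERSECT
        SELECT patient_id FROM patient_diagnosis WHERE diagnosis = 'HIV'
    )
""").fetchone()[0]

print(aggregated, both)
```

From the aggregated counts alone (3 TB, 3 HIV) the overlap could be anything from 0 to 3; only the patient-level rows show that it is 2.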
Further Analysis
- Embedding Jasper vs. New front end on JasperServer vs. Using Jasper as it is
- How to handle custom attributes
- How to handle observations made on a concept, in other words, when the data type is not known in advance.
Spikes
- How difficult would it be to write MDX filters in Saiku?
The Bahmni documentation is licensed under Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)