Capturing observations and data in Bahmni has been done primarily with a keyboard and mouse. While this has worked until now, we wanted to explore faster methods for capturing patient data, so that doctors spend more consultation time with the patient and less time on computer screens. One such method is a speech assistant. In the long run, the speech assistant could also be extended to other areas of Bahmni, such as faster navigation and quick views of dashboards.
After quick rounds of brainstorming, we converged on the idea of trying out a speech assistant for consultation notes and using it for initial user testing and general feedback. Some notable decisions were:
The button to open the consultation box with the speech assistant would be kept outside the consultation (i.e., on the patient dashboard), so that the doctor could glance at the patient's entire medical history on the patient dashboard while capturing consultation notes.
The consultation box would be a floating box that the doctor can drag and move around the screen at the doctor's convenience.
The doctor can use the consultation box from either the patient dashboard or inside the consultation.
Even when the doctor switches tabs inside the consultation session (for example, medication, orders, etc.), the floating box would remain as it is.
The doctor can record medications, diagnoses, etc. inside the consultation using the keyboard and record the notes at the same time.
The consultation box with speech assistant would be developed in a manner which ensures that it is decoupled from Bahmni and can be used by any other OpenMRS distro as a separate plugin.
Workflows for speech assistant
1. Initiating the speech assistant
The button to initiate the speech assistant is found in the bottom right corner of the patient dashboard, as seen in the screenshot below.
The button remains visible even when the doctor is on any other tab inside the consultation session of the patient.
2. Recording in the consultation box
Once the doctor clicks on the button, the consultation box with the speech-to-text converter opens, as seen below.
The doctor can drag and move the consultation box.
Once the box is open, we see that the “save notes” button is disabled. This is because there are no notes inside the box. The doctor can click on “start recording” now.
Once the doctor clicks on “start recording”, the doctor can start speaking to capture the notes. The save button is enabled only after the doctor clicks on the stop button, and editing the notes is disabled while the assistant is listening.
In the screenshot above, we can see that the “save notes” button has been enabled after the recording has been stopped.
Note: The doctor can also type in this box with the keyboard instead of using voice as the primary means.
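The button rules described above can be sketched as a small state model. This is only an illustrative sketch, not the actual Speech Assistant code: the names `BoxState`, `BoxEvent`, `reduce`, `canSave` and `canEdit` are hypothetical, and it assumes that notes typed with the keyboard also enable the save button once the box is not recording.

```typescript
// Hypothetical state model for the consultation box (names are illustrative).
type BoxState = {
  recording: boolean; // true while the assistant is listening
  notes: string;      // text currently shown in the box
};

type BoxEvent =
  | { type: "START_RECORDING" }
  | { type: "STOP_RECORDING" }
  | { type: "TRANSCRIPT"; text: string } // text returned by speech-to-text
  | { type: "TYPE"; text: string };      // keyboard edit replacing the notes

const initialState: BoxState = { recording: false, notes: "" };

function reduce(state: BoxState, event: BoxEvent): BoxState {
  switch (event.type) {
    case "START_RECORDING":
      return { ...state, recording: true };
    case "STOP_RECORDING":
      return { ...state, recording: false };
    case "TRANSCRIPT":
      // Transcribed text is appended only while recording.
      if (!state.recording) return state;
      return { ...state, notes: (state.notes + " " + event.text).trim() };
    case "TYPE":
      // Editing is disabled while the assistant is listening.
      if (state.recording) return state;
      return { ...state, notes: event.text };
  }
}

// "Save notes" is enabled only when recording has stopped and the box is not empty.
function canSave(state: BoxState): boolean {
  return !state.recording && state.notes.trim().length > 0;
}

// The notes can be edited by keyboard only when the assistant is not listening.
function canEdit(state: BoxState): boolean {
  return !state.recording;
}
```

A UI component could call `canSave` and `canEdit` on every state change to decide whether the “save notes” button and the text area are enabled.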
3. Saving the notes
Once the notes are saved, the doctor can at present verify them in the following places:
→ Inside the visit summary
→ On the same consultation box
Barriers to adoption
While interacting with doctors, we found the following possible barriers that could reduce adoption.
Doctors usually work in a noisy environment. Therefore, the speech-to-text engine should be capable of filtering out ambient noise.
Any technology or method that captures information faster is likely to be used by doctors, since it reduces interaction time with the EMR and increases time for patient consultation. Accuracy is as critical as speed. At present, we are evaluating different models and settings for both speed and accuracy.
Doctors prefer writing on paper. Therefore, technologies such as OCR are likely alternatives.
High Level Architecture Diagram:
Speech Assistant is a micro-frontend application, where all the UI code resides. The UI code is built in React using the Carbon Design System.
Currently, the language model being used by Vakyansh is a general English model. To understand medical words, the model needs to be trained with the relevant vocabulary. The trained medical model can then be used by Vakyansh to return the proper text.
The Vakyansh API works better when deployed on a GPU machine. One instance of the API can easily serve up to 10 concurrent audio connections. To support more connections, the API needs to be scaled.
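As a rough sizing aid, the figure of 10 concurrent audio connections per instance can be turned into a simple estimate. This is a back-of-the-envelope sketch only: the function name `instancesNeeded` is hypothetical, and the real capacity of an instance will depend on the hardware and model used.

```typescript
// Capacity figure taken from the text above: about 10 concurrent audio
// connections per API instance. Treat it as an approximation, not a guarantee.
const CONNECTIONS_PER_INSTANCE = 10;

// Hypothetical helper: estimates how many API instances are needed to serve
// a given number of concurrent audio connections.
function instancesNeeded(
  concurrentConnections: number,
  perInstance: number = CONNECTIONS_PER_INSTANCE
): number {
  if (!Number.isInteger(concurrentConnections) || concurrentConnections < 0) {
    throw new RangeError("concurrentConnections must be a non-negative integer");
  }
  if (!Number.isInteger(perInstance) || perInstance <= 0) {
    throw new RangeError("perInstance must be a positive integer");
  }
  // Round up so that a partially filled instance is still counted.
  return Math.ceil(concurrentConnections / perInstance);
}
```

For example, under this assumption 25 doctors recording at the same time would need `instancesNeeded(25)`, that is, three instances.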
If an NLP library is applied to extract meaning from the sentences, the use case could be extended to other consultation tasks such as Medications, Symptoms, etc.
The Bahmni documentation is licensed under Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)