Medispeak Integration Guide: Streamlining Bahmni EMR Data Entry with Voice Assistance
The documentation for Medispeak integration is current as of January 10, 2025. Please note that any recent updates or changes to Medispeak may introduce improvements or modifications to its functionality and integration with Bahmni. We recommend referring to the latest Medispeak release notes or documentation for up-to-date information.
Overview
Empowering Doctors with Seamless Transcription and Smart EMR Integration
Medispeak integrates seamlessly with the Bahmni EMR, enabling voice-assisted form filling through advanced speech recognition. By transforming healthcare provider interactions with patients into structured data, Medispeak enhances the user experience by making form-filling faster, more accurate, and more accessible.
Purpose and Benefits
The primary objective of Medispeak is to streamline data entry within the EMR, reducing the manual effort traditionally associated with this task.
Benefits
Increased Accessibility: Makes the application more inclusive for users with disabilities or those who find typing difficult.
Enhanced Efficiency: Accelerates form-filling, enabling users to complete forms faster than traditional methods.
Reduced Errors: Improves data accuracy by minimizing manual entry errors.
Improved User Experience: Offers an engaging, user-friendly way to interact with the application.
Feature Flow
Medispeak consists of two main components:
Frontend Plugin
The frontend component manages user interactions:
Records the user’s voice.
Sends audio data to the backend service.
Maps the backend response to the appropriate form fields.
Backend
The backend serves as a proxy between the frontend and the language model service, handling:
Audio file processing.
Transcription and data extraction.
Communication with the frontend.
High-Level Feature Operation
Form Configuration: An admin sets up the form structure in the Medispeak admin portal, defining form fields, descriptions, and metadata.
Voice Recording: Users initiate voice recording by clicking the "Start Recording" button on a configured form.
Processing: The audio is recorded, uploaded, and transcribed into text. The transcription is mapped to relevant form fields.
Form Filling: The frontend dynamically updates the form with transcribed data in real time.
Submission: The user can validate the pre-filled data and submit the form to the EMR.
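The record-upload-poll cycle above can be sketched as a small client routine. Note that the endpoint paths and response fields used below (/transcriptions, status, formData) are illustrative assumptions, not the documented Medispeak API; the fetch function is injected so the sketch stays self-contained:

```javascript
// Sketch of the upload-and-poll cycle. The endpoint paths and response
// shapes here are assumptions for illustration, not the actual Medispeak API.
async function transcribeAudio(fetchFn, baseUrl, audioBlob) {
  // 1. Upload the recorded audio to the backend.
  const upload = await fetchFn(`${baseUrl}/transcriptions`, {
    method: "POST",
    body: audioBlob,
  });
  const { id } = await upload.json();

  // 2. Poll until the background worker marks the transcription "Completed".
  for (let attempt = 0; attempt < 30; attempt++) {
    const res = await fetchFn(`${baseUrl}/transcriptions/${id}`);
    const data = await res.json();
    if (data.status === "Completed") return data.formData;
    await new Promise((resolve) => setTimeout(resolve, 1000)); // wait before retrying
  }
  throw new Error("Transcription timed out");
}
```

Injecting fetchFn rather than calling the global fetch directly also makes the polling logic easy to unit-test with a mocked backend.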
Architecture
Frontend Architecture
The Medispeak frontend facilitates voice-to-form interactions through three primary actions:
Recording User Input
The Player component manages voice recording through a custom hook. Key functions include:
Start Recording: Initiates recording.
Restart: Restarts the recording process.
Resume: Resumes paused recordings.
Transcribe: Sends audio to the backend for processing.
Sending Audio to the Backend
After recording, the captured audio is uploaded to the backend. The frontend monitors transcription progress through server polling, ensuring seamless communication between components.
Updating Form Fields
Upon receiving structured data from the backend, the frontend dynamically updates the relevant form fields, providing a smooth user experience by auto-filling content in real time.
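As a minimal sketch of this step, assuming the backend returns a flat map of field names to values (the actual response shape may differ), the merge into form state could look like:

```javascript
// Sketch: merge backend-extracted values into the form's current state.
// The response shape ({ fieldName: value }) is an assumption for illustration.
function fillForm(currentValues, extracted) {
  const updated = { ...currentValues };
  for (const [field, value] of Object.entries(extracted)) {
    // Only overwrite fields the backend actually returned a value for,
    // so existing user input is not cleared by empty extractions.
    if (value !== null && value !== undefined && value !== "") {
      updated[field] = value;
    }
  }
  return updated;
}
```

In the real plugin the merged values would then be written into the DOM inputs (for example by dispatching input events), which is browser-specific and omitted here.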
Backend Architecture
The backend processes uploaded audio files through the following steps:
Saving Audio Files: Audio files are stored in an S3 bucket.
Fetching Files: Retrieves files from S3 for processing.
Transcription: Converts speech to text using OpenAI's Whisper API.
Data Extraction: Utilizes GPT-3.5 to interpret transcripts and extract structured form data.
Task Completion: Updates the transcription object's status to "Completed" and stores extracted data.
These tasks are handled asynchronously by a background worker, ensuring non-blocking operations.
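The worker's stages can be sketched as a simple pipeline. The real backend is a Rails background job; the step functions below are illustrative stand-ins, injected so each stage (storage fetch, Whisper transcription, GPT extraction) is explicit:

```javascript
// Illustrative pipeline for the background worker. The real implementation
// is a Rails background job; these step functions are hypothetical stand-ins.
async function processTranscription(steps, transcription) {
  const audio = await steps.fetchFromStorage(transcription.audioKey); // S3/MinIO
  const text = await steps.transcribe(audio);   // Whisper speech-to-text
  const formData = await steps.extract(text);   // GPT-based data extraction
  // Mark the transcription "Completed" and attach the extracted data.
  return { ...transcription, status: "Completed", text, formData };
}
```

Because each stage is awaited in sequence inside one job, the web request that enqueued it returns immediately, which is what keeps the API non-blocking.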
Backend Setup
This guide outlines the steps for setting up the Medispeak backend using Docker Compose.
Prerequisites
Ensure the following tools are installed:
Docker: For containerized services.
Docker Compose: To manage multiple containers.
OpenAI Credentials: To access the whisper-1 and gpt-3.5-turbo models for speech-to-text transcription and context-based form filling.
Setting Up OpenAI API for Medispeak Integration
Medispeak leverages OpenAI APIs for speech-to-text transcription and context-based form filling. To enable this functionality, you must set up an OpenAI account and generate API keys to connect Medispeak to OpenAI services.
Follow the steps below to set up OpenAI for Medispeak:
1. Create an OpenAI Account
Visit OpenAI's platform and log in to your account. If you do not have an account, sign up to create one.
2. Create a New Project
Navigate to the Projects page.
Click on Create New Project and provide a name of your choice for the project.
3. Generate API Keys
Go to the API Keys page within your OpenAI account.
Click the Create New Secret Key button.
In the pop-up dialog box, add a name for your API key. This name helps you identify the key in the future.
4. Secure Your API Key
Once the key is generated, save it securely in an accessible location. For security purposes, OpenAI will not display the secret key again in your account.
If you lose the key, you will need to generate a new one by repeating the steps above.
5. Using Multiple API Keys
OpenAI allows you to create multiple API keys for different purposes within a single project. Use this feature to organize and manage your API usage efficiently.
With your API key set up, you are ready to configure Medispeak to integrate seamlessly with Bahmni EMR, enabling advanced voice-assisted form filling.
Setup Steps
Prepare the Environment
Create a .env file in the project root with the following variables:
# OpenAI Credentials
OPENAI_ACCESS_TOKEN=your_token
OPENAI_ORGANIZATION_ID=your_id
# Plugin Configuration
PLUGIN_BASE_URL=http://localhost:3000
# Rails Environment
RAILS_ENV=development
BACKEND_PORT=3000
# AWS S3 Configuration
AWS_ACCESS_KEY_ID=your_key
AWS_SECRET_ACCESS_KEY=your_secret
AWS_REGION=your_region
AWS_BUCKET=your_bucket
# PostgreSQL Configuration
POSTGRES_IMAGE_TAG=14.2-alpine
DB_NAME=medispeak
DB_USERNAME=postgres
DB_PASSWORD=postgres
DB_PORT=5432
Build and Start Services
Build and start services in detached mode:
docker-compose up -d
Run Database Migrations
To set up the database schema with the default configuration, run the following command:
docker-compose exec medispeak_backend bundle exec rails db:migrate
Seed the Database
To initialize the Medispeak application with predefined form fields for Registration, Vitals, and Medication forms, follow these steps:
Step 1: Replace the Seed File
Locate the existing db/seeds.rb file and replace it with the updated seed file provided.
Step 2: Build the Docker Image
Once the seed file is replaced, build the Docker image to ensure the changes are included:
docker-compose build
Step 3: Run the Database Seeding Command
After the image is built, run the following command to seed the data into the database:
docker-compose exec medispeak_backend bundle exec rails db:seed
The database is now seeded with the predefined form fields, and the Medispeak application is ready for use.
Optional MinIO Setup
To use MinIO as an alternative to S3:
Add MinIO credentials to .env:
MINIO_ACCESS_KEY_ID=your_key
MINIO_SECRET_ACCESS_KEY=your_secret
MINIO_ENDPOINT=https://minio.example.com
Switch the storage service:
STORAGE_SERVICE=minio
Start with MinIO:
docker-compose --profile minio up -d
Frontend Setup
This guide outlines the steps to set up the Medispeak frontend plugin for development or as a Chrome extension.
Prerequisites
Ensure the following tools are installed:
Node.js (version 14 or later): verify with node --version
npm or yarn: verify with npm --version or yarn --version
Google Chrome
Setup Steps
Clone the Repository
git clone https://github.com/medispeak/medispeak.git
cd medispeak
Install Dependencies
npm install
Set the base URL
Update the BASE_URL in src/api/Api.js:
const BASE_URL = "http://localhost:3000/";
Build the Plugin
npm run build
The build files will be generated in the dist directory.
Using as a Chrome Extension
Open Chrome and navigate to chrome://extensions.
Enable Developer Mode.
Click Load unpacked and select the dist directory.
Optionally, restrict plugin usage to specific domains via the extension settings.
Demo Credentials
Head over to http://localhost:3000/. A demo account is pre-configured for testing:
Username: admin@example.com
Password: password123
Accessing the Admin Panel
Log in using the demo credentials.
Navigate to http://localhost:3000/admin.
Getting Started
To get started, configure Medispeak as an administrator. Head over to http://localhost:3000/admin/ and add the following:
Add New Templates
Fields:
Name: Template name.
Description: Brief description of the template.
Add New Domain
Fields:
Fqdn: The domain of the EMR (e.g., "https://demo.mybahmni.org/").
Template: Select from the templates created earlier.
Add a New Page
Fields:
Name: Page name.
Prompt: A brief description of the page.
Template: Select from the templates created earlier.
Add a New Form Field
Fields:
Page: Select from the pages created earlier.
Friendly Name: Display name of the form field.
Title: Form field identifier.
Description: Brief description of the form field.
Field Type: Type of form field (options: String, Number, Boolean, Single select, Multi select).
Minimum: Optional minimum value (if applicable).
Maximum: Optional maximum value (if applicable).
Enum Options: Enter options for single/multi-select fields, one per line.
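One plausible way such field definitions could drive the GPT extraction step, sketched purely as an assumption about the internals, is converting each admin-configured field into a JSON-Schema-style property for a structured-output request:

```javascript
// Hypothetical sketch: turn an admin-defined form field into a
// JSON-Schema-style property. Medispeak's actual internals may differ;
// this only illustrates how the field metadata above fits together.
function fieldToSchema(field) {
  const typeMap = {
    String: "string",
    Number: "number",
    Boolean: "boolean",
    "Single select": "string",
    "Multi select": "array",
  };
  const schema = {
    type: typeMap[field.fieldType],
    description: field.description,
  };
  if (field.enumOptions) schema.enum = field.enumOptions;
  if (field.minimum !== undefined) schema.minimum = field.minimum;
  if (field.maximum !== undefined) schema.maximum = field.maximum;
  return schema;
}
```

This is why the Description, Minimum/Maximum, and Enum Options fields matter: they constrain what the language model is allowed to place into each form field.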
Setting Up the Medispeak Frontend Plugin: API Key Configuration
To integrate the Medispeak plugin with the frontend, follow these steps to generate and configure the API key:
Step 1: Generate an API Token
Navigate to the API Tokens tab, accessible from the top-right corner of the screen.
Click on Generate New Token.
Provide a descriptive name for the token and set an expiry date.
Click Generate Token.
Once the token is created, copy the API token for later use.
Step 2: Configure the Medispeak Plugin in Bahmni
Open Bahmni and locate the Medispeak plugin in the bottom-right corner of the screen.
Click on the plugin to open its settings.
Paste the copied API token into the designated field.
Click Save to authenticate the frontend plugin.
The Medispeak plugin is now configured and ready for use.
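Presumably the saved token is then attached to every request the plugin makes to the backend. A minimal sketch of building such authenticated request options (the bearer scheme is an assumption about Medispeak's API, not documented behavior):

```javascript
// Sketch: attach the saved Medispeak API token to backend request options.
// The Authorization header scheme is an assumption for illustration.
function withAuth(token, options = {}) {
  if (!token) throw new Error("Missing Medispeak API token");
  return {
    ...options,
    headers: { ...(options.headers || {}), Authorization: `Bearer ${token}` },
  };
}
```

Failing fast on a missing token surfaces configuration mistakes immediately instead of producing opaque 401 responses from the backend.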
Demo
Costs, Performance, and Scalability
Speech-to-Text: Uses OpenAI’s Whisper API, known for high accuracy with English and medical terminology.
Form Filling: Utilizes GPT for interpreting transcripts and filling forms; performance varies by form complexity.
Object Storage: Stores audio files on Amazon S3, offering scalable storage solutions.
Recommendations
Consider MinIO as a self-hosted alternative to S3 for long-term cost savings.
Evaluate storage retention policies to manage costs effectively.
This documentation provides a comprehensive overview of Medispeak’s features, architecture, and setup processes, ensuring an optimized user experience and efficient implementation.
The Bahmni documentation is licensed under Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)