1 Goals
2 Performance Test Plan Strategy
- 2.1 Test Strategy
- 2.2 Test Scenarios
- 2.3 Infra Setup
- 2.4 Required Software
- 2.5 Code repository
- 2.6 Test Execution Steps
- 2.7 Java Profiling
3 Findings & Remediation
4 Future Recommendations for Performance Testing

Goals

To publish baseline reports and show the advantages of having upgraded to new software components.
To publish capacity planning reports and be able to predict per-facility cloud hardware running costs under different hardware configurations.
To publish a roadmap for the next set of experiments or features that promise to improve performance (or reduce per-facility costs) considerably.
To integrate performance test runs with Bahmni deployments and make them available to anyone in the community to modify, run, and benchmark against their own Bahmni deployments.

More details here

Performance Test Plan Strategy

Test Strategy

The strategy focuses on realistic stress testing of the Bahmni LITE environment while meeting the following criteria:

To have pauses between each user interaction thus maintaining the breathing time for each persona.
To have the overall load shared between each persona based on their breathing time.
To have a ramp-up and ramp-down of users at the beginning and end of each test.
To start each test with an existing set of patients so that doctors can begin consultations from the start of the test.
To maintain a seamless connection of scenarios between patient registration and consultation.
To have a hard stop time at the end of the test to control the overall test duration.
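
The load shape described by these criteria can be sketched in a few lines. This is only an illustration in Python, not the Gatling simulation from the test suite: the function names, the linear ramp shape, and the ramp and pause values used with it are assumptions chosen for the example.

```python
def active_users(t, peak_users, ramp_up_s, ramp_down_s, duration_s):
    """Target number of concurrent users at second t of a test run.

    Assumes a linear ramp-up, a steady phase at peak load, a linear
    ramp-down, and a hard stop at duration_s (all hypothetical choices).
    """
    if t < 0 or t >= duration_s:
        return 0  # outside the run, or past the hard stop
    if t < ramp_up_s:
        return round(peak_users * t / ramp_up_s)
    if t > duration_s - ramp_down_s:
        return round(peak_users * (duration_s - t) / ramp_down_s)
    return peak_users


def requests_per_hour(users, pause_s, response_s):
    """Approximate request rate of one persona group.

    Each user issues one request, waits for the response, then pauses
    for pause_s seconds (the persona's breathing time) before the next.
    """
    return users * 3600 / (pause_s + response_s)
```

For example, `active_users(30, peak_users=90, ramp_up_s=60, ramp_down_s=60, duration_s=600)` gives half of the peak load midway through the ramp-up, and a longer breathing time in `requests_per_hour` lowers the share of load that persona contributes.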

More details here

Test Scenarios

The performance test suite includes the following test scenarios:

New Patient - Registration - Start OPD Visit
Existing Patient - Patient Search - Start OPD Visit
Upload Patient Document
Doctor Consultation and Observations Flow

More details here

Infra Setup

The Performance Test environment runs on Kubernetes on AWS.

A separate namespace is created with a Bahmni Kubernetes Installation.
The existing RDS is shared with the performance namespace.
For monitoring, Grafana and a JVM dashboard are added.

More details here

Required Software

JDK 11
Gradle
Node.js
Newman
AWS credentials (needed only to run the test on the cloud)
Access to GitHub Actions (needed only to run the test on the cloud)
YourKit Java Profiler (get the license from the Infra Team)
Network Bandwidth controller - Wondershaper

Code repository

Archive Report Path - GH Pages

Test Execution Steps

Clone all the repositories.
Use the Wondershaper to set the network speeds only if needed.
Run the test data generator to create and upload new patients.
Copy the registrations.csv file from /output to /src/gatling/resources.
Start the test by providing the simulation type, number of users, and duration of the test.
To run the test against a different environment, update the respective environment properties in:
- src/gatling/scala/configurations/protocols.scala and src/gatling/scala/api/constants.scala
To run the test in the cloud, use the trigger in GitHub Actions.
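
The copy step for registrations.csv can be scripted so that it is not forgotten between runs. The sketch below is a hypothetical helper, not part of the repositories: it assumes that the output folder sits at the root of the test data generator checkout and that src/gatling/resources sits at the root of the performance test checkout, and both root paths are passed in by the caller.

```python
from pathlib import Path
import shutil


def stage_registrations(generator_root, suite_root):
    """Copy registrations.csv from the generator output into the Gatling resources.

    generator_root and suite_root are the local checkout directories
    (assumed layout, see the note above). Returns the destination path.
    """
    source = Path(generator_root) / "output" / "registrations.csv"
    target_dir = Path(suite_root) / "src" / "gatling" / "resources"
    if not source.is_file():
        raise FileNotFoundError(f"Run the test data generator first: {source} is missing")
    target_dir.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves the file timestamps, which helps when checking data freshness
    return Path(shutil.copy2(source, target_dir / source.name))
```

Failing early when the file is missing avoids starting a run against stale or absent patient data.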

More details here & here

Java Profiling

The YourKit profiling tool was used to profile the JVM during performance test runs.
It helped in analysing CPU and memory utilisation, troubleshooting code that slows down API responses, locating possible deadlocks, and so on.
Instructions for setting up YourKit on a remote machine can be found here.

Findings & Remediation

📗 Baseline Test Observations

By default, OpenMRS runs with the default JVM memory management settings, which are not optimal for applications with large memory footprints. So we moved to CMS (Concurrent Mark Sweep), which gave us low GC pause times and higher throughput for minimal patient data - BAH-2660.
We also configured the minimum and maximum heap sizes and the number of parallel GC threads.
This change reduced the maximum time taken by the POST API call that saves encounters, in the 90-user test run, from 4149 ms to 1551 ms.
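
To put the before and after figures in relative terms, the improvement can be expressed as a percentage. The helper below is a small illustrative calculation; the function name is made up for this example.

```python
def percent_reduction(before_ms, after_ms):
    """Relative reduction of a response time, as a percentage of the original value."""
    if before_ms <= 0:
        raise ValueError("before_ms must be positive")
    return (before_ms - after_ms) / before_ms * 100


# Baseline figures quoted above: 4149 ms before tuning, 1551 ms after.
print(round(percent_reduction(4149, 1551), 1))  # prints 62.6
```

In other words, the maximum save-encounter time dropped by roughly 63 percent in that run.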

More details about the baseline test reports can be found here

📗 Long Duration Test Observations (24 hour test runs)

Saving the consultation page was slow because of a Groovy parse class function. Disabling that function reduced the response time for a single API call from 2.5 s to 1 s - BAH-2870.
The HIP health check module was pinging the OpenMRS patient and visit APIs every 5 seconds, which repeatedly brought the environment down with an out-of-memory exception once the patient count reached 125k - BAH-2441, BAH-2783 (this has been fixed). The fix reduced the maximum time taken by the POST API call that saves encounters, in the 70-user test run, from 60 s to 4 s.
The HIP and Crater atom feeds were also pinging OpenMRS to query the event feeds, causing long GC pauses that in turn caused spikes in CPU utilization - BAH-2801, BAH-2912.
Switching the GC strategy from CMS to G1GC helped control the CPU spikes.
Without the HIP and Crater atom feeds, and with the updated G1GC settings, the 99th percentile response time dropped to 1.5 s.
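
For readers unfamiliar with percentile figures, the sketch below shows one common definition, the nearest-rank method. It is an illustration only: the reporting tool used for these runs may compute percentiles differently, and the sample values in the usage note are made up.

```python
import math


def percentile_nearest_rank(samples_ms, p):
    """Return the p-th percentile of response times using the nearest-rank method.

    The result is the smallest sample such that at least p percent of
    all samples are less than or equal to it.
    """
    if not samples_ms:
        raise ValueError("at least one sample is required")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]
```

With the hypothetical samples `[100, 200, 300, 400, 5000]`, the 50th percentile is 300 while the 99th percentile is 5000, which is why a high percentile exposes slow outliers that an average would hide.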

More details about the JVM configurations, infra setup and long duration test runs can be read here: Bahmni Lite Performance Long Duration Simulation Baselining
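
As a rough illustration of the kind of JVM settings discussed in this section, the sketch below assembles a HotSpot option list for either collector. The flags themselves (-Xms, -Xmx, -XX:+UseConcMarkSweepGC, -XX:+UseG1GC, -XX:ParallelGCThreads) are standard HotSpot options, but the helper function and any values passed to it are hypothetical; the values actually used in the test runs are documented in the linked page, not here.

```python
def jvm_gc_options(collector, min_heap, max_heap, parallel_gc_threads):
    """Build a list of HotSpot options for the chosen garbage collector.

    collector is "cms" or "g1"; heap sizes use the usual JVM notation,
    for example "1g" or "2048m". All values are supplied by the caller.
    """
    collector_flags = {
        "cms": "-XX:+UseConcMarkSweepGC",
        "g1": "-XX:+UseG1GC",
    }
    if collector not in collector_flags:
        raise ValueError(f"unknown collector: {collector}")
    return [
        f"-Xms{min_heap}",  # minimum (initial) heap size
        f"-Xmx{max_heap}",  # maximum heap size
        collector_flags[collector],
        f"-XX:ParallelGCThreads={parallel_gc_threads}",
    ]
```

Joining the result with spaces, for example `" ".join(jvm_gc_options("g1", "1g", "2g", 4))` with placeholder values, yields a string that could be passed to the JVM through whatever options variable the deployment uses.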

Bahmni LITE Cost Estimates (Projected)

Based on the long-duration test results and the corresponding AWS utilization bills, we have come up with a cost calculator. Link: Bahmni LITE Infra-cost estimates (based on Performance testing)
Anyone can create a cloud cost estimate for setting up Bahmni LITE by providing the number of users, users-per-clinic, and operational hours.
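
The general shape of such an estimate can be sketched as follows. This is not the formula used by the linked calculator: the function, the days_per_month input, and the hourly_rate_per_clinic input are hypothetical, and any rate must come from your own AWS billing data.

```python
import math


def estimate_monthly_cost(users, users_per_clinic, hours_per_day,
                          days_per_month, hourly_rate_per_clinic):
    """Very rough monthly cloud cost from the three calculator inputs.

    Assumes cost scales linearly with the number of clinics and with
    the hours the environment is running (a simplifying assumption).
    """
    if users <= 0 or users_per_clinic <= 0:
        raise ValueError("users and users_per_clinic must be positive")
    clinics = math.ceil(users / users_per_clinic)   # partially filled clinics still count
    running_hours = hours_per_day * days_per_month  # operational hours per month
    return clinics * running_hours * hourly_rate_per_clinic
```

For instance, 50 users at 10 users per clinic, running 8 hours a day for 25 days at a placeholder rate of 0.5 per clinic-hour, comes to 500.0 in whatever currency the rate is expressed in.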

The assumed load pattern is the one used in our test suite. If the operations performed at your facility differ from the scenarios in the test suite, the results won’t match as-is. Please review the test scenarios to get a better understanding of the performance work done by the team.

Future Recommendations for Performance Testing

Troubleshoot / Improvement stories

Test the environment with multitenancy.
Update the test suite with the latest changes in the application - BAH-2903.
Reduce the impact of the HIP and Crater atom feeds on OpenMRS - BAH-2948.
Optimize the API response time - BAH-2871, BAH-2890, BAH-2891, BAH-2892, BAH-2893.
Optimize the application memory management - BAH-2949.
Optimize the duplicate SQL queries - BAH-2716.
Backlog stories list.

Bahmni Wiki

Bahmni Performance Testing Journey (High Level Summary)