This is a living document that captures baseline snapshots taken while:
- Troubleshooting and applying a patch
- Changing software or network configurations
- Adding new scenarios and changing load share
Source Code: https://github.com/Bahmni/performance-test
The performance environment ran on an AWS EKS cluster with a single node.
Node (EC2: m5-xlarge)
RAM 16GB
4 vCPU
100GB Secondary storage
AWS LINUX x86_64
The cluster ran 15 application pods in total, such as openmrs, bahmni-web, postgresql, rabbitmq, etc.
Database (AWS RDS service: db.t3.xlarge)
RAM 16GB
4 vCPU (2 core, 2.5 GHz Intel Scalable Processor)
100GB Secondary storage
MySQL, max_connections = 1304
OpenMRS Tomcat - Server
Server version: Apache Tomcat/7.0.94
Server built: Apr 10 2019 16:56:40 UTC
Server number: 7.0.94.0
OS Name: Linux
OS Version: 5.4.204-113.362.amzn2.x86_64
Architecture: amd64
JVM Version: 1.8.0_212-8u212-b01-1~deb9u1-b01
ThreadPool: Max 200, Min 25 (default server.xml)
OpenMRS - Heap
Initial Heap: 256 MB
Max Heap: 768 MB
-Xms256m -Xmx768m -XX:PermSize=256m -XX:MaxPermSize=512m
OpenMRS Connection Pooling
hibernate.c3p0.max_size=50
hibernate.c3p0.min_size=0
hibernate.c3p0.timeout=100
hibernate.c3p0.max_statements=0
hibernate.c3p0.idle_test_period=3000
hibernate.c3p0.acquire_increment=1
The performance test simulation (Gatling) is executed from the client machine.
Client (EC2: c5-xlarge)
RAM 8GB
4 vCPU
8GB storage
AWS LINUX x86_64
Network: 60 Mbps
Duration: 8 hours
Ramp Up: 5 mins
Database pre-state: 543 patients
OpenMRS JVM Configuration:
-Xms2048m -Xmx2048m -XX:NewSize=1024m -XX:MaxNewSize=1024m -XX:MetaspaceSize=768m -XX:MaxMetaspaceSize=768m -XX:InitialCodeCacheSize=64m -XX:ReservedCodeCacheSize=96m -XX:SurvivorRatio=16 -XX:TargetSurvivorRatio=50 -XX:MaxTenuringThreshold=15 -XX:+UseParNewGC -XX:ParallelGCThreads=16 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+CMSCompactWhenClearAllSoftRefs -XX:CMSInitiatingOccupancyFraction=85 -XX:+CMSScavengeBeforeRemark
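For reference, here is a minimal Gatling (Scala) sketch of this run's injection profile, assuming a 5-minute ramp to 40 concurrent users held for the 8-hour duration; the class name, host, and request below are placeholders, not the actual simulation code from the performance-test repo:

```scala
import scala.concurrent.duration._
import io.gatling.core.Predef._
import io.gatling.http.Predef._

// Hypothetical sketch: host and scenario body are placeholders,
// not the real Bahmni performance-test simulation.
class LongDurationSketch extends Simulation {
  val httpProtocol = http.baseUrl("https://example-bahmni-host") // placeholder host

  val frontdesk = scenario("Frontdesk").exec(http("home page").get("/")) // placeholder request

  setUp(
    frontdesk.inject(
      rampConcurrentUsers(0).to(40).during(5.minutes), // 5-minute ramp-up to 40 concurrent users
      constantConcurrentUsers(40).during(8.hours)      // hold 40 users for the 8-hour run
    )
  ).protocols(httpProtocol)
}
```

A closed (concurrent-users) injection model fits here because the runs are described in terms of a fixed number of concurrent users rather than an arrival rate.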
Report Link: https://bahmni.github.io/performance-test/longduration_report-20221124061027477_40users_8hrs/index.html
Report Observations:
| Simulations | Scenario | Load share | Patient Count | Min Time (ms) | 95th Percentile (ms) | 99th Percentile (ms) | Max Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Frontdesk 50% Traffic | New Patient Registration → Start OPD Visit | 40% | 1920 | 131 | 215 | 282 | 533 |
| | Existing Patient Search using ID → Start OPD Visit | 30% | 1440 | 32 | 91 | 167 | 245 |
| | Existing Patient Search using Name → Start OPD Visit | 20% | 1440 | 27 | 50 | 76 | 176 |
| | Upload Patient Document | 10% | 480 | 88 | 152 | 202 | 327 |
| Doctor 50% Traffic | Doctor Consultation | 100% | 1920 | 562 | 1065 | 1191 | 1592 |
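The load-share column gives the probability with which each frontdesk scenario is picked per iteration. In Gatling this kind of weighting is typically expressed with randomSwitch; the sketch below is a hypothetical illustration whose chain names and endpoints are placeholders, not the repo's actual code:

```scala
import io.gatling.core.Predef._
import io.gatling.http.Predef._

// Hypothetical sketch: endpoints and chain names are placeholders.
class LoadShareSketch extends Simulation {
  val httpProtocol = http.baseUrl("https://example-bahmni-host") // placeholder host

  val newPatientRegistration = exec(http("new patient registration").post("/register")) // placeholder
  val searchPatientById      = exec(http("search patient by id").get("/search-id"))     // placeholder
  val searchPatientByName    = exec(http("search patient by name").get("/search-name")) // placeholder
  val uploadPatientDocument  = exec(http("upload patient document").post("/upload"))    // placeholder

  // Weights mirror the load-share column above; they must sum to at most 100.
  val frontdesk = scenario("Frontdesk 50% Traffic")
    .randomSwitch(
      40.0 -> newPatientRegistration,
      30.0 -> searchPatientById,
      20.0 -> searchPatientByName,
      10.0 -> uploadPatientDocument
    )

  setUp(frontdesk.inject(atOnceUsers(1))).protocols(httpProtocol) // minimal injection for illustration
}
```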
Network: 60 Mbps
Duration: 8 hours
Ramp Up: 5 mins
Database pre-state: 2464 patients
OpenMRS JVM Configuration:
-Xms2048m -Xmx2048m -XX:NewSize=1024m -XX:MaxNewSize=1024m -XX:MetaspaceSize=768m -XX:MaxMetaspaceSize=768m -XX:InitialCodeCacheSize=64m -XX:ReservedCodeCacheSize=96m -XX:SurvivorRatio=16 -XX:TargetSurvivorRatio=50 -XX:MaxTenuringThreshold=15 -XX:+UseParNewGC -XX:ParallelGCThreads=16 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+CMSCompactWhenClearAllSoftRefs -XX:CMSInitiatingOccupancyFraction=85 -XX:+CMSScavengeBeforeRemark
Report Link: https://bahmni.github.io/performance-test/longduration_report-20221125061309057_50users_8hrs/index.html
Report Observations:
The execution completed successfully for the full 8 hours, but OpenMRS went down some time later, while the environment was idle, due to the same issue observed in the 70-concurrent-user, 8-hour run described below.
| Simulations | Scenario | Load share | Patient Count | Min Time (ms) | 95th Percentile (ms) | 99th Percentile (ms) | Max Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Frontdesk 50% Traffic | New Patient Registration → Start OPD Visit | 40% | 2440 | 130 | 226 | 295 | 478 |
| | Existing Patient Search using ID → Start OPD Visit | 30% | 1680 | 32 | 85 | 173 | 2272 |
| | Existing Patient Search using Name → Start OPD Visit | 20% | 1680 | 29 | 77 | 131 | 277 |
| | Upload Patient Document | 10% | 480 | 89 | 143 | 166 | 263 |
| Doctor 50% Traffic | Doctor Consultation | 100% | 2400 | 572 | 1199 | 1338 | 1716 |
Network: 60 Mbps
Duration: 8 hours
Ramp Up: 5 mins
Database pre-state: 8361 patients
OpenMRS JVM Configuration:
-Xms2048m -Xmx2048m -XX:NewSize=1024m -XX:MaxNewSize=1024m -XX:MetaspaceSize=768m -XX:MaxMetaspaceSize=768m -XX:InitialCodeCacheSize=64m -XX:ReservedCodeCacheSize=96m -XX:SurvivorRatio=16 -XX:TargetSurvivorRatio=50 -XX:MaxTenuringThreshold=15 -XX:+UseParNewGC -XX:ParallelGCThreads=16 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+CMSCompactWhenClearAllSoftRefs -XX:CMSInitiatingOccupancyFraction=85 -XX:+CMSScavengeBeforeRemark
Report Link: https://bahmni.github.io/performance-test/longduration_report-20221117114025035_70users_8hrs/index.html
Report Observations:
The test was up and running for 5 hours; then the OpenMRS application went down due to a “Java Heap - Out of Memory” error. The same issue was observed in every “70 concurrent users - 8 hours” test, with OpenMRS going down at varying points in time.
JVM Observation: Overall Heap Size: 1.94GB | CMS Old Gen Size: 1GB | Par Eden Space: 910MB. Note: all of the above memory pools were maxed out when OpenMRS went down.
After the above test failures, a complete analysis was performed across the Bahmni Lite services. It found that the health-check services implemented under HIU (ABDM) for OpenMRS were steadily piling up objects in heap memory, causing the application to crash once the heap reached its maximum allocation. This issue has since been fixed and is no longer reproducible.
Hardware: The performance environment ran on an AWS EKS cluster with a single node.
Node (EC2: t3-large)
Database (AWS RDS service: db.t3.xlarge)
Software:
OpenMRS Tomcat - Server
OpenMRS - Heap
OpenMRS Connection Pooling
Network: 60 Mbps
Duration: 24 hours
Ramp Up: 5 mins
Database pre-state: 75000 patient records
Report Observations:
| Simulations | Scenario | Load share | Patient Count | Min Time (ms) | 95th Percentile (ms) | 99th Percentile (ms) | Max Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Frontdesk 50% Traffic | New Patient Registration → Start OPD Visit | 40% | 5760 | 152 | 484 | 648 | 1389 |
| | Existing Patient Search using ID → Start OPD Visit | 30% | 4320 | 54 | 472 | 676 | 1977 |
| | Existing Patient Search using Name → Start OPD Visit | 20% | 4320 | 119 | 352 | 507 | 1492 |
| | Upload Patient Document | 10% | 1440 | 142 | 482 | 581 | 1135 |
| Doctor 50% Traffic | Doctor Consultation | 100% | 5760 | 1364 | 4056 | 4531 | 7291 |
Hardware: The performance environment ran on an AWS EKS cluster with a single node.
Node (EC2: m5-xlarge)
Database (AWS RDS service: db.t3.xlarge)
Software:
OpenMRS Tomcat - Server
OpenMRS - Heap
OpenMRS Connection Pooling
Network: 60 Mbps
Duration: 24 hours
Ramp Up: 5 mins
Database pre-state: 90500 patient records
Report Observations:
| Simulations | Scenario | Load share | Patient Count | Min Time (ms) | 95th Percentile (ms) | 99th Percentile (ms) | Max Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Frontdesk 50% Traffic | New Patient Registration → Start OPD Visit | 40% | 10080 | 130 | 252 | 309 | 674 |
| | Existing Patient Search using ID → Start OPD Visit | 30% | 7200 | 49 | 305 | 464 | 2660 |
| | Existing Patient Search using Name → Start OPD Visit | 20% | 7200 | 135 | 278 | 348 | 573 |
| | Upload Patient Document | 10% | 2160 | 111 | 206 | 253 | 459 |
| Doctor 50% Traffic | Doctor Consultation | 100% | 10080 | 998 | 2331 | 2608 | 4134 |
🔰 Observations:
Doctor Consultation - the maximum response times for this activity were quite high under both the 40 and 70 concurrent user tests. The services responsible for this activity are under analysis and will be prioritised for performance improvement.
After finding that the hip and crater atomfeed services were creating excessive objects at runtime and maxing out the Eden space, the following test was run without atomfeed and with the JVM switched to G1GC memory management.
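For illustration only, a typical G1GC flag set replacing the CMS collector flags above could look like the following (an assumed example; the flags actually used are the ones recorded under the "OpenMRS - G1GC Heap" configuration in the sections below):
-Xms2048m -Xmx2048m -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:MetaspaceSize=768m -XX:MaxMetaspaceSize=768m
(Illustrative values carried over from the earlier heap settings; not the recorded configuration.)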
Hardware: The performance environment ran on an AWS EKS cluster with a single node.
Node (EC2: m5-xlarge)
Database (AWS RDS service: db.t3.xlarge)
Software:
OpenMRS Tomcat - Server
OpenMRS - G1GC Heap
Network: 60 Mbps
Duration: 24 hours
Ramp Up: 5 mins
Database pre-state: 189500 patient records
Report Link: https://bahmni.github.io/performance-test/longduration_report-20230330120855322_100users_24_hrs_g1gc/index.html
Report Observations:
| Simulations | Scenario | Load share | Patient Count | Min Time (ms) | 95th Percentile (ms) | 99th Percentile (ms) | Max Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Frontdesk 50% Traffic | New Patient Registration → Start OPD Visit | 40% | 14400 | 67 | 260 | 373 | 1176 |
| | Existing Patient Search using ID → Start OPD Visit | 30% | 10800 | 26 | 198 | 309 | 812 |
| | Existing Patient Search using Name → Start OPD Visit | 20% | 10800 | 233 | 531 | 677 | 1612 |
| | Upload Patient Document | 10% | 3600 | 50 | 192 | 318 | 1203 |
| Doctor 50% Traffic | Doctor Consultation | 100% | 14400 | 406 | 1122 | 1500 | 1500 |
🔰 Observations:
Removing the hip and crater atomfeed pods and introducing G1GC memory management reduced both the overall response times and the CPU utilization.
Hardware: The performance environment ran on an AWS EKS cluster with a single node.
Node (EC2: m5-xlarge)
Database (AWS RDS service: db.t3.xlarge)
Software:
OpenMRS Tomcat - Server
OpenMRS - G1GC Heap
Network: 60 Mbps
Duration: 24 hours
Ramp Up: 5 mins
Database pre-state: 223722 patient records
Report Observations:
| Simulations | Scenario | Load share | Patient Count | Min Time (ms) | 95th Percentile (ms) | 99th Percentile (ms) | Max Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Frontdesk 50% Traffic | New Patient Registration → Start OPD Visit | 40% | 14400 | 121 | 256 | 364 | 1156 |
| | Existing Patient Search using ID → Start OPD Visit | 30% | 10800 | 67 | 170 | 276 | 846 |
| | Existing Patient Search using Name → Start OPD Visit | 20% | 10800 | 287 | 628 | 821 | 1734 |
| | Upload Patient Document | 10% | 3600 | 88 | 224 | 353 | 686 |
| Doctor 50% Traffic | Doctor Consultation | 100% | 14400 | 590 | 1497 | 1862 | 2855 |
Hardware: The performance environment ran on an AWS EKS cluster with a single node.
Node (EC2: m5-xlarge)
Database (AWS RDS service: db.t3.xlarge)
Software:
OpenMRS Tomcat - Server
OpenMRS - G1GC Heap
Network: 60 Mbps
Duration: 24 hours
Ramp Up: 5 mins
Database pre-state: 238182 patient records
Report Observations:
| Simulations | Scenario | Load share | Patient Count | Min Time (ms) | 95th Percentile (ms) | 99th Percentile (ms) | Max Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Frontdesk 50% Traffic | New Patient Registration → Start OPD Visit | 40% | 17280 | 62 | 195 | 304 | 752 |
| | Existing Patient Search using ID → Start OPD Visit | 30% | 12960 | 23 | 170 | 283 | 918 |
| | Existing Patient Search using Name → Start OPD Visit | 20% | 12960 | 287 | 630 | 785 | 1306 |
| | Upload Patient Document | 10% | 4320 | 46 | 163 | 264 | 595 |
| Doctor 50% Traffic | Doctor Consultation | 100% | 17280 | 414 | 1259 | 1592 | 3029 |
Note: The tests performed here place heavier demand on the system than real-world clinic activity does for Bahmni Lite users. It is therefore assumed safe to deploy for the suggested numbers of concurrent users on a cluster, even though the maximum response times for some activities are not yet optimal under test.