Bahmni Lite Performance Long Duration Simulation Baselining

This is a living document capturing baselining snapshots taken while:

  • Troubleshooting and applying a patch

  • Changing Software or Network configurations

  • Adding new scenarios and changing load share

Source Code: GitHub - Bahmni/performance-test

Automation Technology Stack

Base Configuration

Hardware

The performance environment ran on an AWS EKS cluster with a single node.

Node (EC2: m5.xlarge)

  • RAM 16GB

  • 4 vCPU

  • 100GB Secondary storage

  • AWS LINUX x86_64

The cluster ran a total of 15 application pods, including openmrs, bahmni-web, postgresql, and rabbitmq.

Database (AWS RDS service: db.t3.xlarge)

  • RAM 16GB

  • 4 vCPU (2 core, 2.5 GHz Intel Scalable Processor)

  • 100GB Secondary storage

  • MySQL, max_connections = 1304

Software

OpenMRS Tomcat Server

  • Server version: Apache Tomcat/7.0.94

  • Server built: Apr 10 2019 16:56:40 UTC

  • Server number: 7.0.94.0

  • OS Name: Linux

  • OS Version: 5.4.204-113.362.amzn2.x86_64

  • Architecture: amd64

  • JVM Version: 1.8.0_212-8u212-b01-1~deb9u1-b01

  • ThreadPool: Max 200, Min 25 (default server.xml)

OpenMRS - Heap

  • Initial Heap: 256 MB

  • Max Heap: 768 MB

-Xms256m -Xmx768m -XX:PermSize=256m -XX:MaxPermSize=512m

OpenMRS Connection Pooling

hibernate.c3p0.max_size=50
hibernate.c3p0.min_size=0
hibernate.c3p0.timeout=100
hibernate.c3p0.max_statements=0
hibernate.c3p0.idle_test_period=3000
hibernate.c3p0.acquire_increment=1
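For reference, a small sketch restating these pool settings with their standard c3p0 meanings, checked against the MySQL connection limit quoted above. The dictionary form is illustrative only, not how OpenMRS actually loads the properties:

```python
# Sketch: the c3p0 pool settings above with their standard meanings as comments.
MYSQL_MAX_CONNECTIONS = 1304  # RDS max_connections from the database configuration above

c3p0 = {
    "hibernate.c3p0.max_size": 50,           # maximum pooled connections
    "hibernate.c3p0.min_size": 0,            # pool may shrink to empty when idle
    "hibernate.c3p0.timeout": 100,           # idle connections expire after 100 s
    "hibernate.c3p0.max_statements": 0,      # statement caching disabled
    "hibernate.c3p0.idle_test_period": 3000, # validate idle connections every 3000 s
    "hibernate.c3p0.acquire_increment": 1,   # grow the pool one connection at a time
}

# A single OpenMRS instance can never exhaust the database connection limit:
assert c3p0["hibernate.c3p0.max_size"] <= MYSQL_MAX_CONNECTIONS
```

Note that the 50-connection cap is well below both the 200 Tomcat worker threads and the 1304-connection database limit.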

Client Configuration

The performance test simulation (Gatling) is executed on the client machine.

Client (EC2: c5.xlarge)

  • RAM 8GB

  • 4 vCPU

  • 8GB storage

  • AWS LINUX x86_64


📗 40 Concurrent Users - 8 Hours

  • Network: 60 Mbps

  • Ramp Up: 5 mins

  • Database pre-state: 543 Patients
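As a back-of-the-envelope sketch (it is an assumption here that the load shares in the table below map directly onto concurrent users), the 40 users would be apportioned across scenarios as follows:

```python
# Illustrative only: split 40 concurrent users by the documented load shares.
# Frontdesk and Doctor each carry 50% of traffic; Frontdesk is subdivided further.
TOTAL_USERS = 40

frontdesk_shares = {
    "New Patient Registration Start OPD Visit": 0.40,
    "Existing Patient Search using ID Start OPD Visit": 0.30,
    "Existing Patient Search using Name Start OPD Visit": 0.20,
    "Upload Patient Document": 0.10,
}

frontdesk_users = {
    name: TOTAL_USERS * 0.5 * share for name, share in frontdesk_shares.items()
}
doctor_users = TOTAL_USERS * 0.5  # Doctor Consultation carries 100% of its half

# The split must account for every concurrent user.
assert abs(sum(frontdesk_users.values()) + doctor_users - TOTAL_USERS) < 1e-9
```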

OpenMRS JVM Configuration:

-Xms2048m -Xmx2048m -XX:NewSize=1024m -XX:MaxNewSize=1024m -XX:MetaspaceSize=768m -XX:MaxMetaspaceSize=768m -XX:InitialCodeCacheSize=64m -XX:ReservedCodeCacheSize=96m -XX:SurvivorRatio=16 -XX:TargetSurvivorRatio=50 -XX:MaxTenuringThreshold=15 -XX:+UseParNewGC -XX:ParallelGCThreads=16 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+CMSCompactWhenClearAllSoftRefs -XX:CMSInitiatingOccupancyFraction=85 -XX:+CMSScavengeBeforeRemark

Report Link: https://bahmni.github.io/performance-test/longduration_report-20221124061027477_40users_8hrs/index.html

Report Observations:

Needs to be ANALYSED

| Simulations | Scenario | Load share | Patient Count | Min Time (ms) | 95th Percentile (ms) | 99th Percentile (ms) | Max Time (ms) |
|---|---|---|---|---|---|---|---|
| Frontdesk (50% Traffic) | New Patient Registration Start OPD Visit | 40% | 1920 | 131 | 215 | 282 | 533 |
| Frontdesk (50% Traffic) | Existing Patient Search using ID Start OPD Visit | 30% | 1440 | 32 | 91 | 167 | 245 |
| Frontdesk (50% Traffic) | Existing Patient Search using Name Start OPD Visit | 20% | 1440 | 27 | 50 | 76 | 176 |
| Frontdesk (50% Traffic) | Upload Patient Document | 10% | 480 | 88 | 152 | 202 | 327 |
| Doctor (50% Traffic) | Doctor Consultation (8 Observations, 2 Lab Orders, 3 Medication) | 100% | 1920 | 562 | 1065 | 1191 | 1592 |

 

📗 50 Concurrent Users - 8 Hours

  • Network: 60 Mbps

  • Ramp Up: 5 mins

  • Database pre-state: 2464 Patients

OpenMRS JVM Configuration:

-Xms2048m -Xmx2048m -XX:NewSize=1024m -XX:MaxNewSize=1024m -XX:MetaspaceSize=768m -XX:MaxMetaspaceSize=768m -XX:InitialCodeCacheSize=64m -XX:ReservedCodeCacheSize=96m -XX:SurvivorRatio=16 -XX:TargetSurvivorRatio=50 -XX:MaxTenuringThreshold=15 -XX:+UseParNewGC -XX:ParallelGCThreads=16 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+CMSCompactWhenClearAllSoftRefs -XX:CMSInitiatingOccupancyFraction=85 -XX:+CMSScavengeBeforeRemark

Report Link: https://bahmni.github.io/performance-test/longduration_report-20221125061309057_50users_8hrs/index.html

Report Observations:

The execution ran successfully for the full 8 hours. However, OpenMRS went down some time later while the environment was idle, due to the same issue observed in the 70-concurrent-user, 8-hour run below.

Needs to be monitored

Needs to be ANALYSED

| Simulations | Scenario | Load share | Patient Count | Min Time (ms) | 95th Percentile (ms) | 99th Percentile (ms) | Max Time (ms) |
|---|---|---|---|---|---|---|---|
| Frontdesk (50% Traffic) | New Patient Registration Start OPD Visit | 40% | 2440 | 130 | 226 | 295 | 478 |
| Frontdesk (50% Traffic) | Existing Patient Search using ID Start OPD Visit | 30% | 1680 | 32 | 85 | 173 | 2272 |
| Frontdesk (50% Traffic) | Existing Patient Search using Name Start OPD Visit | 20% | 1680 | 29 | 77 | 131 | 277 |
| Frontdesk (50% Traffic) | Upload Patient Document | 10% | 480 | 89 | 143 | 166 | 263 |
| Doctor (50% Traffic) | Doctor Consultation (8 Observations, 2 Lab Orders, 3 Medication) | 100% | 2400 | 572 | 1199 | 1338 | 1716 |

 

📕 70 Concurrent Users - 8 Hours

  • Network: 60 Mbps

  • Ramp Up: 5 mins

  • Database pre-state: 8361 Patients

OpenMRS JVM Configuration:

-Xms2048m -Xmx2048m -XX:NewSize=1024m -XX:MaxNewSize=1024m -XX:MetaspaceSize=768m -XX:MaxMetaspaceSize=768m -XX:InitialCodeCacheSize=64m -XX:ReservedCodeCacheSize=96m -XX:SurvivorRatio=16 -XX:TargetSurvivorRatio=50 -XX:MaxTenuringThreshold=15 -XX:+UseParNewGC -XX:ParallelGCThreads=16 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+CMSCompactWhenClearAllSoftRefs -XX:CMSInitiatingOccupancyFraction=85 -XX:+CMSScavengeBeforeRemark

Report Link: https://bahmni.github.io/performance-test/longduration_report-20221117114025035_70users_8hrs/index.html

Report Observations:

The test was up and running for 5 hours, after which the OpenMRS application went down due to a “Java Heap - Out of Memory” error. The same issue was observed in every “70 concurrent users - 8 hours” test, with OpenMRS going down at varying time intervals.

JVM Observation:

Overall Heap Size: 1.94GB | CMS Old Gen Size: 1GB | Par Eden Space: 910MB

*Note: All of the above memory regions were maxed out when OpenMRS went down.
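The Par Eden Space figure follows directly from the JVM flags used in this run. A quick sanity check, assuming the standard HotSpot young-generation layout (eden plus two survivor spaces):

```python
# Sanity check: derive Par Eden Space from this run's JVM flags.
# -XX:SurvivorRatio gives the eden : single-survivor-space ratio;
# young gen = eden + 2 survivor spaces.
new_size_mb = 1024   # -XX:NewSize=1024m / -XX:MaxNewSize=1024m
survivor_ratio = 16  # -XX:SurvivorRatio=16

survivor_mb = new_size_mb / (survivor_ratio + 2)
eden_mb = survivor_mb * survivor_ratio
print(f"Eden ≈ {eden_mb:.0f} MB")  # ≈ 910 MB, matching the observation above
```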


🟣 Tests after HIU(ABDM) Fix

After the above test failures, a complete analysis was performed across the Bahmni Lite services. The health-check services implemented under HIU (ABDM) for OpenMRS were found to be steadily filling up the heap, causing the application to crash once it reached the maximum allocation. The issue has since been fixed and is no longer reproducible.

📗 40 Concurrent Users - 24 Hours

 

Hardware

The performance environment ran on an AWS EKS cluster with a single node.

Node (EC2: t3.large)

  • RAM 8GB

  • 2 vCPU

  • 100GB Secondary storage

  • AWS LINUX x86_64

The cluster ran a total of 20 application pods, including openmrs, bahmni-web, postgresql, and abdm.

Database (AWS RDS service: db.t3.xlarge)

  • RAM 16GB

  • 4 vCPU (2 core, 2.5 GHz Intel Scalable Processor)

  • 100GB Secondary storage

  • MySQL, max_connections = 1304

Software

OpenMRS Tomcat Server

  • Server version: Apache Tomcat/7.0.94

  • Server built: Apr 10 2019 16:56:40 UTC

  • Server number: 7.0.94.0

  • OS Name: Linux

  • OS Version: 5.4.204-113.362.amzn2.x86_64

  • Architecture: amd64

  • JVM Version: 1.8.0_212-8u212-b01-1~deb9u1-b01

  • ThreadPool: Max 200, Min 25 (default server.xml)

OpenMRS - Heap

  • Initial Heap: 1024 MB

  • Max Heap: 1536 MB

-Xms1024m -Xmx1536m -XX:NewSize=512m -XX:MaxNewSize=512m -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=1024m -XX:InitialCodeCacheSize=64m -XX:ReservedCodeCacheSize=96m -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=40 -XX:+UseParNewGC -XX:ParallelGCThreads=2 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+CMSCompactWhenClearAllSoftRefs -XX:CMSInitiatingOccupancyFraction=85 -XX:+CMSScavengeBeforeRemark -XX:+UseGCOverheadLimit -XX:+UseStringDeduplication

OpenMRS Connection Pooling

hibernate.c3p0.max_size=50
hibernate.c3p0.min_size=0
hibernate.c3p0.timeout=100
hibernate.c3p0.max_statements=0
hibernate.c3p0.idle_test_period=3000
hibernate.c3p0.acquire_increment=1

Report

  • Network: 60 Mbps

  • Duration: 24 hours

  • Ramp Up: 5 mins

  • Database pre-state: 75000 patient records

Report Link: https://bahmni.github.io/performance-test/longduration_report-20230130141239257_40users_24hrs_all_omods_afterhipfix/index.html

Report Observations:

Needs to be ANALYSED

| Simulations | Scenario | Load share | Patient Count | Min Time (ms) | 95th Percentile (ms) | 99th Percentile (ms) | Max Time (ms) |
|---|---|---|---|---|---|---|---|
| Frontdesk (50% Traffic) | New Patient Registration Start OPD Visit | 40% | 5760 | 152 | 484 | 648 | 1389 |
| Frontdesk (50% Traffic) | Existing Patient Search using ID Start OPD Visit | 30% | 4320 | 54 | 472 | 676 | 1977 |
| Frontdesk (50% Traffic) | Existing Patient Search using Name Start OPD Visit | 20% | 4320 | 119 | 352 | 507 | 1492 |
| Frontdesk (50% Traffic) | Upload Patient Document | 10% | 1440 | 142 | 482 | 581 | 1135 |
| Doctor (50% Traffic) | Doctor Consultation (8 Observations, 2 Lab Orders, 3 Medication) | 100% | 5760 | 1364 | 4056 | 4531 | 7291 |

📗 70 Concurrent Users - 24 Hours

 

Hardware

The performance environment ran on an AWS EKS cluster with a single node.

Node (EC2: m5.xlarge)

  • RAM 16GB

  • 4 vCPU

  • 100GB Secondary storage

  • AWS LINUX x86_64

The cluster ran a total of 20 application pods, including openmrs, bahmni-web, postgresql, and abdm.

Database (AWS RDS service: db.t3.xlarge)

  • RAM 16GB

  • 4 vCPU (2 core, 2.5 GHz Intel Scalable Processor)

  • 100GB Secondary storage

  • MySQL, max_connections = 1304

Software

OpenMRS Tomcat Server

  • Server version: Apache Tomcat/7.0.94

  • Server built: Apr 10 2019 16:56:40 UTC

  • Server number: 7.0.94.0

  • OS Name: Linux

  • OS Version: 5.4.204-113.362.amzn2.x86_64

  • Architecture: amd64

  • JVM Version: 1.8.0_212-8u212-b01-1~deb9u1-b01

  • ThreadPool: Max 200, Min 25 (default server.xml)

OpenMRS - Heap

  • Initial Heap: 1024 MB

  • Max Heap: 2536 MB

-Xms1024m -Xmx2536m -XX:NewSize=512m -XX:MaxNewSize=512m -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=1024m -XX:InitialCodeCacheSize=64m -XX:ReservedCodeCacheSize=96m -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=40 -XX:+UseParNewGC -XX:ParallelGCThreads=2 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+CMSCompactWhenClearAllSoftRefs -XX:CMSInitiatingOccupancyFraction=85 -XX:+CMSScavengeBeforeRemark -XX:+UseGCOverheadLimit -XX:+UseStringDeduplication

OpenMRS Connection Pooling

hibernate.c3p0.max_size=50
hibernate.c3p0.min_size=0
hibernate.c3p0.timeout=100
hibernate.c3p0.max_statements=0
hibernate.c3p0.idle_test_period=3000
hibernate.c3p0.acquire_increment=1

Report

  • Network: 60 Mbps

  • Duration: 24 hours

  • Ramp Up: 5 mins

  • Database pre-state: 90500 patient records

Report Link: https://bahmni.github.io/performance-test/longduration_report-20230213133118638_70users_24hours_AfterHIPfix_m5xlarge/index.html

Report Observations:

Needs to be ANALYSED

| Simulations | Scenario | Load share | Patient Count | Min Time (ms) | 95th Percentile (ms) | 99th Percentile (ms) | Max Time (ms) |
|---|---|---|---|---|---|---|---|
| Frontdesk (50% Traffic) | New Patient Registration Start OPD Visit | 40% | 10080 | 130 | 252 | 309 | 674 |

The Bahmni documentation is licensed under Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)