蜗牛壳 --Testing Tech Snippets

General Goal -> Finding performance bottlenecks and regressions by simply...

Running API-level tests
Measuring the key performance indicators
Analyzing the performance results and trends
Isolating external dependencies if needed (focus on your own code rather than anything outside your control)

In this wiki, we will adopt k6.io as the performance/load testing tool, since it is easy to set up and run locally. We will also build a complete monitoring system to visualize the test results together with essential JVM performance metrics. To isolate external dependencies, we will create a Docker-based mock service, so that we can control the pace and customize the response body to simulate different scenarios with minimum effort.

If you want to conduct scenario-based performance testing against an integration env such as staging, I would recommend JMeter, which is a more comprehensive and mature tool, but it is out of this wiki's scope. We are also not going to talk about stress tests, soak tests or capacity tests: they need a more standard env (production-like or equally scaled), a different test strategy, and a thoughtful plan focused on what we want to achieve through various experiments. The good thing is that once you understand the basics of performance testing, you will find it much easier to understand the other types of tests.

What I talk about when I talk about performance

My daily life in the performance engineering cycle:

Performance is a generic term, and it is difficult to give it a concrete definition from a single perspective. Performance issues can be caused by one or many factors; you may spend a lot of time finding the right pieces and clues, or even using educated guesses, to isolate the factors, prove your findings and resolve the issues. That's why performance issues are always hard, and why some nerds are so obsessed with troubleshooting performance problems.

Why Local Performance test? (AKA, Unit test for API Performance)

The local environment is a great treasure (any project that cannot easily be set up in a local environment should be retired, seriously)! It’s where we should be coding our load test scripts and where we should initiate our load tests from. Note that "local" here does not only refer to your own desktop or laptop, but to any environment that you fully control and can easily manage and change without any impact on others.

Pros:

Easy to control
Flexible to manage your dependencies
Easy to set up, and tests are cheap to run

Cons:

Hardware spec limitations
Hard to compare with previous baselines
Difficult to simulate complex scenarios

K6 Local Env Setup:

To install k6 on macOS, simply run the following command:

brew install k6

If you are using another OS to run the tests, please refer to the installation instructions in the k6 documentation.

Create a Simple API Test script using javascript and k6 lib:

API-level performance testing is supposed to be simple and straightforward, so that devs can run it easily and often whenever they make changes.

k6.io adopts JavaScript as its scripting language, with Go as its backbone. For detailed usage of k6.io, you can start with the k6 documentation.

In general, a k6 test script contains at least a few of these blocks:

import the libs used
define global const variables
define customized metrics/checks
define the test running configs
setup code function (setup()), runs only once for the whole test before the VUs start, e.g. to prepare test data (optional)
VU test code function, the scenario/steps for each VU
teardown code function (teardown()), runs only once for the whole test before the test ends/shuts down (optional)

To simplify what I mentioned above, we will use the following API test script as a template. It provides the essential elements and components to run a local perf test. For example, name your test script sample_script.js:

import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate } from 'k6/metrics';

const SLEEP_DURATION = 0.2; // upper bound of the think time, in seconds
const PROTOCOL = "https";
const HOST_NAME = "test-api.k6.io";

// Define custom metrics
let successRate = new Rate("check_success_rate");

// Test running configs
export let options = {
  discardResponseBodies: false,
  userAgent: 'MyK6UserAgentString/1.0',
  scenarios: {
    http_get_api_3RPS: {
      executor: 'constant-arrival-rate', // use open model instead of closed model
      rate: 3, // 3 RPS
      timeUnit: '1s',
      duration: '30s',
      preAllocatedVUs: 5,
      maxVUs: 15,
      startTime: '0s', // config stage tests
    },
    // Scenario names must be unique: a repeated key would silently replace the 3 RPS stage
    http_get_api_5RPS: {
      executor: 'constant-arrival-rate', // use open model instead of closed model
      rate: 5, // 5 RPS
      timeUnit: '1s',
      duration: '30s',
      preAllocatedVUs: 5,
      maxVUs: 15,
      startTime: '31s', // starts after the 3 RPS stage has finished
    },
  },
  thresholds: {
    http_req_duration: ['p(90) < 250'],
    'check_success_rate': [{
      threshold: 'rate > 0.95',
      abortOnFail: true,
      delayAbortEval: '15s',
    }],
  },
};

// Setup code: runs once before the VUs start; the return value is passed to teardown()
export function setup() {
  console.log("Init Testing..." + new Date().toLocaleString());
  return Date.now();
}

// VU test code
export default function() {
  // Send out the API request
  const response = http.get(`${PROTOCOL}://${HOST_NAME}/public/crocodiles/?format=json`, {
    cookies: { my_cookie: "123456" },
    headers: { 'X-MyHeader': "apitest" },
    timeout: "15s",
    compression: "gzip, deflate, br",
    tags: { name: 'APINAME--GET' },
  });

  // Assert the response
  const checkResp = check(response, { // can be a combination assertion
    "response code is 200": (resp) => resp.status === 200,
    "content is present": (resp) => resp.body.includes("Bert"),
  });
  successRate.add(checkResp);

  // Simulate the think time
  sleep(Math.random() * SLEEP_DURATION);
}

// TearDown code: runs once after all scenarios have finished
export function teardown(data) {
  console.log(`Test duration: ${Date.now() - data}ms`);
}
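With the open model (`constant-arrival-rate`), k6 needs enough VUs to keep up with the configured arrival rate. A rough way to sanity-check `preAllocatedVUs` and `maxVUs` is Little's Law. The sketch below is plain JavaScript; `estimateVUs` is a hypothetical helper (not part of the k6 API), and the response times passed in are assumptions, not measurements.

```javascript
// Hypothetical helper (not part of the k6 API): estimate how many VUs an
// open-model scenario needs, based on Little's Law:
//   iterations in flight ≈ arrival rate × time each iteration takes
function estimateVUs(ratePerSecond, responseTimeSeconds, sleepSeconds) {
  const iterationSeconds = responseTimeSeconds + sleepSeconds;
  // Round up, and always keep at least one VU available.
  return Math.max(1, Math.ceil(ratePerSecond * iterationSeconds));
}

// 5 RPS, an assumed 250 ms response time and up to 0.2 s of think time
console.log(estimateVUs(5, 0.25, 0.2)); // → 3

// Worst case for the 3 RPS stage if every request runs into the 15 s timeout
console.log(estimateVUs(3, 15, 0)); // → 45
```

By this estimate, 5 pre-allocated VUs are plenty while the API is healthy, but a ceiling of 15 VUs could be exhausted if responses degrade towards the timeout, in which case the target rate would no longer be reached.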

During the scripting phase, we prefer to parameterize the test data, so that we can avoid hitting caches and simulate real-world scenarios. The typical methods are described at https://k6.io/docs/examples/data-parameterization/ , or you can refer to a sample script I wrote in my git repo.
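As a minimal sketch of the idea, the row used by each request can be derived from the VU number and the iteration counter. The data set and the `pickRow` helper are hypothetical; in a real k6 script the rows would usually be loaded once in the init context (for example with `SharedArray` from `k6/data`), and the two arguments would be the built-in globals `__VU` and `__ITER`.

```javascript
// Hypothetical test data; a real script would typically load it from a file.
const users = [
  { id: 1, name: 'alice' },
  { id: 2, name: 'bob' },
  { id: 3, name: 'carol' },
];

// Hypothetical helper: spread the rows across VUs and iterations so that
// consecutive requests do not keep asking for the same (cached) record.
function pickRow(rows, vu, iter) {
  const index = (vu + iter) % rows.length;
  return rows[index];
}

// In k6 this would be called as pickRow(users, __VU, __ITER).
console.log(pickRow(users, 1, 0).name); // → bob
console.log(pickRow(users, 2, 1).name); // → alice
```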

For some use cases, the target API needs another API's output as its input; this is called correlation. For example, we can extract data from the previous API's response body and use it as the input parameter of the API we want to measure most. k6 can parse the response body and grab what you need for further steps (make sure the running config has discardResponseBodies: false). More examples of correlation: https://k6.io/docs/examples/correlation-and-dynamic-data/
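A minimal sketch of correlation in plain JavaScript: parse the body of a first response, pick a value from it, and compose the next request URL. The response body and the `buildDetailUrl` helper are hypothetical, and the detail URL layout is an assumption for illustration. Inside a k6 script the body would come from `response.body` (or be parsed directly with `response.json()`).

```javascript
// Hypothetical body of a "list" response.
const listBody = JSON.stringify([
  { id: 7, name: 'Bert' },
  { id: 8, name: 'Ed' },
]);

// Hypothetical helper: take the id of the first item in the list response
// and build the URL of the follow-up "detail" request from it.
function buildDetailUrl(baseUrl, body) {
  const items = JSON.parse(body);
  if (!Array.isArray(items) || items.length === 0) {
    return null; // nothing to correlate on
  }
  return `${baseUrl}/${items[0].id}/`;
}

console.log(buildDetailUrl('https://test-api.k6.io/public/crocodiles', listBody));
// → https://test-api.k6.io/public/crocodiles/7/
```

Returning `null` for an empty list lets the VU code skip the follow-up request instead of sending a malformed one.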

Recommendation: in local performance testing, we should avoid dependencies as much as possible, using mock services or generated "fake data" to remove them. Focus on your code and design first!

Once your test script is prepared, cd to your test script folder and execute the following command to run it locally. Usually you start with a smoke test to make sure your script has no errors or unexpected results:

k6 run sample_script.js

Once the script is ready for load testing, you can tweak the test running configs in the script, or override some critical configs through the CLI to meet your load target.
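One way to make the load target adjustable from the CLI is to read it from environment variables, which k6 exposes to the script as `__ENV` when they are passed with `-e`. The sketch below is plain JavaScript; `buildScenario` and the variable names `RATE` and `DURATION` are hypothetical choices, not k6 conventions.

```javascript
// Hypothetical helper: build one constant-arrival-rate scenario from
// environment variables, falling back to the defaults of the template.
function buildScenario(env) {
  return {
    executor: 'constant-arrival-rate',
    rate: Number(env.RATE || 3),       // requests per timeUnit
    timeUnit: '1s',
    duration: env.DURATION || '30s',
    preAllocatedVUs: 5,
    maxVUs: 15,
  };
}

// In a k6 script this would be called as buildScenario(__ENV).
console.log(buildScenario({ RATE: '10' }).rate);  // → 10
console.log(buildScenario({}).duration);          // → 30s
```

With such a helper in the script, a run like `k6 run -e RATE=10 -e DURATION=1m sample_script.js` would change the load without editing the file.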

After all, we want to smell our own API and gain confidence before we submit our commits and go to prod to monitor the API with something.

Some typical use case examples: https://k6.io/docs/examples/

K6 API documentations: https://k6.io/docs/javascript-api/

Test result visualization:

I prefer to use influxDB + Grafana to store and visualize the test results over time, so that you can easily notice changes and the moment things start to go wrong, and compare results from time to time.

Install influxDB on macOS. Currently k6 does not support influxDB 2.0, so we will keep using influxDB 1.8 until 2.0 support is added officially:

brew install influxdb@1

Start the influxDB instance locally (in background mode); by default it listens on port 8086 to exchange data:

brew services start influxdb@1
or
nohup /usr/local/opt/influxdb@1/bin/influxd &

Run the k6 test and store the test data in the local influxDB instance. In the following example, the "myk6db" database is created automatically:

k6 run --out influxdb=http://localhost:8086/myk6db sample_script.js

Install Grafana on Mac OS:

brew install grafana

Start Grafana service:

brew services start grafana

Access your local Grafana page at http://localhost:3000/ and enter admin for both username and password.

Next, add the influxDB myk6db data source and create your own dashboard to visualize the k6 test results:

(If you would like to add a Grafana panel plugin to build a fancy dashboard, you can download the plugin folder and drop it into the Grafana plugin folder: /usr/local/var/lib/grafana/plugins/)

I have defined a basic k6 test Grafana dashboard that anyone can import as a quick start; feel free to download it from my github repo.
P.S. I highly recommend running the baseline right before you make your changes, rather than comparing against an out-of-date "baseline", since things can change in a local env.

Monitoring Your (Java) application:

Monitoring a local Java process is easy nowadays. I recommend Java Mission Control (JMC) and Flight Recorder (JFR), which were developed by the Oracle Java team. You can download the latest version of JMC separately from here, and find out how to start JMC. The other option you may want to choose is VisualVM, one of my previous favorite monitoring tools for the JVM.

Configure the VM options of your Java application correctly: just make sure you copy the same JVM options that are currently in use in production for your own role.

If the application is newly developed, you can configure the options yourself or use the following simple template to get started. If GC overhead is the bottleneck, you have to revisit and tune the options. If GC throughput is over 99.5% (which means GC takes less than 0.5% of your whole test time), you normally do not need to bother with the JVM options. Keep the JVM options to a minimum, and make sure you fully understand the impact of an option before you add it.
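The 99.5% rule of thumb is simple arithmetic over two numbers you can read from your monitoring tool: the total GC pause time and the test duration. A small sketch, where `gcThroughputPercent` is a hypothetical helper and the sample figures are made up for illustration:

```javascript
// Hypothetical helper: GC throughput is the share of the test time that
// the application spent doing real work instead of being paused by GC.
function gcThroughputPercent(totalGcPauseMs, testDurationMs) {
  return (1 - totalGcPauseMs / testDurationMs) * 100;
}

// Made-up example: a 60 s test with 240 ms of accumulated GC pauses
const throughput = gcThroughputPercent(240, 60000);
console.log(throughput.toFixed(2)); // → 99.60

// Above the 99.5% mark, so GC tuning is normally not worth the effort here
console.log(throughput > 99.5); // → true
```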

G1 GC is my general recommendation if you are on JDK8+. However, if you are on JDK11+ (ZGC) or JDK12+ (Shenandoah), you may want to compare the newly added GC collectors with G1 GC. Assuming you have at least 16GB RAM on your local machine and you wish to size your Java heap at 4GB:

-Xms4096m
-Xmx4096m
-Xss256k
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200
-XX:+DisableExplicitGC
-XX:+UseStringDeduplication
-XX:+ParallelRefProcEnabled
-XX:MaxMetaspaceSize=512m
-Djava.rmi.server.hostname=192.168.0.xxx

P.S. The -Djava.rmi.server.hostname VM option needs to be added to your Java application so that JMC or VisualVM can connect to this host; otherwise, you may get the following error when trying to connect to the JMX server:

...
Caused by: java.rmi.ConnectException: Connection refused to host: <Some_else_IP>; nested exception is:
java.net.ConnectException: Operation timed out (Connection timed out)
...

Pay attention: if you are connected to your VPN, you might have a separate IP address to connect to. Run the following command on your local machine:

% ifconfig | grep "inet "

It will show the IP addresses you could use. If you cannot decide which one to use, try each of them until the connection succeeds.

How to use JMC to monitor, or JFR to profile and analyze, your Java application is out of this wiki's scope; please find out here. JFR requires additional VM options to enable it. Please make sure you do not enable the JFR VM options in a production env, since that needs an additional commercial license and adds some overhead to your services; alternatively, use the OpenJDK JMC and JFR for free (you need OpenJDK 11+).

The Key Java performance metrics you need to pay attention to:

JAVA CPU%
Machine CPU%
Heap Memory Usage/Footprint
Non-Heap Memory Usage/Footprint
GC throughput, GC timings and GC Frequency
Java Threads count/trend
JDBC Connection Stats
System Level performance metrics(collect separately, but on local testing, it is optional)

P.S. I highly recommend saving the key JMX metrics to influxDB during local testing, so you get a historical view and can compare how things change over time. You can use jmxtrans together with jmxtrans-output-influxdb to export important JVM metrics to influxDB and visualize them in Grafana.

Install jmxtrans on macOS:
brew install jmxtrans
Add the JVM options to expose the JMX port:
-Dcom.sun.management.jmxremote.port=9426
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
Add the JVM option to define the hostname for connections to the JMX server:
-Djava.rmi.server.hostname=<Local_IP_Address>

Define the jmxtrans configuration file and save it, for example, as "~/Tools/k6_Loadtest/jmxtrans_config/jmxconfig.json":

{
  "servers": [
    {
      "port": "9426",
      "host": "<Local_IP_Address>",
      "runPeriodSeconds": "10",
      "queries": [
        {
          "obj": "java.lang:type=Memory",
          "attr": [
            "HeapMemoryUsage",
            "NonHeapMemoryUsage"
          ],
          "resultAlias": "jvmMemory",
          "outputWriters": [
            {
              "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
              "url": "http://127.0.0.1:8086/",
              "username": "admin",
              "password": "admin",
              "database": "jmxDB",
              "tags": {
                "application": "demoApp"
              }
            }
          ]
        }
      ]
    }
  ]
}

Start the jmxtrans process:
/usr/local/opt/jmxtrans/bin/jmxtrans ~/Tools/k6_Loadtest/jmxtrans_config/jmxconfig.json
By default, jmxtrans collects the JMX metrics defined in the JSON config file once per minute. That is good enough for production monitoring, but for local performance testing we had better adjust it to one collection every 10 seconds for finer granularity (runPeriodSeconds in the config above). Once it is set up, it's time to create a Grafana dashboard that shows the JMX metrics together with the k6.io test data. It helps a ton to better understand your tests and the application under load.

Create a mock service (Optional but highly recommended):

Mocking an external service is quite helpful and will make your life much easier:

you save the time of finding a workable and stable environment;
you focus on your own code;
test results are more predictable and repeatable.

Since you are working on a local env that you fully manage, it is your choice whether to use an external mock service (such as Mockoon) or just comment out some of the code to make your test work. I would suggest simulating the remote connection as closely as possible, though, since that helps reproduce the threads, memory usage and network connections of real use cases.

In this section, I will create a dummy mock service using Docker/Golang and the Caddy HTTP server in order to simulate different REST API HTTP methods, payloads and response times.

The sample code is in my github repo for reference.

Preparations:

Install Docker (I am using v3.3.3): https://docs.docker.com/get-docker/
Install the Caddy v2 HTTP server: https://caddyserver.com/v2

How to build and run the dummy-mock service:

clone the github repo to your local machine
cd /path/to/target/dir/with/dockerfile
define your own response-GET.json and response-POST.json files
docker build . -t dummymock
docker images
docker run -d --rm -p 9091:80/tcp dummymock
/path/to/caddydir/caddy start (note: make sure the current dir contains the predefined Caddyfile, so that caddy loads the config file automatically)
Use postman or curl to try the mock service with your HTTP method and the duration you want the mock service to simulate, for example: http://localhost:8020/?duration=200

Note:

If you want to support the https protocol, you can dig into the caddy documentation and config.
By default, the mock does not support many JSON response payloads; if you need more, it is easy to extend by adding them to the source (main.go) and rebuilding it.

Fine Tuning your OS (Optional):

Make sure your desktop or laptop is not the bottleneck while running your performance test. If that is the case, consider fine-tuning your OS first; if nothing works, consider adopting dedicated load generators to help with the test. However, you then lose some flexibility in running tests, so it is a trade-off. Do remember to focus on your code first: no one cares if you do not even care.

Install xk6, the k6 extension modules(Optional):

Make sure you have Go installed
Download the xk6 binaries that are already compiled for your platform
Extract them to a local directory and go to that directory
If you are using macOS, right-click and open with Terminal to grant the permission to run xk6 locally
Select the xk6 extensions you want to try; for example, to run your k6 test with CSV parser functionality, use: https://github.com/szkiba/xk6-csv
Run the command to build your k6 with the selected extensions: xk6 build --with github.com/szkiba/xk6-csv
It will generate a new k6 binary in the same folder; run the test with the following command:
./k6 run test.js

//calculate page loading time by resource timing API
var perfEntries = performance.getEntries();
var end_probe = perfEntries.filter(function(item) {
  //page load finish request indicator
  if (item.name.indexOf('/req_url_end') > -1) {
    console.log(item.name);
    return true;
  }
  return false;
});
if (end_probe.length > 0) {
  var end_time = end_probe[0].responseEnd;
  var start_probe = perfEntries.filter(function(item) {
    //page load start indicator
    if (item.name.indexOf('/req_url_start') > -1) {
      console.log(item.name);
      return true;
    }
    return false;
  });
  //guard against a missing start request before reading its timing
  if (start_probe.length > 0) {
    var start_time = start_probe[0].startTime;
    var duration = end_time - start_time;
    console.log(duration);
  } else {
    console.log("page loading start indicator is not found, please double check all perf Entries");
  }
} else {
  console.log("page loading end indicator is not found, please double check all perf Entries");
}

A better way is to combine the User Timing API and the Resource Timing API to get accurate page performance. I am using the Nightmare APIs as a sample to automate the page tests, which can support a continuous page performance test process on a daily basis:


 var url = "http://www.yourhost.com/abc/def/";  
 var page_complete_idy = 'key_request_name'; //page indicator by resource name  
 var tag = 'ReviewPage';  
 var env_name = 'prod'; 

 const RENDER_TIME_MS = 2000;  

var Nightmare = require('nightmare'),
  nightmare = Nightmare({show: true, switches: {
    'ignore-certificate-errors': true
  }});

nightmare
  .goto(url)
  .wait(function(idy){
    var perfEntries = window.performance.getEntries();
    if (perfEntries.length > 20) {
      return perfEntries.some( function (item) {
        if (item.name.includes(idy)) {
          return true;
        }
      });
    } else {
      return false;
    }
  }, page_complete_idy)
  .evaluate(function(idy){
    var perfEntries = window.performance.getEntries();
    var perf_obj = perfEntries.find(function (item) {
      return item.name.includes(idy)
    });
    if (perf_obj) {
      return perf_obj.responseEnd.toFixed(1);
    } else {
      return 'undefined'
    }
  }, page_complete_idy)
  .end()
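  .then( function (duration) {
    console.log(tag + ":" + duration);
  })
  .catch( function (error) {
    // report navigation errors or wait timeouts instead of failing silently
    console.error(tag + " failed:", error);
  });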

Tuesday, January 04, 2022

API Performance Testing in k6 during the development phase
