Wednesday, October 18, 2017

NMT and Java Native memory leak

The memory used by a Java application process usually includes the JVM heap, non-heap areas (PermGen/Metaspace), and native memory, which covers JVM internals and native OS libraries. We noticed that our host ran out of physical memory after the application had been running for a while, even though heap usage looked fine and normal. Using the top command, we found the process had eaten almost all the physical memory, far more than the -Xmx size we had assigned to the heap. The first thought that came to my mind was that native memory might be in trouble...
But how can we confirm that the problem really points to native memory?
Use -XX:NativeMemoryTracking=summary to help (available since JDK 7u40).
After adding this option to your JVM startup config file and restarting the application, run jcmd as the user that owns the JVM process (you need root or sudo permission to switch to that user), for example:
sudo -u {UID} /opt/java/bin/jcmd {PID} VM.native_memory baseline

After the test has been running for a while, run another command to show the difference compared with the baseline:
sudo -u {UID} /opt/java/bin/jcmd {PID} VM.native_memory summary.diff

PS: if you do not use sudo -u {UID}, you may get exceptions like the following:

java.io.IOException: Operation not permitted

or
com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file: target process not responding or HotSpot VM not loaded
This gives you a high-level idea of whether you have a native memory leak or not, but not which object brings you the trouble. Since NMT doesn't track memory allocations made by non-JVM code, you can use jemalloc or pmap to detect memory leaks in native code. A few good posts for your reference: http://jenshadlich.blogspot.com/2016/08/find-native-memory-leaks-in-java.html
or http://lysu.github.io/blog/2015/02/02/how-to-deal-with-non-heap-or-native-memory-leak/
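
If you take the baseline and the diff repeatedly during a long test, a tiny wrapper can save some typing. The following is only a sketch in Node.js: the helper name nmtCommand, the user name and the PID are made up for illustration, and the jcmd path is simply the one used in the commands above.

```javascript
// Path copied from the commands in this post; adjust it for your own host.
const JCMD = '/opt/java/bin/jcmd';

// Build the argument list for one NMT request.
// `mode` is limited to the VM.native_memory sub-commands discussed above.
function nmtCommand(user, pid, mode) {
  const allowed = ['baseline', 'summary', 'summary.diff'];
  if (!allowed.includes(mode)) {
    throw new Error('unsupported NMT mode: ' + mode);
  }
  // jcmd has to run as the user that owns the JVM process, hence sudo -u.
  return ['sudo', '-u', String(user), JCMD, String(pid), 'VM.native_memory', mode];
}

// Hypothetical user and PID, printed only to show the resulting command line.
console.log(nmtCommand('appuser', 12345, 'baseline').join(' '));

// To really execute it, something like this should do:
//   const { spawnSync } = require('child_process');
//   const [cmd, ...args] = nmtCommand('appuser', 12345, 'summary.diff');
//   console.log(spawnSync(cmd, args, { encoding: 'utf8' }).stdout);
```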

Tuesday, March 07, 2017

Questionnaire Template for API Performance Review Process


  • API Owner: Product Owner, Dev Lead and QA Lead
  • Release Target Date
  • Business Impact(GMS or Save Cost), Impacted flows and User types
  • How much traffic expected During Peak hour(TPS/TPM/TPH)
  • API Name/EndPoint & Method/Sample Request & Response?
  • API priority based on its traffic and importance
  • High level Design and Workflow diagram for the API or API dependencies
  • How many Roles get deployed, and their .war names/versions
  • Existing or New API, If an existing API, any Monitoring Dashboard and Baseline captured against PROD?
  • JDBC queries
  • Third parties dependencies and End points
  • Firewall Ruleset/Gateway Synapse changes
  • Project Wiki page Link
  • Dev API testing plan/scripts/results for reference

Monday, November 07, 2016

Make a Performance budgeting chart for measuring page performance

You should have a performance goal before you measure your page performance. I would suggest setting a performance budget for each component, then fighting against whichever one has overdrawn its budget.
Use the Navigation Timing API, User Timing API and Resource Timing API to do the measurement, in both synthetic and RUM (real user monitoring) ways!

Besides measuring the duration of each component as the main KPI, the downloaded content size and the number of requests for each component need to be considered as well.
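
To make the budget idea concrete, here is a minimal sketch of a budget check over Resource Timing entries. It relies only on the standard entry fields name, startTime, responseEnd and transferSize; the budget numbers, the component names and the componentOf mapping are assumptions for illustration, so replace them with your own.

```javascript
// Hypothetical budgets per component: duration (ms), transferred bytes, request count.
const budgets = {
  header:  { durationMs: 300, bytes: 50 * 1024,  requests: 5 },
  gallery: { durationMs: 800, bytes: 400 * 1024, requests: 20 },
};

// Aggregate entries per component, then list every metric that is over budget.
// `componentOf` maps one entry to a component name (caller-supplied).
function checkBudget(entries, budgets, componentOf) {
  const usage = {};
  for (const e of entries) {
    const name = componentOf(e);
    if (!budgets[name]) continue; // not a budgeted component
    const u = usage[name] || (usage[name] = { start: Infinity, end: 0, bytes: 0, requests: 0 });
    u.start = Math.min(u.start, e.startTime);
    u.end = Math.max(u.end, e.responseEnd);
    u.bytes += e.transferSize || 0;
    u.requests += 1;
  }
  const overdrawn = [];
  for (const [name, u] of Object.entries(usage)) {
    const actual = { durationMs: u.end - u.start, bytes: u.bytes, requests: u.requests };
    for (const metric of Object.keys(budgets[name])) {
      if (actual[metric] > budgets[name][metric]) {
        overdrawn.push({ component: name, metric, budget: budgets[name][metric], actual: actual[metric] });
      }
    }
  }
  return overdrawn;
}

// In a browser: checkBudget(performance.getEntriesByType('resource'), budgets, yourMapping)
// Made-up entries so the sketch also runs in Node:
const sampleEntries = [
  { name: '/static/header/a.js',  startTime: 100, responseEnd: 350, transferSize: 1000 },
  { name: '/static/header/b.css', startTime: 120, responseEnd: 450, transferSize: 2000 },
  { name: '/static/gallery/1.jpg', startTime: 200, responseEnd: 500, transferSize: 1000 },
];
console.log(checkBudget(sampleEntries, budgets, e => e.name.split('/')[2]));
```

Here the duration of a component is taken as the span from its first request start to its last response end, which is only one possible definition.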

Here is another point of view to explain the key page performance metrics. Please remember that every page is designed differently, so you cannot set up one rule to fit all; thinking from the end user's perspective is always a great start:



Friday, May 20, 2016

Get Page performance result by Combining User and Resource timing API

 // Calculate page loading time with the Resource Timing API
 var perfEntries = performance.getEntries();
 // Return the first entry whose URL contains the marker, or undefined
 function findProbe(marker) {
   return perfEntries.filter(function(item) {
     return item.name.indexOf(marker) > -1;
   })[0];
 }
 var end_probe = findProbe('/req_url_end');     // page load finish request indicator
 var start_probe = findProbe('/req_url_start'); // page load start indicator
 if (end_probe && start_probe) {
   console.log(start_probe.name);
   console.log(end_probe.name);
   var duration = end_probe.responseEnd - start_probe.startTime;
   console.log(duration);
 } else {
   console.log("page loading start or end indicator is not found, please double check all perf entries");
 }
A better way is to combine the User Timing API and the Resource Timing API to get accurate page performance. I am using the Nightmare API as a sample for automated page tests, which can help our continuous page performance testing on a daily basis:

var Nightmare = require('nightmare');

var url = "http://www.yourhost.com/abc/def/";
var page_complete_idy = 'key_request_name'; // page-complete indicator, matched against resource names
var tag = 'ReviewPage';
var env_name = 'prod';       // not used in this snippet
const RENDER_TIME_MS = 2000; // not used in this snippet

var nightmare = Nightmare({
  show: true,
  switches: { 'ignore-certificate-errors': true }
});

nightmare
  .goto(url)
  // wait until the indicator request shows up in the performance entries
  .wait(function (idy) {
    var perfEntries = window.performance.getEntries();
    return perfEntries.length > 20 && perfEntries.some(function (item) {
      return item.name.includes(idy);
    });
  }, page_complete_idy)
  // read responseEnd of the indicator request (ms since navigation start)
  .evaluate(function (idy) {
    var perf_obj = window.performance.getEntries().find(function (item) {
      return item.name.includes(idy);
    });
    return perf_obj ? perf_obj.responseEnd.toFixed(1) : 'undefined';
  }, page_complete_idy)
  .end()
  .then(function (duration) {
    console.log(tag + ":" + duration);
  })
  .catch(function (error) {
    console.error(tag + " failed:", error);
  });
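
The scripts in this post read Resource Timing entries only. If the page itself can set a User Timing mark at the moment you consider the "start" (for example with performance.mark('page_start')), the two APIs can be combined roughly like this. It is a sketch only: the function name pageDuration and the mark name page_start are my assumptions, not part of any library.

```javascript
// entries:     the array returned by performance.getEntries() in the browser
// startMark:   name of a User Timing mark set via performance.mark(startMark)
// endResource: substring of the request URL that signals "page complete"
function pageDuration(entries, startMark, endResource) {
  const mark = entries.find(e => e.entryType === 'mark' && e.name === startMark);
  const done = entries
    .filter(e => e.entryType === 'resource' && e.name.includes(endResource))
    .pop(); // take the last matching request
  if (!mark || !done) return null; // one of the two probes is missing
  return done.responseEnd - mark.startTime;
}

// In a browser (or inside Nightmare's .evaluate()):
//   pageDuration(performance.getEntries(), 'page_start', 'key_request_name');
```

Both startTime of a mark and responseEnd of a resource are measured on the same timeline, so subtracting them is valid.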

Thursday, May 14, 2015

Replaying Your [access] log by JMeter


Replay your [Apache access] log with JMeter to mimic the real user load.
This is inspired by the BlazeMeter post "Learn How to Replay Your Production Traffic With JMeter", but I made my own optimizations and enhancements. Check it out if you are interested:

https://github.com/joychester/Doraemon

PS: the first row in the formatted log file is not replayed as a request, because it is only used to fetch the log start timestamp.
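
To illustrate the idea of using the first row as the time origin, here is a small sketch that turns access log lines into a replay schedule. It assumes the plain Apache Common Log Format, which is not necessarily the formatted log file Doraemon uses, and the function names parseLine and toSchedule are invented for this example.

```javascript
const MONTHS = { Jan: '01', Feb: '02', Mar: '03', Apr: '04', May: '05', Jun: '06',
                 Jul: '07', Aug: '08', Sep: '09', Oct: '10', Nov: '11', Dec: '12' };

// host ident user [dd/Mon/yyyy:HH:mm:ss +zzzz] "METHOD path protocol" ...
const CLF = /^(\S+) \S+ \S+ \[(\d{2})\/(\w{3})\/(\d{4}):(\d{2}:\d{2}:\d{2}) ([+-]\d{2})(\d{2})\] "(\S+) (\S+) [^"]*"/;

// Parse one log line into { timestamp (epoch ms), method, path }, or null if it does not match.
function parseLine(line) {
  const m = CLF.exec(line);
  if (!m) return null;
  const [, , day, mon, year, time, tzH, tzM, method, path] = m;
  const iso = `${year}-${MONTHS[mon]}-${day}T${time}${tzH}:${tzM}`;
  return { timestamp: Date.parse(iso), method, path };
}

// The first row only provides the start timestamp; every later row becomes
// a request with a delay relative to that start.
function toSchedule(lines) {
  const rows = lines.map(parseLine).filter(Boolean);
  if (rows.length === 0) return [];
  const t0 = rows[0].timestamp;
  return rows.slice(1).map(r => ({ delayMs: r.timestamp - t0, method: r.method, path: r.path }));
}

// Made-up log lines for illustration.
const sampleLog = [
  '10.0.0.1 - - [14/May/2015:10:00:00 +0800] "GET /start HTTP/1.1" 200 512',
  '10.0.0.2 - - [14/May/2015:10:00:02 +0800] "GET /a HTTP/1.1" 200 1024',
];
console.log(toSchedule(sampleLog));
```

A load tool can then fire each request after its delayMs to reproduce the original pacing.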

Wednesday, February 11, 2015

Upgrade Ruby version on your Mac OS X

Finally I got a 15'' MacBook Pro as my working laptop; it is kind of a Chinese New Year gift :)
First, the OS ships with Ruby 2.0, which is somewhat out of date, so upgrading to 2.2.0 was the first thing I did with my Mac.
I mainly followed an existing guide. However, it did not work properly on my machine, so here are my simplified steps:
  1. Install Homebrew: ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
  2. Install rbenv to help you manage Ruby versions: brew install rbenv ruby-build
  3. Add the rbenv shims to your PATH in ~/.bash_profile: export PATH="$HOME/.rbenv/shims:$PATH"
  4. Open Terminal -> Preferences -> Shell -> Startup and run: source ~/.bash_profile
  5. List all the versions you can install: rbenv install -l
  6. Install Ruby 2.2.0: rbenv install 2.2.0
  7. Set the global version to Ruby 2.2.0: rbenv global 2.2.0
  8. Check the Ruby version: ruby -v
Start to taste your Ruby and Mac!!


Update: I found this link, which has great steps and explanations for installing Ruby with rbenv:
https://cbednarski.com/articles/installing-ruby/

Thursday, January 08, 2015

Scaffold Code on my github based on Structuring Sinatra Web Application



It is based on Sinatra's modular style and is more structured (and cleaner) than my previous internal project, which I wrote in classic style in 2013. That is why I rewrote the code and made it a “framework”.

Check it out on my GitHub.

It is a demo app using jQuery, Highcharts, Bootstrap and Sinatra modular-style code, showing how to organize the code and the folder structure to use this "framework".

Thanks to, and inspired by:
Structuring Sinatra Applications
Structuring Sinatra Apps 

Online Editor c9.io 

 If you are using c9.io as your IDE, you can use the following script to grab my code:
 require 'git'
 require 'fileutils'
 require 'sys/proctable'

 git_repo = 'https://github.com/joychester/Arowana.git'
 target_dir = './arowana'
 # always start from a fresh clone
 FileUtils.rm_rf(target_dir) if Dir.exist?(target_dir)
 Git.clone(git_repo, target_dir)
 # run 'bundle install' and 'rackup config.ru' inside the cloned repo
 Dir.chdir(target_dir) do
   system('bundle install')
   # check whether the postgresql service is running
   pg_service = Sys::ProcTable.ps.select { |process|
     process.comm.to_s.include?('postgres')
   }
   if pg_service.empty?
     puts 'please check whether your postgresql service is running, exiting...'
     exit(1)
   else
     puts 'ready to start your Arowana App'
     # $PORT and $IP are environment variables, expanded by the shell
     system('rackup config.ru -p $PORT -o $IP')
   end
 end

Sunday, December 28, 2014

Automated WebPageTest using "snowboard"

I have pushed my project "snowboard" to my GitHub. Check it out to see whether it is helpful for your daily synthetic front-end performance tests:
https://github.com/joychester/snowboard

Thanks to WebPageTest, you can now request your own API key from: http://www.webpagetest.org/getkey.php

You are free to write your own dashboard or store the results in MongoDB, PostgreSQL, etc. for page trending and further analysis. You can also define your own page perception time from the filmstrip, which is an existing way to redefine the page load time for very dynamic web pages.


Monday, December 15, 2014

HTTP1.0 and HTTP1.1 Performance with KeepAlive enabled


A misconfiguration in Apache's ssl.conf, which had actually been there for years, recently gave me the chance to test the performance difference between HTTP1.1 and HTTP1.0 with KeepAlive on.

Pic1: HTTP1.1 with KeepAlive on, performance over time: stable and fast:

Pic2: HTTP1.0 with KeepAlive on, performance over time: up and down:

The current settings in ssl.conf, which make every IE user agent get HTTP 1.0 as the response protocol:

SetEnvIf User-Agent ".*MSIE.*" \
         nokeepalive ssl-unclean-shutdown \
         downgrade-1.0 force-response-1.0

To fix the issue, apply the rule only to IE 1-6, which may have issues, instead of to all IE user agents (it is said to be fixed already in the latest Apache version). The \. after the version digit keeps "MSIE 10" from matching as well:

SetEnvIf User-Agent ".*MSIE [1-6]\..*" \
         nokeepalive ssl-unclean-shutdown \
         downgrade-1.0 force-response-1.0

PS: I also tested with KeepAlive turned off. The response times of HTTP1.0 and HTTP1.1 are then similar, but 3-4 times slower than with KeepAlive on, due to the extra handshakes.
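
Before putting a User-Agent pattern into ssl.conf, it is worth checking it against a few real-looking strings. The sketch below uses JavaScript regular expressions as a stand-in for Apache's regex engine (close enough for this simple pattern) and shows that a bare "MSIE [1-6]" also catches IE 10, because "MSIE 10.0" starts with "MSIE 1". The User-Agent strings are typical examples, not taken from my logs.

```javascript
// Loose: any "MSIE " followed by a digit 1-6. Strict: the digit must be followed by a dot.
const loose = /MSIE [1-6]/;
const strict = /MSIE [1-6]\./;

// Illustrative User-Agent strings.
const agents = {
  ie6:  'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)',
  ie8:  'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)',
  ie10: 'Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)',
};

// Names of the agents a pattern would downgrade to HTTP 1.0.
function matches(pattern) {
  return Object.keys(agents).filter(name => pattern.test(agents[name]));
}

console.log('loose :', matches(loose));
console.log('strict:', matches(strict));
```

With the loose pattern IE 10 users would still be forced onto HTTP 1.0 without keepalive, which is exactly the slow path measured above.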