蜗牛壳 --Testing Tech Snippets: November 2011

Monday, November 28, 2011

Dealing with the "Not enough storage is available to process this command" error using PsExec

While using BTrace to profile a JVM started by JavaServiceWrapper, I ran into this annoying error:

CMD> C:\btrace\bin>btrace 3272 Test.java

Not enough storage is available to process this command

The workaround is to launch the tool through PsExec:

1> Log in as an admin user (remotely or locally).
2> Download PsTools, which includes PsExec.exe. In Microsoft's words, "PsExec is a light-weight telnet-replacement that lets you execute processes on other systems, complete with full interactivity for console applications, without having to manually install client software". You can also download it directly from: http://technet.microsoft.com/en-us/sysinternals/bb897553
3> Get the PID of your target java.exe with Process Explorer, which you can download from: http://technet.microsoft.com/en-us/sysinternals/bb896653
4> Run your command line again through PsExec (the -s option runs the process under the System account):

CMD> D:\ProcessExplorer\PsTools>PsExec.exe -s "C:\btrace\bin\btrace.bat" 3272 C:\Test.java

This is not specific to BTrace: other external tools can be run the same way to avoid this weird error.

Monday, November 21, 2011

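How to calculate message processing time in a message queue

Problem: for a message that enters the queue at a given moment, calculate how long it takes to leave the queue, EL.

Definitions:

S is the enqueue rate (it depends on the load, so it varies across time periods)

P is the dequeue throughput, which can usually be treated as a constant for a given system configuration

TK is the moment K

TN is the moment at which the message that entered the queue at moment K leaves the queue

QD is the number of messages remaining in the queue at the initial moment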

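Case 1> Assume S is constant over a given period

When S is always <= P, no messages remain piled up after a while, so this case is not considered here

When S > P, the queue depth keeps growing

By moment TK, S * TK messages have been enqueued on top of the initial QD, and all of them have to be dequeued at rate P before the message leaves the queue

So, TN = (S * TK)/P + QD/P
EL = TN - TK
EL = (S * TK)/P + QD/P - TK = (S/P - 1) * TK + QD/P

where S and P are both fixed values that can be measured

(S and P can usually be measured with a SQL statement, for example: select count(*) from dbo.myqueue ms where ms.timestamp = ctime)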
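The case 1 formula can be checked with a few lines of Ruby. This is only a sketch: the method name `queue_latency`, its keyword arguments, and the numbers in the example are assumptions made up for illustration, with rates in messages per second and times in seconds.

```ruby
# Latency EL of a message entering the queue at moment tk, assuming a
# constant enqueue rate (S) and a constant dequeue throughput (P).
def queue_latency(enqueue_rate:, dequeue_rate:, tk:, qd:)
  raise ArgumentError, "dequeue_rate must be positive" unless dequeue_rate.positive?

  tn = (enqueue_rate * tk + qd) / dequeue_rate.to_f # TN = (S * TK)/P + QD/P
  tn - tk                                           # EL = TN - TK
end

# Hypothetical load: 120 msg/s in, 100 msg/s out, 500 messages already queued.
el = queue_latency(enqueue_rate: 120, dequeue_rate: 100, tk: 600, qd: 500)
puts format("EL = %.1f s", el) # prints "EL = 125.0 s"
```

The same result follows from the simplified form: (120/100 - 1) * 600 + 500/100 = 120 + 5 = 125.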

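Case 2> Assume S varies continuously as a sine function (the slope of the cumulative arrival curve at any moment is the instantaneous enqueue rate S'); the formula then becomes:
EL = (∫ sin(cx)dx)/P + QD/P - TK, where the integral runs from 0 to TK

S therefore has to be sampled continuously, and curve fitting used to determine which function it follows; whatever the case, a similar model can be applied to this kind of problem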
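When the enqueue rate has only been sampled and not yet fitted to a function, the integral can be approximated directly from the samples. The sketch below swaps curve fitting for the trapezoidal rule over (time, rate) pairs; the method names and the sample values are hypothetical, and the sample times are assumed to be strictly increasing.

```ruby
# Approximate the number of messages enqueued between 0 and tk from
# [time, rate] samples, using the trapezoidal rule.
def arrivals_until(samples, tk)
  samples.each_cons(2).sum do |(t0, r0), (t1, r1)|
    next 0.0 if t0 >= tk

    t_end = [t1, tk].min
    # Linearly interpolate the rate at t_end when tk falls inside the segment.
    r_end = r0 + (r1 - r0) * (t_end - t0) / (t1 - t0).to_f
    (r0 + r_end) / 2.0 * (t_end - t0)
  end
end

# EL = (arrivals in [0, TK])/P + QD/P - TK
def queue_latency_variable(samples, dequeue_rate:, tk:, qd:)
  (arrivals_until(samples, tk) + qd) / dequeue_rate.to_f - tk
end

# Hypothetical samples: the rate climbs from 100 to 140 msg/s and falls back.
samples = [[0, 100], [300, 140], [600, 100]]
puts queue_latency_variable(samples, dequeue_rate: 100, tk: 450, qd: 500)
```

With a constant rate in every sample this reduces to the case 1 formula, which makes a convenient sanity check.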

Remove the value of the param in URL

URLs should be normalized before you aggregate them for further analysis. Here is Ruby code to cope with a URL that carries parameters:

 string1 = "http://www.google.com.hk/search?sclient=psy-ab&hl=en&newwindow=1&safe=strict&biw=1440&bih=693&noj=1&source=hp&q=wicket+1.5+URL&oq=wicket+1.5+URL&aq=f&aqi=g2&aql=&gs_sm=e&gs_upl=3231l4087l0l4226l4l4l0l0l0l2l302l832l2-2.1l3l0"
 if string1.include?("?")
   # Append a sentinel "&" so that the last value is also followed by "&",
   # then replace every "=value&" with "&"
   stripped = (string1 + "&").gsub(/=.*?&/, "&")
   # Drop the sentinel again before printing
   puts stripped.chomp("&")
 end

The output:

 http://www.google.com.hk/search?sclient&hl&newwindow&safe&biw&bih&noj&source&q&oq&aq&aqi&aql&gs_sm&gs_upl
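As a possible alternative, the same normalization can be sketched with Ruby's standard `uri` library, which rewrites only the query component and leaves the rest of the URL alone. The helper name `strip_param_values` and the example URLs are made up for illustration; the second half shows the kind of aggregation the normalized URLs are meant for.

```ruby
require "uri"

# Hypothetical helper: keep the query parameter names, drop their values.
def strip_param_values(url)
  uri = URI.parse(url)
  return url if uri.query.nil? # no query string, nothing to strip

  names = uri.query.split("&").reject(&:empty?).map { |pair| pair.split("=", 2).first }
  uri.query = names.join("&")
  uri.to_s
end

# Made-up URLs: the first two differ only in their parameter values.
urls = [
  "http://example.com/search?q=wicket+1.5+URL&hl=en",
  "http://example.com/search?q=btrace&hl=zh",
  "http://example.com/about"
]

# Count hits per normalized URL.
counts = urls.each_with_object(Hash.new(0)) { |u, acc| acc[strip_param_values(u)] += 1 }
counts.each { |url, n| puts "#{n} #{url}" }
```

After normalization the two search URLs collapse into a single key, so they are counted together.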

Tuesday, November 15, 2011

Essential Counters for Performance Monitoring

The goals of performance monitoring that I always keep in mind are:

- Health indicator: it shows whether your application farm is currently in good shape, and lets you predict trends;
- Bottleneck detection: it reveals the bottleneck of your system from a high level, which gives engineers a starting point for fixing the issue;
- Troubleshooting: the counters point us in the right direction from different angles to find the root cause, instead of relying on a "smart guess";
- Operations management: capacity planning and risk management; try to resolve performance or stability issues before they occur!

Counters for OS		Explanation
	Server Uptime	elapsed time since the server last started
	Processor--Total CPU%	the percentage of elapsed time that the processor spends executing non-idle threads.
	Processor--% User Time	the percentage of elapsed time the processor spends in user mode.
	System--Processor Queue Length	the number of threads in the processor queue. There is a single queue for processor time even on computers with multiple processors. Therefore, if a computer has multiple processors, you need to divide this value by the number of processors servicing the workload. A sustained processor queue of fewer than 10 threads per processor is normally acceptable, depending on the workload.
	Memory--% MEM in use	the ratio of Memory\Committed Bytes to Memory\Commit Limit.
	Memory--pages/sec	the rate at which pages are read from or written to disk to resolve hard page faults.
	Disk Space--DISK C:	the percentage of Used space on Disk C
	Disk Space--DISK D:	the percentage of Used space on Disk D
	Disk Space--DISK E:	the percentage of Used space on Disk E
	DISK IO --Avg. Disk Queue Length	the average number of both read and write requests that were queued for the selected disk during the sample interval.
	DISK IO --% Disk Time	the percentage of elapsed time that the selected disk drive was busy servicing read or write requests.
	DISK IO --Avg. Disk sec/Read	the average time, in seconds, of a read of data from the disk.
	DISK IO --Avg. Disk sec/Write	the average time, in seconds, of a write of data to the disk.
	DISK IO --Disk Reads/sec	the rate of read operations on the disk.
	DISK IO --Disk Writes/sec	the rate of write operations on the disk.
	NetWork IO--Packets Sent/sec	the rate at which packets are sent on the network interface.
	NetWork IO--Packets received/sec	the rate at which packets are received on the network interface.
	TCP--TCP Connections EST	the number of TCP connections for which the current state is either ESTABLISHED or CLOSE-WAIT
	Total Process cnt	Total number of processes on the server
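Two of the OS counters above need a little arithmetic before they can be compared against a threshold. A minimal sketch follows; the method names, the constant, and the sample readings are all assumptions for illustration, and the threshold of 10 is the workload-dependent guideline quoted in the table.

```ruby
# Processor Queue Length is one system-wide value, so divide it by the
# number of processors before comparing it with a per-processor guideline.
def queue_length_per_processor(queue_length, processor_count)
  queue_length.to_f / processor_count
end

# "% MEM in use" as defined in the table: Committed Bytes / Commit Limit.
def memory_in_use_percent(committed_bytes, commit_limit)
  100.0 * committed_bytes / commit_limit
end

MAX_QUEUE_PER_PROCESSOR = 10 # guideline from the table, workload dependent

# Hypothetical readings: 24 queued threads on 4 processors, 6 GB of 8 GB committed.
per_cpu = queue_length_per_processor(24, 4)
puts "queue per processor: #{per_cpu} (ok: #{per_cpu < MAX_QUEUE_PER_PROCESSOR})"
puts "memory in use: #{memory_in_use_percent(6 * 1024**3, 8 * 1024**3)}%"
```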

Counters for Specific Process Instance		Explanation
Apache Http Server -- Apache Process
	Process Uptime	elapsed time since the Apache process last started
	Process CPU%	the percentage of elapsed time that the processor spends executing the Apache process.
	Process Memory in use	memory used by the Apache process
	Busy workers cnt	the number of worker threads currently serving requests
	Idle workers cnt	the number of worker threads currently waiting for requests
	requests/sec	throughput of the Apache Http server in requests per second
	KB/sec	throughput of the Apache Http server in kilobytes per second

Application Server -- Java Process
	Process Uptime	elapsed time since the Java process last started
	Process CPU%	the percentage of elapsed time that the processor spends executing the Java process.
	Private Bytes	the current size, in bytes, of memory that this process has allocated that cannot be shared with other processes.
	Working Set	Working Set is the current size, in bytes, of the Working Set of this process. The Working Set is the set of memory pages touched recently by the threads in the process
	Used Heap Size	the amount of JVM heap currently in use
	Live Threads cnt	the number of threads currently active in this process
	Accumulate GCTime	the accumulated time spent on GC activities

Async Server--JMS MQ Process
	Process Uptime	elapsed time since the JMS process last started
	Process CPU%	the percentage of elapsed time that the processor spends executing the JMS process.
	Private Bytes	the current size, in bytes, of memory that this process has allocated that cannot be shared with other processes.
	Working Set	Working Set is the current size, in bytes, of the Working Set of this process. The Working Set is the set of memory pages touched recently by the threads in the process
	Message Queue Depth	the current number of messages that are waiting on the queue
	DLQ cnt	the current number of messages in the Dead Letter Queue
	Live Threads cnt	The number of threads currently active in this process
	DB Connections cnt	the current number of open database connections used by this process

DB Server -- SQL Server Process
	Process Uptime	elapsed time since the SQL Server process last started
	Process CPU%	the percentage of elapsed time that the processor spends executing the SQL Server process.
	Process Memory in use	Memory usage by SQL Server process
	Free Space in Temp DB	Tracks free space in tempdb in kilobytes
	User Connections	the current number of user connections to the server
	Buffer Cache Hit Ratio	indicates how often SQL Server finds data in the buffer cache instead of reading it from the hard disk. It should stay above 90%
	Full Scans/Sec	the number of unrestricted full scans during unit time. These can either be base table or full index scans.
	Transactions/Sec	The number of transactions started for the database during unit time
	Average Wait Time	the average amount of wait time (milliseconds) for each lock request that resulted in a wait

The more counters you add and the finer the sampling granularity, the more overhead monitoring puts on your system, so you have to trade off the number of counters against the sample interval. There is no silver bullet, after all...
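To get a feel for that trade-off, the data volume can be estimated before settling on a configuration. This is a back-of-the-envelope sketch: the method name and the figures (50 counters, 15 and 60 second intervals) are assumptions, and the number of samples is only a rough proxy for the real collection overhead.

```ruby
SECONDS_PER_DAY = 86_400

# Number of data points collected per day for a given counter count and
# sample interval in seconds.
def samples_per_day(counter_count, interval_seconds)
  counter_count * (SECONDS_PER_DAY / interval_seconds.to_f)
end

# Hypothetical setup: 50 counters sampled every 15 s versus every 60 s.
[15, 60].each do |interval|
  puts "#{interval}s interval: #{samples_per_day(50, interval).round} samples/day"
end
```

Going from a 60 second to a 15 second interval quadruples the data to collect and store for the same set of counters.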
