Community:Monitoring JVMs

From Splunk Wiki


HOWTO: Index your JVM garbage collection data

Most of us have been in situations where we have to debug performance problems in Java-based applications with little more information than thread dumps and the garbage collection logs. With the many tunable parameters that the newer JVMs offer, it's easy to get the garbage collection configuration wrong for your environment. Fortunately, the latest JVMs offer ergonomics, where less is more: you specify a few parameters for tuning and let the JVM figure out the rest.

If you invoke the Sun JVM with the -Xloggc:logfile parameter or the IBM JVM with the -Xverbosegclog parameter, the Garbage Collector (GC) will faithfully write out what it's doing to the named logfile. Each line in the log file corresponds to a GC operation. There are two kinds of lines, one for partial GC and one for full GC.

(the text above was written by Sujit Pal)

Changing your JVM startup options

Add the following arguments to your JVM's startup parameters so that the GC data is logged with timestamps. Change the logging path and filename to suit your needs.

Sun JVM:
-Xloggc:C:\MyJVM\jvm.log -verbose:gc -XX:+PrintGCDateStamps

IBM JVM:
-Xverbosegclog:C:\MyJVM\jvm.log

Configuring Splunk

Make sure you create a file input to capture the JVM log file mentioned above. You also need to set its sourcetype to "sun_jvm" or "ibm_jvm", depending on which JVM produced the log.
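
As a rough sketch, the file input for the Sun JVM log could be defined in inputs.conf as shown here. It assumes the example log path from the startup options above; adjust the path, and use "ibm_jvm" as the sourcetype for an IBM JVM log.

```
# inputs.conf (sketch; the path is the example path used earlier on this page)
[monitor://C:\MyJVM\jvm.log]
sourcetype = sun_jvm
disabled = false
```

You can also create the same input through the Splunk UI; what matters is that the sourcetype matches the stanza names used in props.conf.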

You then need to change your props.conf and transforms.conf to contain the following lines:

Sun JVM

props.conf
[sun_jvm]
AUTO_LINEMERGE=FALSE
SHOULD_LINEMERGE=TRUE
DATETIME_CONFIG=CURRENT
BREAK_ONLY_BEFORE=\d+\.\d+:
REPORT-jvm = sun_jvm_gc

transforms.conf
[sun_jvm_gc]
REGEX = \[(Full\s)?GC\s(?<JVM_HeapUsedBeforeGC>\d+)K->(?<JVM_HeapUsedAfterGC>\d+)K\((?<JVM_HeapSize>\d+)K\),\s(?<JVM_GCTimeTaken>\d+\.\d+)\ssecs\]

IBM JVM

props.conf
[ibm_jvm]
SHOULD_LINEMERGE=TRUE
BREAK_ONLY_BEFORE=<af\s

NOTE: You may need to adjust the regular expression if your data looks a bit different.

Then either restart Splunk or have it reload the new configuration by running this search: * | head 1 | kv reload=t

Visualizing the data

Now that Splunk is correctly indexing your JVM logs, you can create dashboards that show your GC details.

The following dashboard is created using these two searches:
sourcetype=sun_jvm | timechart avg(JVM_GCTimeTaken)
sourcetype=sun_jvm | timechart avg(JVM_HeapSize) avg(JVM_HeapUsedAfterGC) avg(JVM_HeapUsedBeforeGC)

If you're using IBM's JVM, try:
sourcetype=ibm_jvm | timechart avg(totalms)

[Screenshot: Jvm.png, the dashboard built from the Sun JVM searches above]

Handling large data volumes

If your JVM is under heavy load or is just chatty, you could end up with a large log file, which in turn could slow down the searches behind your dashboard. In that case, it's recommended that you use summary indexing.
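
One possible way to set this up is a scheduled saved search that writes hourly aggregates to a summary index. The stanza below is only a sketch: the stanza name, schedule, and time window are assumptions, and the setting names are those of savedsearches.conf in later Splunk releases, so check them against the version you run.

```
# savedsearches.conf (sketch; stanza name and schedule are hypothetical)
[Summary - JVM GC hourly]
search = sourcetype=sun_jvm | sitimechart span=1h avg(JVM_GCTimeTaken)
enableSched = 1
# run at five past every hour, over the previous whole hour
cron_schedule = 5 * * * *
dispatch.earliest_time = -1h@h
dispatch.latest_time = @h
# write the results to the default summary index
action.summary_index = 1
action.summary_index._name = summary
```

The dashboard search would then read from the summary index instead of the raw events, for example: index=summary search_name="Summary - JVM GC hourly" | timechart span=1h avg(JVM_GCTimeTaken)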

Food for thought

Now that you're indexing your JVM data, you can also leverage all the other cool stuff Splunk provides. You could create an alert that emails you when a JVM is misbehaving, or even automate the way you deal with the JVM by doing things like automatic restarts. The uses are endless! :)
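
As an illustration of the alerting idea, a scheduled search along these lines could send an email when a single collection takes too long. Everything specific here is an assumption: the stanza name, the one-second threshold, the five-minute schedule, and the placeholder address. The setting names follow savedsearches.conf in later Splunk releases and may differ in yours.

```
# savedsearches.conf (sketch; name, threshold, schedule and address are hypothetical)
[Alert - JVM long GC pause]
search = sourcetype=sun_jvm JVM_GCTimeTaken>1
enableSched = 1
cron_schedule = */5 * * * *
dispatch.earliest_time = -5m
dispatch.latest_time = now
# trigger when the search returns at least one event
counttype = number of events
relation = greater than
quantity = 0
action.email = 1
action.email.to = ops@example.com
```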

Feedback?

I'd love to hear from you. Email me at simon at splunk dot com.
