Community:BreakingSourceCodeExample
From Splunk Wiki
Suppose you want to load all of your source code into Splunk so that you can search it. By default, Splunk breaks events of unknown source types on timestamps, but C++ and Java source files do not contain timestamps. You also don't want each source line processed as a separate event. The configuration below makes Splunk break events whenever a function closes. In Java and C++ this is generally marked by a "}" near the beginning of a line: usually the first character in C++, and within the first few characters in Java.
Step 1) In props.conf, tell Splunk that files ending in .c, .cc, .cpp, .cxx, etc., are C++ files, and that files ending in .java are Java files. Note the four periods "....". The first three are an ellipsis (...) that matches anything to the left. The fourth period is the literal "." that begins the extension. Think of it as "..." + ".cpp".
[source::....(c|cc|cpp|cs|cxx|h|hh|hpp|hxx)]
sourcetype = c++
[source::....(java)]
sourcetype = java
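To see how the four-period pattern reads, here is a rough Python sketch of the matching, not Splunk's actual implementation. It assumes that "..." matches any characters (including path separators), that "*" matches anything except a path separator, and that a single "." is a literal period. The helper names source_pattern_to_regex and classify are made up for this illustration.

```python
import re

# The two [source::...] patterns from the stanzas above, with their sourcetypes.
RULES = [
    ("....(c|cc|cpp|cs|cxx|h|hh|hpp|hxx)", "c++"),
    ("....(java)", "java"),
]

def source_pattern_to_regex(pattern):
    """Translate a source:: pattern into a Python regex (assumed semantics)."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("...", i):
            out.append(".*")          # ellipsis: match anything to the left
            i += 3
        elif pattern[i] == "*":
            out.append("[^/]*")       # star: anything except a path separator
            i += 1
        elif pattern[i] in "(|)":
            out.append(pattern[i])    # grouping and alternation pass through
            i += 1
        else:
            out.append(re.escape(pattern[i]))  # everything else is literal
            i += 1
    return re.compile("".join(out))

def classify(path):
    """Return the sourcetype of the first rule whose pattern matches the whole path."""
    for pattern, sourcetype in RULES:
        if source_pattern_to_regex(pattern).fullmatch(path):
            return sourcetype
    return None

print(classify("/src/main.cpp"))   # c++
print(classify("/src/Main.java"))  # java
print(classify("/src/README.md"))  # None
```

Under these assumptions the fourth period becomes an escaped "\." in the regex, which is why a file such as notes.c.txt is not classified as C++.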
Step 2) Still in props.conf, let's define the behavior of the source types c++ and java.
##############################################################################
####### ADDS REASONABLY GOOD SUPPORT FOR C++ AND JAVA BY BREAKING FUNCTIONS,
####### CLASSES, STRUCTS INTO SEPARATE EVENTS
####### -- david@splunk.com
##############################################################################
[c++]
# -- break at end of functions, classes, and structs
MUST_BREAK_AFTER = ^}
# -- placeholder pattern not expected to match any source line, so it never triggers a break
BREAK_ONLY_BEFORE = gooblygook
# -- allow insanely long functions
MAX_EVENTS=4000
# no timestamps in line, use the file modification time
DATETIME_CONFIG = NONE
CHECK_METHOD = modtime
# don't learn what c++ files look like, we'll simply classify them by extension
LEARN_MODEL = false
[java]
# -- placeholder pattern not expected to match any source line, so it never triggers a break
BREAK_ONLY_BEFORE = gooblygook
# break at end of classes or methods one level deep in class, assuming indented one tab or 4 spaces
MUST_BREAK_AFTER = ^(?:\t| {0,4})}
# -- allow insanely long functions
MAX_EVENTS=4000
# no timestamps in line, use the file modification time
DATETIME_CONFIG = NONE
CHECK_METHOD = modtime
# don't learn what java files look like, we'll simply classify them by extension
LEARN_MODEL = false
The result of the above configuration is that all lines within a function, method, struct, or class are put together as a single event.
You can then search for things like "what are my longest functions?":
sourcetype=java | sort -linecount
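If you want to check the break patterns before indexing anything, the grouping can be approximated outside Splunk. The Python sketch below is only an illustration, not how Splunk merges lines internally. It applies the same MUST_BREAK_AFTER regexes line by line and ignores MAX_EVENTS and timestamp handling. The function names break_events and line_count and the sample class are invented for the example.

```python
import re

# Same regexes as the MUST_BREAK_AFTER settings in the [c++] and [java] stanzas.
BREAK_AFTER = {
    "c++": re.compile(r"^}"),
    "java": re.compile(r"^(?:\t| {0,4})}"),
}

def break_events(text, sourcetype):
    """Group lines into events, closing an event after each line that matches."""
    pattern = BREAK_AFTER[sourcetype]
    events, current = [], []
    for line in text.splitlines():
        current.append(line)
        if pattern.search(line):
            events.append("\n".join(current))
            current = []
    if current:  # trailing lines that never reached a closing brace
        events.append("\n".join(current))
    return events

def line_count(event):
    """Number of lines in an event, comparable to the linecount field."""
    return len(event.split("\n"))

SAMPLE_JAVA = """\
public class Greeter {
    public String hello() {
        return "hello";
    }

    public String bye(boolean loud) {
        if (loud) {
            return "BYE";
        }
        return "bye";
    }
}
"""

events = break_events(SAMPLE_JAVA, "java")
print([line_count(e) for e in events])  # [4, 7, 1]

# Rough equivalent of "sort -linecount": longest event first.
longest = sorted(events, key=line_count, reverse=True)[0]
print(longest)
```

Note that the first event contains the class declaration together with the first method, and the "}" of the nested if block (indented eight spaces) does not cause a break, because the Java pattern allows at most four spaces before the brace.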