Community:Collecting Firewall Data Summary Indexing
From Splunk Wiki
If you are dealing with large amounts of firewall data, you need to think about setting up summary indexing. As an example, we have seen environments generating billions of firewall log records per month. To scale to that number of events, you need to use summary indexing as described here:
- Set up saved searches that use the summary index feature to collect aggregated information on a periodic basis.
- For statistical accuracy, always collect a larger sample than your target summary will show. For example, if you are reporting on the top 10 sources, collect something like the top 100 sources in the searches that populate your summary index.
- Set up your firewall reports to read from the summary index instead of the main index when generating aggregate reports, such as top source addresses.
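To see why the over-collection rule matters, here is a small Python sketch (toy data and helper names are assumptions, not part of any Splunk API): a source that never makes an hour's top 2 can still be the day's top source, so a summary that keeps too few rows per hour silently reports the wrong winner.

```python
# Simulate hourly summary collection and a daily "top sources" report.
from collections import Counter

# Toy per-hour event counts by source IP (3 hours of firewall traffic).
# 10.0.0.9 is steady at 30/hour but never in any single hour's top 2.
hourly_counts = [
    {"10.0.0.1": 50, "10.0.0.2": 40, "10.0.0.9": 30},
    {"10.0.0.3": 60, "10.0.0.4": 45, "10.0.0.9": 30},
    {"10.0.0.5": 55, "10.0.0.6": 35, "10.0.0.9": 30},
]

def summarize(counts, keep):
    """Keep only the top `keep` sources per hour (the summary-index rows)."""
    return [dict(Counter(c).most_common(keep)) for c in counts]

def daily_top(summaries, n=1):
    """Aggregate the summary rows across hours and return the daily top n."""
    total = Counter()
    for hour in summaries:
        total.update(hour)
    return total.most_common(n)

# Keeping only the hourly top 2 drops 10.0.0.9 entirely from the summary...
print(daily_top(summarize(hourly_counts, keep=2)))  # [('10.0.0.3', 60)] -- wrong
# ...while a larger per-hour sample preserves the true daily winner.
print(daily_top(summarize(hourly_counts, keep=3)))  # [('10.0.0.9', 90)] -- right
```

The same reasoning scales up: collecting the top 100 per hour makes a daily top 10 report robust against sources whose traffic is spread evenly across hours.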
An end-to-end example that reports daily on the top 10 source addresses seen on the firewall looks as follows:
- Write a saved search that finds all firewall messages and counts them by source:

    eventtypetag=firewall | stats count by src_ip | sort count desc | head 100

- Save the search and schedule it to run on an hourly basis.
- When creating the saved search, also enable summary indexing to store the results in the summary index. You do this by appending the following to your search:

    | collect index=summary marker="report=firewall_top100_sources_hourly" addtime=T

- Set up your firewall report to query the summary index and generate your aggregate report:

    index=summary report=firewall_top100_sources_hourly daysago=1 | chart sum(count) by src_ip

- Another example, charting the same counts per hour over the day:

    index=summary report=firewall_top100_sources_hourly daysago=1 | timechart span=1h sum(count) by src_ip
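For readers less familiar with the timechart query, the following Python sketch shows the computation it performs over the summary rows: for each hourly bucket, sum the collected counts per source IP. The row layout and field names (`_time`, `src_ip`, `count`) are assumptions standing in for the summary-index events.

```python
# Minimal sketch of `timechart span=1h sum(count) by src_ip` over summary rows.
from collections import defaultdict

# Assumed shape of events pulled back from the summary index.
summary_rows = [
    {"_time": "2024-01-01T00:00", "src_ip": "10.0.0.1", "count": 50},
    {"_time": "2024-01-01T00:00", "src_ip": "10.0.0.2", "count": 40},
    {"_time": "2024-01-01T01:00", "src_ip": "10.0.0.1", "count": 25},
]

def timechart(rows):
    """Pivot rows into {hour: {src_ip: summed count}}."""
    chart = defaultdict(lambda: defaultdict(int))
    for row in rows:
        chart[row["_time"]][row["src_ip"]] += row["count"]
    return {t: dict(by_ip) for t, by_ip in chart.items()}

print(timechart(summary_rows))
# {'2024-01-01T00:00': {'10.0.0.1': 50, '10.0.0.2': 40},
#  '2024-01-01T01:00': {'10.0.0.1': 25}}
```

Because each summary row already carries a pre-aggregated count, this report touches a few hundred rows per day instead of billions of raw firewall events.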