Deploy:BucketRotationAndRetention

From Splunk Wiki


How to anticipate data size evolution

aka. Bucket rotation and Retention

Applies to version: Splunk 4.1.*

In Splunk 4.2, the new notion of volumes allows you to specify space quotas per volume and per folder. See http://docs.splunk.com/Documentation/Splunk/latest/Indexer/HowSplunkstoresindexes


What's an index

First of all, every index has its own locations and settings. The maximum size of your data on a volume is therefore the sum of the maximum sizes of all the indexes stored on that volume. The default index is called "main", but you may have created others, and specific apps may have their own. Splunk also uses its own indexes (_internal, _audit, history, summary...).

What's a bucket

Secondly, a bucket is a unit of indexed data. Physically, it is a directory containing the events from a certain period. Several buckets may be in the same stage at the same time. See details here: [[1]]

Example of default path decomposition of the main index: $SPLUNK_HOME/var/lib/splunk/db/defaultdb/colddb/db_1288199229_1280423250_10/rawdata/94376288.gz

  • defaultdb is the name of the index
  • colddb is the location for the buckets in the cold stage (also called the database location)
  • db_<epochtimeofthemostrecentevent>_<epochtimeoftheoldestevent>_* is a specific bucket directory (the first two numbers are the epoch timestamps of the most recent and the oldest event in the bucket, in that order)
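
As a small illustration of the naming scheme above, here is a sketch that splits a bucket directory name into its two timestamps. The helper name `parse_bucket_dir` is hypothetical, and the pattern assumes the trailing part is a plain number, as it is in the example path; this is not a Splunk tool.

```python
import re
from datetime import datetime, timezone

# Assumed layout: db_<newest epoch>_<oldest epoch>_<number>, as in the example path.
BUCKET_RE = re.compile(r"^db_(\d+)_(\d+)_(\d+)$")


def parse_bucket_dir(name):
    """Return the newest and oldest event times (UTC) and the trailing number."""
    match = BUCKET_RE.match(name)
    if match is None:
        raise ValueError(f"not a bucket directory name: {name!r}")
    newest, oldest, suffix = (int(group) for group in match.groups())
    return {
        "newest": datetime.fromtimestamp(newest, tz=timezone.utc),
        "oldest": datetime.fromtimestamp(oldest, tz=timezone.utc),
        "suffix": suffix,
    }


info = parse_bucket_dir("db_1288199229_1280423250_10")
print("oldest event:", info["oldest"].isoformat())
print("newest event:", info["newest"].isoformat())
```

Converting the timestamps this way is a quick check of which period a given bucket directory covers.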

Bucket stages

A bucket rolls from one stage to the next when certain conditions are met: Hot -> Warm -> Cold -> Frozen (-> Thawed)

  • From hot to warm, if its size reaches the `maxDataSize` limit, if the time span it covers exceeds `maxHotSpanSecs`, or if you roll the buckets with a manual command.
  • From warm to cold, once the number of warm buckets reaches `maxWarmDBCount`; the oldest warm bucket is rolled first.
  • From cold to frozen, once `maxTotalDataSizeMB` is reached (for hot+warm+cold) or once all events in the bucket are older than `frozenTimePeriodInSecs`. Frozen buckets are deleted, unless you defined a `coldToFrozenScript` to archive them somewhere.
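
The three conditions above can be summarized as a deliberately simplified model. The function names are hypothetical and the sketch ignores manual rolls and other settings; it only restates the rules in the list, not Splunk's actual implementation.

```python
def should_roll_to_warm(size_mb, span_secs, max_data_size_mb, max_hot_span_secs):
    """Hot -> warm: the bucket is full, or it covers too long a period."""
    return size_mb >= max_data_size_mb or span_secs >= max_hot_span_secs


def warm_buckets_to_roll(warm_count, max_warm_db_count):
    """Warm -> cold: how many of the oldest warm buckets exceed the allowed count."""
    return max(0, warm_count - max_warm_db_count)


def should_freeze(newest_event_epoch, now_epoch, frozen_secs,
                  index_size_mb, max_total_data_size_mb):
    """-> frozen: every event is past retention, or the index is over its size cap.

    In the size case this would apply to the oldest bucket of the index.
    """
    too_old = now_epoch - newest_event_epoch > frozen_secs
    too_big = index_size_mb > max_total_data_size_mb
    return too_old or too_big


# A 750 MB hot bucket under maxDataSize = auto (750 MB) is due to roll.
print(should_roll_to_warm(750, 3600, 750, 7776000))
# 302 warm buckets with maxWarmDBCount = 300: the two oldest roll to cold.
print(warm_buckets_to_roll(302, 300))
```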

Bucket location

The different stages of an index may all have a specific location; this is how you can spread your data on different volumes.

  • `homePath` location for the Hot and Warm buckets
    • Hot (intensive read and write, this is where the indexing occurs)
    • Warm (mostly read, and optimization)
  • `coldPath` location for the Cold buckets (moved once, then read, used for searches only)
  • `thawedPath` location for Thawed buckets (used only if you want to re-import frozen buckets)
  • There is no Frozen location defined in Splunk, because the default action is to delete them.

Recommendations

A recommended setup is to define `homePath` on your local, high-speed read+write RAID 1+0 disks, to define `coldPath` on slower disks with a good read speed (RAID 5) or on remote volumes, and, if you defined a `coldToFrozenScript`, to move the frozen buckets to compressed backup tapes. Be aware that the performance of Splunk depends on the performance of the storage:

  • indexing performance: linked to the write speed on the `homePath` location
  • search performance: linked to the read speed on the `homePath` and `coldPath` locations.

For example, a search over a long period that reads old data stored on remote volumes will be slower than a narrow search on recent events stored on the local high-speed volumes.

Example with the default settings

Now let's see the size you should reserve for the `main` index. See the default configuration file $SPLUNK_HOME/etc/system/default/indexes.conf

# global parameters
# 750MB for auto, 10GB for auto_high_volume
maxDataSize = auto
maxWarmDBCount = 300
# after this time, a hot bucket will be rolled to warm
maxHotSpanSecs = 7776000
frozenTimePeriodInSecs = 188697600
maxTotalDataSizeMB = 500000
# no script by default
coldToFrozenScript = 

[main]
homePath   = $SPLUNK_DB/defaultdb/db
coldPath   = $SPLUNK_DB/defaultdb/colddb
thawedPath = $SPLUNK_DB/defaultdb/thaweddb
maxMemMB = 20
maxConcurrentOptimizes = 6
maxHotIdleSecs = 86400
maxHotBuckets = 10
# auto = 750MB, auto_high_volume = 10GB on 64-bit systems, 1GB on 32-bit systems
maxDataSize = auto_high_volume 

Space taken by the whole index

The maximum size of all buckets for an index is `maxTotalDataSizeMB`. This limit covers hot+warm+cold together. If hot+warm and cold are on different volumes, tune the settings so that the space on both volumes is actually used.

Space taken by the hot+warm buckets

In the worst case, with only full-size buckets, the maximum size of the hot+warm buckets of the main index will be:

  • (maxWarmDBCount + maxHotBuckets) * maxDataSize
  • (300 + 10) * 750MB = 232,500MB (about 227GB) for auto
  • (300 + 10) * 10GB = 3100GB for auto_high_volume
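
The same worst-case arithmetic as a short sketch. The function name is hypothetical, and sizes are converted with 1GB = 1024MB, which is an assumption about how the figures above were rounded.

```python
MB_PER_GB = 1024  # assumption: binary gigabytes


def hot_warm_worst_case_mb(max_warm_db_count, max_hot_buckets, max_data_size_mb):
    """Worst case: every hot and warm bucket has reached maxDataSize."""
    return (max_warm_db_count + max_hot_buckets) * max_data_size_mb


auto_mb = hot_warm_worst_case_mb(300, 10, 750)
high_volume_mb = hot_warm_worst_case_mb(300, 10, 10 * MB_PER_GB)

print(f"auto:             {auto_mb} MB (~{auto_mb / MB_PER_GB:.0f} GB)")
print(f"auto_high_volume: {high_volume_mb} MB (~{high_volume_mb / MB_PER_GB:.0f} GB)")
```

Plugging in your own `maxWarmDBCount`, `maxHotBuckets` and `maxDataSize` gives the space to reserve on the `homePath` volume.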

Space taken by the cold buckets

Therefore with only large buckets, the maximum size of the cold buckets will be:

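  • maxTotalDataSizeMB - "size of the hot+warm buckets"
  • 500,000MB - 232,500MB (the 227GB above) = 267,500MB (about 261GB) for auto
  • 500GB - 3100GB = -2600GB for auto_high_volume: the result is negative because `maxTotalDataSizeMB` is reached long before `maxWarmDBCount`, so you will probably never have any cold buckets!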
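
A sketch of the subtraction above, with a hypothetical function name. Working in MB throughout avoids rounding differences between 1000-based and 1024-based gigabytes; a negative result means the size limit is hit before any bucket can roll to cold.

```python
def cold_budget_mb(max_total_data_size_mb, hot_warm_mb):
    """Space left for cold buckets once hot+warm is at its worst-case size."""
    return max_total_data_size_mb - hot_warm_mb


MAX_TOTAL_MB = 500000  # maxTotalDataSizeMB from the default configuration

for label, hot_warm_mb in (("auto", 310 * 750), ("auto_high_volume", 310 * 10 * 1024)):
    budget = cold_budget_mb(MAX_TOTAL_MB, hot_warm_mb)
    if budget <= 0:
        print(f"{label}: {budget} MB, no room for cold buckets")
    else:
        print(f"{label}: {budget} MB available for cold buckets")
```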

Space taken by the frozen Buckets

At the same time, buckets in which all events are older than `frozenTimePeriodInSecs` (about 6 years by default) are removed from warm and cold, and are either deleted or archived out of Splunk. With the previous settings, however, the size limit will probably be reached first, so it is unlikely that any bucket will be frozen because of its age.
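
Since these settings are expressed in seconds, a quick conversion helps when reading or writing them. A minimal sketch, using the default values from the configuration file and assuming 365-day years:

```python
SECONDS_PER_DAY = 86400


def secs_to_days(seconds):
    """Convert a Splunk time setting in seconds to days."""
    return seconds / SECONDS_PER_DAY


frozen_days = secs_to_days(188697600)   # frozenTimePeriodInSecs
hot_span_days = secs_to_days(7776000)   # maxHotSpanSecs

print(f"frozenTimePeriodInSecs: {frozen_days:.0f} days (~{frozen_days / 365:.1f} years)")
print(f"maxHotSpanSecs:         {hot_span_days:.0f} days")
```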

Stop Splunk if the remaining free space is too small

 
# defined in  server.conf
# in MB by default
minFreeSpace = 2000

It is a good idea to have Splunk stop indexing when the free space on the hot+warm volume gets too low. If Splunk is installed on the Windows C: drive, consider increasing this value to at least twice the amount of RAM.

Beware: this applies to the volumes where the indexes are located. If your indexes are outside the Splunk folder, the limit won't apply to $SPLUNK_HOME/var/run/splunk (which may grow depending on your scheduled searches).
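
If you want to watch the same threshold from outside Splunk, for example on a volume the setting does not cover, a monitoring sketch could look like the following. The function name is hypothetical and this is not how Splunk performs its own check; it only compares the free space reported by the operating system with a `minFreeSpace`-style value in MB.

```python
import shutil


def free_space_ok(path, min_free_space_mb=2000):
    """Return (ok, free_mb) for the volume that holds `path`."""
    free_mb = shutil.disk_usage(path).free / (1024 * 1024)
    return free_mb >= min_free_space_mb, free_mb


# "." stands in for an index location such as the homePath directory.
ok, free_mb = free_space_ok(".")
print(f"free: {free_mb:.0f} MB, above threshold: {ok}")
```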


Voila! You should have enough elements to choose and configure your Splunk bucket policy.

Remember to put your own configuration in $SPLUNK_HOME/etc/system/local/ instead of touching the default files, and look into $SPLUNK_HOME/etc/system/README/ for configuration examples and explanations.

What can go wrong

Here are some examples of what can happen if you don't have the appropriate settings for your retention policy.

  • Old events are deleted before they reach your frozenTimePeriodInSecs setting

If maxTotalDataSizeMB is too small for your retention period, old data will be frozen early to free up space. Estimate the space you need per index from the volume you collect every day. Remember: your retention policy relies on frozenTimePeriodInSecs and maxTotalDataSizeMB, and any volume limits.
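
A rough sizing sketch for that estimate. Both function names and the 20% headroom are assumptions for illustration, and `daily_disk_mb` is meant to be the space your data actually takes on disk per day, which you have to measure yourself.

```python
def required_index_size_mb(daily_disk_mb, retention_days, headroom_pct=20):
    """Disk space needed to keep `retention_days` of data, plus some headroom."""
    base = daily_disk_mb * retention_days
    return base + (base * headroom_pct + 99) // 100  # headroom rounded up


def days_until_size_limit(max_total_data_size_mb, daily_disk_mb):
    """How many days of data fit before maxTotalDataSizeMB forces freezing."""
    return max_total_data_size_mb // daily_disk_mb


# Example: 1000 MB on disk per day, 90 days of retention wanted.
print(required_index_size_mb(1000, 90))
# With the default 500000 MB limit, the same index holds about 500 days,
# far less than the default frozenTimePeriodInSecs of 2184 days.
print(days_until_size_limit(500000, 1000))
```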

  • The cold bucket is set up to be on a second partition, but no buckets have been rolled to cold, and the hot+warm partition is full, stopping Splunk.

Bad settings for the maximum number of hot and warm buckets, or a bad bucket size (too many hot+warm buckets for your partition), may prevent your buckets from ever rolling to the cold location. The hot+warm location then fills up and Splunk stops.

  • With a very short retention period of one day, some old events are still searchable.

A bucket is only frozen once all of its events are past the retention period. Depending on the maximum size of your buckets, a bucket that is slow to fill may therefore hold events that span more than the retention period. You can use maxHotSpanSecs to force each bucket to hold at most 25 hours of data.
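
To see why old events can outlive the retention period, here is a small sketch with a hypothetical helper. It assumes, as described above, that a bucket is frozen only when its most recent event is past the retention period.

```python
HOUR = 3600
DAY = 24 * HOUR


def oldest_event_age_at_freeze(bucket_span_secs, frozen_secs):
    """Age of the oldest event in a bucket at the moment the bucket is frozen."""
    return bucket_span_secs + frozen_secs


# One-day retention, but the bucket took three days to fill:
# its oldest events stay searchable for about four days.
print(oldest_event_age_at_freeze(3 * DAY, DAY) / DAY)

# Same retention with buckets capped at 25 hours of data:
# the oldest events are at most 49 hours old when the bucket is frozen.
print(oldest_event_age_at_freeze(25 * HOUR, DAY) / HOUR)
```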
