From Splunk Wiki
How to anticipate data size evolution
aka. Bucket rotation and Retention
Applies to version: Splunk 4.1.*
In Splunk 4.2, a new notion of volumes allows you to specify space quotas per volume and per folder. See http://docs.splunk.com/Documentation/Splunk/latest/Indexer/HowSplunkstoresindexes
What's an index
First of all, every index has its own locations and settings. The maximum size of your data on a volume is the sum of the sizes of all the indexes stored on that volume. The default index is called "main", but you may have created others, and specific apps may have their own. Splunk also uses its own indexes (_internal, _audit, history, summary...).
What's a bucket
Example of default path decomposition of the main index: $SPLUNK_HOME/var/lib/splunk/db/defaultdb/colddb/db_1288199229_1280423250_10/rawdata/94376288.gz
- defaultdb is the index
- colddb is the location for the buckets in the cold stage (also called the database location)
- db_<epochtimeofthemostrecentevent>_<epochtimeoftheoldestevent>_* is a specific bucket directory (the numbers are the epoch timestamps of the most recent and the oldest event in the bucket, in that order)
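The directory naming above can be decoded with a few lines of Python. This is only an illustrative sketch: the function name `parse_bucket_dir` is made up for this example, and treating the last field as an opaque bucket identifier is an assumption.

```python
import re
from datetime import datetime, timezone

# db_<newest epoch>_<oldest epoch>_<id>; the trailing field is treated
# here as an opaque identifier (an assumption, the article shows it as "*").
BUCKET_RE = re.compile(r"^db_(\d+)_(\d+)_(\w+)$")

def parse_bucket_dir(name):
    """Return (newest, oldest, bucket_id) for a bucket directory name."""
    match = BUCKET_RE.match(name)
    if match is None:
        raise ValueError(f"not a bucket directory name: {name!r}")
    newest = datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
    oldest = datetime.fromtimestamp(int(match.group(2)), tz=timezone.utc)
    return newest, oldest, match.group(3)

# Bucket directory taken from the example path above.
newest, oldest, bucket_id = parse_bucket_dir("db_1288199229_1280423250_10")
print("oldest event:", oldest.isoformat())
print("newest event:", newest.isoformat())
print("time span   :", newest - oldest)
```

For the example bucket, the span between the two timestamps is just under 90 days, which is close to the default `maxHotSpanSecs` of 7776000 seconds.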
A bucket rolls from one stage to another depending on certain conditions: Hot -> Warm -> Cold -> Frozen (-> Thawed)
- From hot to warm: when its size reaches the `maxDataSize` limit, when its time span exceeds `maxHotSpanSecs`, or when you roll the buckets manually with a command.
- From warm to cold: once the number of warm buckets reaches `maxWarmDBCount`, the oldest warm bucket is rolled.
- From cold to frozen: once `maxTotalDataSizeMB` is reached (for hot+warm+cold) or once the bucket is older than `frozenTimePeriodInSecs`. Frozen buckets are deleted, unless you defined a `coldToFrozenScript` to archive them somewhere.
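The rolling conditions listed above can be written down as a small decision model. This is a simplified sketch for reasoning about the settings, not Splunk's actual logic: the function names are hypothetical, manual rolls are ignored, and the exact boundary comparisons (`>=` versus `>`) are assumptions.

```python
# Defaults quoted in this article (Splunk 4.1 indexes.conf).
MAX_DATA_SIZE_MB = 750               # maxDataSize = auto
MAX_HOT_SPAN_SECS = 7776000          # maxHotSpanSecs
MAX_WARM_DB_COUNT = 300              # maxWarmDBCount
MAX_TOTAL_DATA_SIZE_MB = 500000      # maxTotalDataSizeMB
FROZEN_TIME_PERIOD_SECS = 188697600  # frozenTimePeriodInSecs

def hot_rolls_to_warm(size_mb, span_secs):
    """Hot -> warm: the bucket is full or covers too long a time span."""
    return size_mb >= MAX_DATA_SIZE_MB or span_secs > MAX_HOT_SPAN_SECS

def warm_rolls_to_cold(warm_bucket_count):
    """Warm -> cold: too many warm buckets, so the oldest one moves."""
    return warm_bucket_count > MAX_WARM_DB_COUNT

def bucket_is_frozen(index_size_mb, newest_event_age_secs):
    """-> frozen: the index is over its size limit, or every event in the
    bucket (hence its newest event) is older than the retention period."""
    return (index_size_mb > MAX_TOTAL_DATA_SIZE_MB
            or newest_event_age_secs > FROZEN_TIME_PERIOD_SECS)

print(hot_rolls_to_warm(size_mb=750, span_secs=3600))      # full bucket
print(warm_rolls_to_cold(warm_bucket_count=120))           # still room
print(bucket_is_frozen(index_size_mb=510000, newest_event_age_secs=0))
```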
The different stages of an index may all have a specific location; this is how you can spread your data on different volumes.
- `homePath` location for the Hot and Warm buckets
  - Hot (intensive read and write, this is where the indexing occurs)
  - Warm (mostly read, and optimization)
- `coldPath` location for the Cold buckets (moved once, then read, used for searches only)
- `thawedPath` location for Thawed buckets (used only if you want to re-import frozen buckets)
- There is no Frozen location defined in Splunk, because the default action is to delete them.
A recommended setup is to put `homePath` on your local high-speed read+write RAID 1+0 disks, to put `coldPath` on slower disks with a good read speed (RAID 5) or on remote volumes, and, if you defined a `coldToFrozenScript`, to move the frozen buckets to compressed backup tapes. Be aware that Splunk's performance depends on the performance of the storage:
- indexing performance: linked to the write speed on the `homePath` location
- search performance: linked to the read speed on the `homePath` and `coldPath` locations.
For example, a search over a long period that uses old data stored on remote volumes will be slower than a specific search on recent events stored on the local high-speed volumes.
Example with the default settings
Now let's see the size you should reserve for the `main` index. See the default configuration file $SPLUNK_HOME/etc/system/default/indexes.conf
    # global parameters
    maxDataSize = auto                  # 750MB for auto, 10GB for auto_high_volume
    maxWarmDBCount = 300
    maxHotSpanSecs = 7776000            # after this time, a hot bucket will be rolled to warm
    frozenTimePeriodInSecs = 188697600
    maxTotalDataSizeMB = 500000
    coldToFrozenScript =                # no script by default

    [main]
    homePath = $SPLUNK_DB/defaultdb/db
    coldPath = $SPLUNK_DB/defaultdb/colddb
    thawedPath = $SPLUNK_DB/defaultdb/thaweddb
    maxMemMB = 20
    maxConcurrentOptimizes = 6
    maxHotIdleSecs = 86400
    maxHotBuckets = 10
    maxDataSize = auto_high_volume      # auto = 750MB, auto_high_volume = 10GB on 64bit volumes, 1GB on 32bit volumes
Space taken by the whole index
The maximum size of all buckets for an index is `maxTotalDataSizeMB`. This is the size of hot+warm+cold. If you have hot+warm and cold on different volumes, try to maximize the use of the space.
Space taken by the hot+warm buckets
The maximum size for the hot+warm of the main index will be:
- (maxWarmDBCount + maxHotBuckets) * maxDataSize
- (300 + 10) * 750MB = 227GB for auto
- (300 + 10) * 10GB = 3100GB for auto_high_volume
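The same arithmetic can be scripted so that you can plug in your own settings. This is a minimal sketch; the function name `max_hot_warm` is hypothetical, and the result is in whatever unit you use for the bucket size.

```python
def max_hot_warm(max_warm_db_count, max_hot_buckets, max_data_size):
    """Upper bound for hot+warm: every hot and warm bucket is full."""
    return (max_warm_db_count + max_hot_buckets) * max_data_size

# maxDataSize = auto: 750MB per bucket.
auto_mb = max_hot_warm(300, 10, 750)
print(f"auto: {auto_mb} MB, about {auto_mb / 1024:.0f} GB")

# maxDataSize = auto_high_volume: 10GB per bucket.
high_volume_gb = max_hot_warm(300, 10, 10)
print(f"auto_high_volume: {high_volume_gb} GB")
```

Compare the result with the free space on the volume that holds `homePath` before changing `maxWarmDBCount` or `maxDataSize`.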
Space taken by the cold buckets
Therefore the maximum size of the cold buckets will be:
- maxTotalDataSizeMB - "size of the hot+warm buckets"
- 500GB - 227GB = 273GB for auto
- 500GB - 3100GB = -2600GB for auto_high_volume. The result is negative: the index reaches `maxTotalDataSizeMB` before the warm buckets fill up, so you will probably never have any cold buckets!
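A short script makes the sign of the result easy to check for other settings. This is a sketch using the rounded figures from this article; the function name `cold_budget_gb` is hypothetical.

```python
def cold_budget_gb(max_total_gb, hot_warm_gb):
    """Space left for cold buckets once hot+warm are at their maximum.

    A negative value means the total index limit is reached before
    hot+warm can fill up, so nothing is left for cold buckets.
    """
    return max_total_gb - hot_warm_gb

MAX_TOTAL_GB = 500  # maxTotalDataSizeMB = 500000, rounded as in the text

for label, hot_warm_gb in (("auto", 227), ("auto_high_volume", 3100)):
    budget = cold_budget_gb(MAX_TOTAL_GB, hot_warm_gb)
    usable = max(budget, 0)  # a negative budget means no cold space at all
    print(f"{label}: budget {budget} GB, usable cold space {usable} GB")
```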
Space taken by the frozen Buckets
At the same time, buckets in which all events are older than `frozenTimePeriodInSecs` (188697600 seconds, about 6 years) are removed from warm and cold, and are deleted or archived out of Splunk. But with the size limits above, the index will probably hit `maxTotalDataSizeMB` first, so few buckets, if any, will reach the frozen state because of their age.
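Values such as `frozenTimePeriodInSecs` are easier to reason about in days. The helpers below are a small convenience sketch (the function names are hypothetical) that converts in both directions.

```python
SECONDS_PER_DAY = 86400

def secs_to_days(seconds):
    """Convert a setting expressed in seconds to days."""
    return seconds / SECONDS_PER_DAY

def days_to_secs(days):
    """Convert a retention period in days to a value in seconds."""
    return int(days * SECONDS_PER_DAY)

# Default frozenTimePeriodInSecs from this article.
frozen_days = secs_to_days(188697600)
print(f"{frozen_days:.0f} days, about {frozen_days / 365:.1f} years")

# Value to use for a one-year retention policy.
print("one year:", days_to_secs(365), "seconds")
```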
Stop Splunk if the remaining free space is too small
    # defined in server.conf
    minFreeSpace = 2000   # in MB by default
It is a good idea to stop indexing when the free space on the hot+warm volume gets too low. If Splunk is installed on the Windows C: drive, increase this value to at least twice the amount of RAM.
Beware, this applies to the volumes where the indexes are located. If your indexes are outside the Splunk folder, the limit won't apply to $SPLUNK_HOME/var/run/splunk (which may grow large depending on your scheduled searches).
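To see how close a volume is to the threshold, you can compare its free space with `minFreeSpace` yourself. The sketch below only mirrors that comparison with Python's standard `shutil.disk_usage`; it does not show how Splunk measures free space, the function names are hypothetical, and the path is a placeholder you would replace with your index location.

```python
import shutil

def free_mb(path):
    """Free space, in MB, on the volume that contains `path`."""
    return shutil.disk_usage(path).free // (1024 * 1024)

def below_min_free_space(path, min_free_mb=2000):
    """True when the volume has less free space than the threshold."""
    return free_mb(path) < min_free_mb

# "." is a placeholder; point it at the volume holding your homePath.
index_volume = "."
print(f"{free_mb(index_volume)} MB free on the index volume")
if below_min_free_space(index_volume):
    print("below the default minFreeSpace of 2000 MB")
```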
Voila! You should have enough elements to choose and configure your Splunk bucket policy.
Remember to put your own configuration in $SPLUNK_HOME/etc/system/local/ instead of touching the default files, and look in $SPLUNK_HOME/etc/system/README/ for configuration examples and explanations.
What can go wrong
Here are some examples of what can happen if you don't have the appropriate settings for your retention policy.
- Old events are deleted before they reach your `frozenTimePeriodInSecs` setting.
If `maxTotalDataSizeMB` is too small for your retention time policy, your old data will be frozen early to free up space. Estimate the space you need per index from the volume you collect every day. Remember: your retention policy relies on `frozenTimePeriodInSecs`, on `maxTotalDataSizeMB`, and on any volume limits.
- The cold location is set up on a second partition, but no buckets have been rolled to cold, and the hot+warm partition is full, which stops Splunk.
If the maximum number of hot and warm buckets or the bucket size is too large for your partition, your buckets may never roll to the cold location. They then fill up the hot+warm location and stop Splunk.
- With a very short retention period of one day, some old events are still searchable.
A bucket is frozen only once all of its events are older than the retention period. Depending on the maximum size of your buckets, a bucket that is not yet full may hold events that span more than the retention period, so its oldest events stay searchable. You can use `maxHotSpanSecs` to force each bucket to hold at most 25 hours of data.
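For the first case above, a rough estimate shows whether the size limit or the time limit will freeze your data first. This is a back-of-the-envelope sketch: the function names are hypothetical, and `daily_index_mb` (how much the index grows on disk per day) is a figure you have to measure on your own system.

```python
SECONDS_PER_DAY = 86400

def effective_retention_days(max_total_data_size_mb, daily_index_mb,
                             frozen_time_period_secs):
    """Days of data kept: whichever limit is reached first wins."""
    size_limited_days = max_total_data_size_mb / daily_index_mb
    time_limited_days = frozen_time_period_secs / SECONDS_PER_DAY
    return min(size_limited_days, time_limited_days)

def required_size_mb(daily_index_mb, retention_days):
    """maxTotalDataSizeMB needed to keep `retention_days` of data."""
    return daily_index_mb * retention_days

# Default limits from this article, with an assumed growth of 1000 MB/day.
days = effective_retention_days(500000, 1000, 188697600)
print(f"data is kept for about {days:.0f} days")

# Size needed to really keep one year of data at that rate.
print(f"one year needs {required_size_mb(1000, 365)} MB")
```

With these assumed numbers the size limit is reached after about 500 days, long before the default time limit of roughly 6 years.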