Community:BestPracticesForBackingUp

Update for 6.x

The safest way to ensure a bucket roll is to restart Splunk. Per the comment below, this is only applicable to low-volume environments where buckets roll too infrequently on their own to meet your tolerance for data loss. For added protection against data loss, consider Splunk's built-in Index Replication.

Update for 4.0

The command to roll a bucket in Splunk 4.0.3 and later is:

| debug cmd=roll index=index_name
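
For example, a full CLI invocation to roll the main index might look like this (a sketch; substitute your own index name and admin credentials):

./splunk search '| debug cmd=roll index=main' -auth admin:changeme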

The sections following this one pertain to Splunk 3.x. In 4.0, because of architectural changes that allow multiple hot buckets, you can't force a bucket roll, so it has become harder to back up the data residing in hot buckets. If you have enough data that Splunk rolls often (more than 10GB a day), then just back up warm.

Optimally, you should snapshot hot and back up from the snapshot. Solaris ZFS and Windows VSS provide filesystem snapshots that are appropriate for this (and on Windows/NTFS, a decent backup utility will use VSS automatically, or at least easily). Many backup utilities don't, though: the most common Linux filesystems (ext, reiser, jfs) don't do snapshots, but Linux's Logical Volume Manager (LVM) can snapshot an entire volume. High-end storage systems such as EMC and Hitachi arrays can also take snapshots at the storage level. If you don't have this capability and have low data volume, then yes, you have some data you cannot back up. We are working on a solution for this.
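
As a sketch of the LVM approach, assuming the Splunk datastore lives on a hypothetical logical volume /dev/vg0/splunk with some free space left in the volume group for the snapshot's copy-on-write blocks:

# Take a snapshot of the volume holding the Splunk datastore
lvcreate --snapshot --size 2G --name splunk-snap /dev/vg0/splunk

# Mount the snapshot read-only and back up the index directories from the frozen view
mkdir -p /mnt/splunk-snap
mount -o ro /dev/vg0/splunk-snap /mnt/splunk-snap
rsync -a /mnt/splunk-snap/opt/splunk/var/lib/splunk/ /backup/splunk/

# Discard the snapshot once the backup completes
umount /mnt/splunk-snap
lvremove -f /dev/vg0/splunk-snap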

Best practices to back up your Splunk index

This topic discusses some considerations for planning your Splunk index backup strategy. It first gives an overview of how your indexed data moves through Splunk, then makes recommendations for planning your backup strategy based on this information.

For specific details on changing the default values mentioned in this topic, refer to this topic in the Admin Manual.

How data moves through Splunk

When Splunk is indexing, the data moves through a series of stages based on policies that you define. At a high level, the default behavior is as follows:

When data is first indexed, it is put into the "hot" database, also known as the "hot db".

The data remains in the hot db until the policy conditions are met for it to be reclassified as "warm" data. This is called "rolling" the data into the warm db. By default, this happens when the hot db reaches a specified size, but you can also force it to happen on a schedule, either with a saved search or, better still, with a script that invokes the Splunk CLI. Some details on doing this are given later in this topic.

When the hot db is rolled, its directory is renamed to be a "bucket" in the warm db, and a new hot db is created immediately to receive the new data being indexed. At this point, it is safe to back up the warm db buckets.

Next, when the warm db reaches a specified number of buckets (300 by default), the oldest warm buckets are renamed to cold buckets to keep the warm count at that limit. (If your cold db is located on another filesystem, the warm buckets are copied to it and then deleted from the warm db directory.) Be aware that the more warm buckets you have, the more places Splunk has to look to execute searches, so adjust this setting accordingly.

Finally, when your data meets the policy requirements you have defined, it is "frozen". The default frozen policy is to delete the data. If you need to keep data indefinitely, you must change this setting.

Summary:

  • hot db: currently being written to and changing non-incrementally; don't back this up, back up the warm db instead
  • warm db: being added to incrementally, can be safely backed up, made up of multiple warm 'buckets'
  • cold db: when you reach the warm bucket limit set by your policy (300 by default), the oldest warm buckets are either renamed to cold (as with hot to warm) or copied to the cold location (if it's on another filesystem) and deleted from the warm directory
  • frozen: default policy is to delete.
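
These stages map to settings in indexes.conf. Here is a minimal sketch for one index; the setting names are real, but the values shown are illustrative, not recommendations:

[main]
homePath   = $SPLUNK_DB/defaultdb/db           # hot and warm buckets live here
coldPath   = $SPLUNK_DB/defaultdb/colddb       # cold buckets
thawedPath = $SPLUNK_DB/defaultdb/thaweddb     # restored (thawed) frozen buckets
maxDataSize = 1000                    # MB; the hot db rolls to warm at this size
maxWarmDBCount = 300                  # warm buckets to keep before rolling to cold
frozenTimePeriodInSecs = 15552000     # freeze data older than ~180 days
coldToFrozenDir = /archive/splunk/main   # archive frozen buckets here instead of
                                         # deleting them (older versions use
                                         # coldToFrozenScript instead)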

Choose your backup strategy

The general recommendation is to schedule backups of your warm db buckets regularly using the incremental backup utility of your choice.

Splunk rolls the hot db to the warm db based on the policy you define. By default, this policy rolls the data when your hot db reaches a certain size. If your indexing volume is fairly low, this means that your hot db will be rolled to warm very infrequently, and a disk failure that takes out your hot db could therefore cost you a lot of un-backed-up data.

If you're concerned about losing data in this fashion, you can configure Splunk to force a roll from hot to warm on whatever schedule you're comfortable with, and then schedule your backup utility to back up the warm db immediately afterward. Note, however, that if you roll too frequently, you might see degraded search speed and increased disk usage: every roll from hot to warm creates a new 'bucket', and searches have to look in more buckets to see all the data. As a result, Splunk recommends that you roll no more frequently than once a day. Tune this to suit your particular data retention, search performance, and backup needs.
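
As a sketch, a nightly cron job on the indexer could force the roll and then kick off the warm backup shortly afterward (the paths, schedule, index name, user, and credentials here are all assumptions; adjust for your environment):

# /etc/cron.d/splunk-backup: roll hot to warm at 01:00, back up warm buckets at 01:15
0 1 * * *   splunk  /opt/splunk/bin/splunk search '| debug cmd=roll index=main' -auth admin:changeme
15 1 * * *  splunk  rsync -a /opt/splunk/var/lib/splunk/defaultdb/db/db_* /backup/splunk/defaultdb/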

If your environment requires that you back up more than once a day, you can deploy Splunk in an HA configuration where forwarders are configured to send all your data to two different Splunk indexers, and use the second one as your hot backup.
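
On the forwarder side, listing two target groups in outputs.conf makes the forwarder clone all data to both indexers. A minimal sketch, with hypothetical hostnames:

[tcpout]
defaultGroup = primary_indexers, backup_indexers

[tcpout:primary_indexers]
server = indexer1.example.com:9997

[tcpout:backup_indexers]
server = indexer2.example.com:9997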

Rolling from the CLI

You can use the following syntax to force a roll of the hot db to warm:

./splunk search '| oldsearch !++cmd++::roll' -auth splunk

This will roll the default index, which is typically main.

You can specify an index to be rolled like this:

./splunk search ' | oldsearch index=_internal !++cmd++::roll' -auth admin:changeme

You'll always see an error about "Search Execute failed because Hot db rolled out to warm" right afterwards; you can safely ignore it. You'll also need to provide the admin password to execute this CLI command.

If you want to roll more than one index, you have to roll each one separately. To list your indexes, use:

./splunk list index
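
A simple shell loop can handle several indexes in turn (a sketch; the index names and credentials are assumptions, and it should be run from a script, since the '!' would trigger history expansion in an interactive shell):

for idx in main _internal sales; do
    ./splunk search "| oldsearch index=$idx !++cmd++::roll" -auth admin:changeme
done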

Recommendations for recovery

If you experience a non-catastrophic disk failure (for example, you still have some of your data, but Splunk won't run), Splunk recommends that you move the index directory aside and restore from backup rather than restoring on top of a partially corrupted datastore. Splunk automatically creates the db-hot directories on startup and resumes indexing. Monitored files and directories will pick up where they were at the time of the backup.
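
A sketch of that recovery sequence for the default index, assuming a standard $SPLUNK_HOME layout and a backup destination of /backup/splunk (a hypothetical path):

./splunk stop
# Move the damaged datastore aside rather than restoring over it
mv $SPLUNK_HOME/var/lib/splunk/defaultdb $SPLUNK_HOME/var/lib/splunk/defaultdb.corrupt
# Restore the backed-up buckets into a fresh directory
mkdir -p $SPLUNK_HOME/var/lib/splunk/defaultdb
rsync -a /backup/splunk/defaultdb/ $SPLUNK_HOME/var/lib/splunk/defaultdb/
./splunk start     # Splunk recreates the hot db and resumes indexing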
