Community:HighAvailabilityAndSplunk

From Splunk Wiki

High availability and Splunk

Splunk supports high availability for your indexed data. You can deploy multiple Splunk servers, each indexing its own copy of data forwarded from Splunk forwarders. This means that if your primary Splunk server goes down, you still have one or more other Splunk servers that have the same data in an index.

For information on how to achieve high availability for your indexers and indexed data, see "High availability reference architecture".


Starting with 4.2, you can use search head pooling to share configurations and user data between search heads. With search head pooling and LDAP authentication, you no longer need to rsync the apps, users, and dispatch jobs among search heads, and it becomes easy to place a load balancer between the users' browsers and the search heads. The load balancer should support layer 7 cookie-based session persistence.



In versions before 4.2, Splunk does not support high availability in the user/search experience by default. Users, password changes, dashboards, saved searches, and other Splunk Web objects created by users on the primary Splunk instance do not propagate automatically to a secondary instance. This means that unless you follow the recommendations described later in this topic, after you fail over to a secondary Splunk instance, users may not see what they're used to seeing on the primary instance. For example, if a user creates a saved search using Splunk Web on the primary Splunk server instance, and then your deployment fails over to the secondary instance, that saved search won't be there. The same applies to dashboards, changed user passwords, tags, event types, and other user- or role-specific functionality in Splunk Web. For recommendations on addressing this, read on.

Considerations for Splunk Web objects and user data in high availability

If you are deploying Splunk in a high availability configuration and want to maintain a consistent user experience across instances of Splunk Web when your primary Splunk server goes down, there are two main categories you must consider:

  • user data (new users, roles, passwords, etc.)
  • Splunk Web objects (dashboards, saved searches, etc.)

Splunk highly recommends that you enable LDAP authentication on both your primary and secondary Splunk server instances to manage your user data. This way, password changes, new users, and so on are propagated to both instances automatically.

If you are not using LDAP in your organization, you can use rsync to keep user data files updated on your secondary Splunk instances as they are changed on the primary instance. This is not recommended for user data, but is an option.


The way Splunk Web appears to users is primarily controlled by prefs.conf and savedsearches.conf. You can use rsync to update these files on your secondary Splunk instance when they are changed on the primary instance. Refer to the list below for additional details.

Important: Most changes to Splunk configuration files require that you restart Splunk for the changes to take effect. This means that you must ensure that your secondary Splunk instance is restarted regularly to keep it in sync with the primary.

Working with load balancers

Note: Splunk does not recommend using load balancers between user Web clients and Splunk Web, but if you must do this, some customers have reported success using the following guidelines:

  • Configure your HTTP load balancer to always choose the primary Splunk Web instance unless it is unavailable. This is sometimes referred to as a "sticky" load balancer. This way, users will always see the dashboards and saved searches that they expect to see.
  • Use rsync to keep things like saved searches and dashboards consistent across primary and secondary Splunk server instances. Keep in mind that most config file changes require that you restart Splunk, so you will also have to restart the secondary Splunk instance as needed to keep things in sync.
  • Set up LDAP authentication for your user data. This way, you don't have to worry about rsyncing user data.

Load balancers can also help overcome the Splunk LDAP module's limitation of only being able to bind to one LDAP server, thus creating a single point of failure in authentication. Simply create a VIP on a load balancer that front-ends several LDAP servers and use the VIP as the LDAP server in your Splunk LDAP configuration.

Additionally, most load balancers will provide a point of certificate termination, which simplifies configuration wherein secure LDAP is required. Simply upload the domain root CA certificate to the load balancer and associate it with the given VIP and the load balancer should provide termination such that you will not need to import the root CA cert into Splunk to enable secure LDAP.

Files to rsync

To keep your secondary Splunk server instances in sync with the primary instance, you can use rsync. Configure rsync to copy over the following files:

To keep Splunk Web objects such as dashboards, saved searches, and similar objects synchronized across multiple instances:

$SPLUNK_HOME/etc/apps/*/local/prefs.conf
$SPLUNK_HOME/etc/apps/*/local/savedsearches.conf
$SPLUNK_HOME/etc/apps/*/local/tags.conf
$SPLUNK_HOME/etc/system/local/prefs.conf
$SPLUNK_HOME/etc/system/local/savedsearches.conf
$SPLUNK_HOME/etc/system/local/authorize.conf
$SPLUNK_HOME/etc/system/local/authentication.conf
$SPLUNK_HOME/etc/system/local/eventtypes.conf
$SPLUNK_HOME/etc/system/local/tags.conf
$SPLUNK_HOME/etc/users/*

and any other custom configuration files that Splunk modifies.

Important: The list above uses the $SPLUNK_HOME/etc/apps/ directory structure for Splunk applications and configuration files introduced in Splunk version 3.3. If you are running an earlier version of Splunk, the path is $SPLUNK_HOME/etc/bundles/.
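
One way to copy only the files listed above is to expand the list on the primary and hand it to rsync with --files-from. The following is a minimal sketch, not a tested deployment script. The function names (list_sync_files, sync_listed_files) are hypothetical, and it assumes that $SPLUNK_HOME is the same path on both servers and that rsync can reach the target over SSH.

```shell
#!/bin/sh
# Hypothetical helper: print the files from the list above that exist,
# as paths relative to the given etc directory.
# $1: path to $SPLUNK_HOME/etc on the primary
list_sync_files() {
    (
        cd "$1" || exit 1
        for f in \
            apps/*/local/prefs.conf \
            apps/*/local/savedsearches.conf \
            apps/*/local/tags.conf \
            system/local/prefs.conf \
            system/local/savedsearches.conf \
            system/local/authorize.conf \
            system/local/authentication.conf \
            system/local/eventtypes.conf \
            system/local/tags.conf \
            users/*
        do
            # An unmatched glob stays literal, so keep only paths that exist.
            if [ -e "$f" ]; then
                printf '%s\n' "$f"
            fi
        done
    )
}

# Hypothetical wrapper: send the listed files to the secondary.
# $1: Splunk install directory (assumed identical on both servers)
# $2: target (secondary) host
sync_listed_files() {
    # --files-from=- reads the list from standard input. With --files-from,
    # -a no longer implies -r, so -r is given explicitly so that the
    # per-user directories under users/ are copied recursively.
    list_sync_files "$1/etc" | rsync -avr --files-from=- "$1/etc/" "$2:$1/etc/"
}
```

For example, sync_listed_files /opt/splunk splunk2.example.com would copy the listed files, where both the install path and the host name are placeholders. Because the list names only files under local/ and users/, the default/ and README/ directories are never transferred with this approach.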

Blacklist the following directories (you don't need to copy over the default files or the READMEs):

/default/
/README/
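
If you sync whole directories rather than individual files, rsync's --exclude option can act as the blacklist. The sketch below is illustrative only: the run and sync_apps helpers and the DRY_RUN variable are hypothetical conveniences, and the script assumes the same $SPLUNK_HOME path on both servers. By default it only prints the command it would run.

```shell
#!/bin/sh
# Hypothetical helper: print the command instead of running it
# unless DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Copy the app directories to the secondary, skipping every directory
# named default or README at any depth.
# $1: Splunk install directory (assumed identical on both servers)
# $2: target (secondary) host
sync_apps() {
    # A pattern with only a trailing slash matches directories of that name
    # anywhere in the transfer; a leading slash would anchor the pattern
    # to the top of the transfer instead.
    run rsync -av \
        --exclude='default/' \
        --exclude='README/' \
        "$1/etc/apps/" "$2:$1/etc/apps/"
}
```

Calling sync_apps /opt/splunk splunk2.example.com (placeholder path and host) prints the rsync command for review. Setting DRY_RUN=0 runs it.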

Example rsync usage

This example rsyncs the local copy of the indicated directories to a Splunk instance on the target (secondary) Splunk server.

rsync -avv $SPLUNK_HOME/etc/system/local/ <TARGET_SERVER>:$SPLUNK_HOME/etc/system/local/
rsync -avv $SPLUNK_HOME/etc/apps/ <TARGET_SERVER>:$SPLUNK_HOME/etc/apps/

Important: This example uses the $SPLUNK_HOME/etc/apps/ directory structure for Splunk applications and configuration files introduced in Splunk version 3.3. If you are running an earlier version of Splunk, the path is $SPLUNK_HOME/etc/bundles/.

If you script rsync to run every minute, you should see this kind of output:

system/spec/web.spec is uptodate
system/static/addeventtype.html is uptodate
system/static/addeventtype_done.html is uptodate
system/static/addtail_done.html is uptodate
system/static/atom.xsl is uptodate
system/static/learn.html is uptodate
total: matches=0  hash_hits=0  false_alarms=0 data=0
 
sent 12583 bytes  received 2744 bytes  30654.00 bytes/sec
total size is 2155770  speedup is 140.65

The first time you run rsync, everything is copied over; subsequent runs copy only files that have changed. rsync's delta-transfer algorithm sends only the differences rather than the whole file, unless you specify -W or --whole-file.
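
Because most configuration changes only take effect after a restart, a scheduled sync can also restart the secondary when a configuration file was actually transferred. The following is a rough sketch under several assumptions: the function names are hypothetical, passwordless SSH to the secondary is available, $SPLUNK_HOME is the same on both servers, and a restart of the secondary at arbitrary times is acceptable in your environment.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the itemized rsync output on stdin
# reports at least one transferred file whose name ends in .conf.
# Transferred files appear as lines starting with "<f" (sent) or ">f" (received).
conf_changed() {
    grep -Eq '^[<>]f.*\.conf$'
}

# Sync the app directories and restart the secondary only when needed.
# $1: Splunk install directory (assumed identical on both servers)
# $2: target (secondary) host
sync_and_restart() {
    # -i (--itemize-changes) without -v prints one line per changed item.
    changes=$(rsync -ai "$1/etc/apps/" "$2:$1/etc/apps/") || return 1
    if printf '%s\n' "$changes" | conf_changed; then
        # Assumes passwordless SSH to the secondary.
        ssh "$2" "$1/bin/splunk restart"
    fi
}

# Example crontab entry to run a script containing the call every minute
# (the script path is a placeholder):
#   * * * * * /usr/local/bin/splunk_sync.sh
```

A script scheduled this way would end with a call such as sync_and_restart /opt/splunk splunk2.example.com, with the path and host replaced by your own values.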
