Community:Deploy:How To Set Up Search Head Pooling and Shared Bundle
From Splunk Wiki
!!! Search Head Pooling is deprecated in v6.2 !!!
Search Head Pooling and Shared Bundles are great features for Splunk distributed search deployments. If you plan to use Search Head Pooling, you should use the Shared Bundles feature, too.
However, the setup can be confusing. Some people mistakenly think both features can use the same shared directory.
They are different features, and their shared directories are slightly different.
- Search Head Pooling lets search heads share configurations and search jobs. The shared directories are "etc/apps", "etc/users", "var/run/splunk/dispatch", and some others.
- Shared Bundle lets a search head and an indexer share a configuration bundle. The shared directories are "apps", "users", and "system".
- SHP Deployment
  - Additional feature for Distributed Search
  - HA solution for Search Heads in a Distributed Search topology (but a single point of failure)
  - Sharing Apps/Users/Search Jobs among Search Heads
  - Should count on an external authentication system instead of the local authentication feature
  - Deployment Server should point to the shared location to deploy apps
  - Difficult to upgrade
  - No GUI for configuration/management
  - Common performance issues due to poor NFS I/O performance
  - Should forward summary/_audit/_internal data to indexers
- MountedBundle Deployment
  - Additional feature for Distributed Search
  - Sharing search-time configurations on a network shared file system (but a single point of failure)
  - Sharing System/Apps/Users between a Search Head and Search Peers (Indexers)
  - It can avoid the SHPooling topology's low bundle replication performance
  - No GUI for configuration/management
Splunk Design Consideration with SH Pooling and Mounted Bundles
- Keep the same version across the distributed search environment, including search peers
This is not specific to Search Head Pooling or shared bundles. A distributed search topology that mixes versions has limited functionality, because different versions have different features and configuration parameters. Searches might be slower, or you might see warnings or errors caused by the version mismatch.
- Run "splunk btool fix-dangling" after manually editing a configuration file or using a deployment server
Under Search Head Pooling, every deletable stanza (i.e. a stanza that exists purely in local) must have a corresponding ACL. Search heads interpret stanzas *without* ACLs as "in-progress writes". They are not considered "complete" until the ACL also lands on disk and is detected. fix-dangling writes "stub" ACLs for any stanzas that lack them. Nothing prevents puppet or other tools from deploying local.meta files with the appropriate ACLs set, but you must be aware of this extra "validity" requirement when deploying to your search head pool. To be clear, fix-dangling itself is not strictly required, and it doesn't do anything special. You just need to either modify local.meta in addition to local/*.conf, or run fix-dangling on a representative SHP instance before bundling and pushing via deployment server, puppet, etc. If you deploy only default directories, none of this is needed, because the requirement applies only to stanzas that exist purely in "local".
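As a rough illustration of the local.meta requirement described above, the following Python sketch lists stanzas in an app's local/*.conf files that have no matching entry in metadata/local.meta. It is not a replacement for fix-dangling. It assumes local.meta headers take the form [<conf name>/<URL-encoded stanza name>], and the function names are hypothetical.

```python
import re
from pathlib import Path
from urllib.parse import unquote

STANZA_RE = re.compile(r"^\[(.*)\]\s*$")


def stanza_names(text):
    """Return the stanza headers found in Splunk-style .conf text."""
    names = []
    for line in text.splitlines():
        match = STANZA_RE.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


def dangling_stanzas(app_dir):
    """List (conf, stanza) pairs in local/*.conf with no entry in metadata/local.meta."""
    app = Path(app_dir)
    meta_file = app / "metadata" / "local.meta"
    meta_text = meta_file.read_text() if meta_file.exists() else ""
    # Assumption: local.meta headers look like [<conf name>/<URL-encoded stanza name>].
    covered = {unquote(name) for name in stanza_names(meta_text)}
    missing = []
    for conf in sorted((app / "local").glob("*.conf")):
        for stanza in stanza_names(conf.read_text()):
            if f"{conf.stem}/{stanza}" not in covered:
                missing.append((conf.stem, stanza))
    return missing
```

Running dangling_stanzas() against an app directory before bundling it gives a quick list of stanzas that may still need ACL entries.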
- Do not use this NFS shared solution over a WAN
General-purpose network file systems are not designed for WAN links. NFS, Samba (SMB), and CIFS should not be used over a WAN; performance is terribly slow.
- Do not keep the [pooling] stanza in a copied "system/local/server.conf" for Mounted Bundle
- This is a bug we experienced in 4.3.x and 5.0 - 5.0.3.
- When using Mounted Bundles with Search Head Pooling, our documentation suggests copying the "system" directory to the search head pooling shared directory. This "system" directory is therefore a static copy taken from one of the search heads. If the server.conf in <shared directory>/system/local (or <shared directory>/etc/system/local) contains a [pooling] stanza with an enabled storage location, search peers that have "Mounted Bundles" enabled follow the [pooling] stanza and look for an incorrect or non-existent apps directory. As a result, a search on a search head in the pool returns an error message about a missing "apps" location. This is a bug, and unfortunately several people have spent a great amount of time troubleshooting it. The workaround is to remove the [pooling] stanza and its attributes, or to disable the [pooling] stanza, in the server.conf in <shared directory>/system/local (or <shared directory>/etc/system/local). Usually we avoid documenting a bug on this page and instead put everything in the official Known Issues report, so please visit our Known Issues page for any update on this bug.
- Configuring/managing apps and system-wide configurations through WebGUI or Deployment Server
- All app-related configurations made through WebGUI are saved under apps in the SH Pooling shared location.
- Some configurations made through WebGUI are saved in $SPLUNK_HOME/etc/system/local, which is not in the SH Pooling shared location. This includes general settings (server.conf and web.conf), email settings (alert_actions.conf), distributed search (distsearch.conf), authentication (authentication.conf), and user roles (authorize.conf). For search time, the roles (authorize.conf) settings need to be shared with the search peers (indexers). So, if you need to deploy any of these through Deployment Server, please make sure you do not edit them through WebGUI, because configurations saved through WebGUI override app configurations. (See "apps and configuration precedence".)
- If you're using Mounted Bundle with SH Pooling, the shared "system" directory is probably static and is not updated when you edit any of the configurations above through WebGUI. That is usually okay, with one exception: authorize.conf. For distributed search, roles (authorize.conf) must be sent to the search peers (indexers). So if you're using Mounted Bundle with SH Pooling, you must use Deployment Server to deploy authorize.conf through an app to the SH Pooling shared location. (Or, if you're not using a Deployment Server, you can edit an app in the SH Pooling shared directory and add an authorize.conf to it.) Again, if you deploy authorize.conf through Deployment Server, please make sure you do not edit Roles through WebGUI, because configurations saved through WebGUI override app configurations.
- If you would like to deploy LDAP authentication (authentication.conf) through an app in the SH Pooling shared location, consider it only when you can use anonymous bind for the LDAP server. Otherwise it is very complicated to set up and maintain. The difficulty comes from how Splunk encrypts passwords in configuration files. Encryption of the LDAP bind password, certificate passwords, and local user passwords uses each Splunk instance's splunk.secret, which is located in the $SPLUNK_HOME/etc/auth directory. So, if you'd like to share an authentication.conf that includes an encrypted bind password in an app in SH Pooling, all the Search Heads in the pool need to share the same splunk.secret. If you do that, you also need to re-encrypt the existing passwords. It is therefore technically possible to share authentication.conf in a shared pooling directory, but it is very difficult to maintain. All in all, we do not recommend sharing authentication.conf in an app in SH Pooling.
- Using Splunk's Deployment Server to deploy apps to the search head pooling location
If you use Splunk's Deployment Server, you have to set "targetRepositoryLocation" in serverclass.conf to the apps directory under the search head pooling location. (Note that Deployment Server cannot deploy configurations to $SPLUNK_HOME/etc/system. This is not supported.)
- serverclass.conf
    [global]
    targetRepositoryLocation = <search head pooling directory>/etc/apps
However, you probably do not want this setting to apply to all deployment clients. In general, you should configure the location on the client side in deploymentclient.conf instead.
- deploymentclient.conf
    [deployment-client]
    serverRepositoryLocationPolicy = rejectAlways
    repositoryLocation = /mnt/SH_pool/etc/apps
- An app configuration that was deployed through Deployment Server or configured manually is not available in WebGUI
This is neither a Deployment Server/Client issue nor a Search Head Pooling issue. If you deploy a configuration file that holds system-wide configuration, such as email settings or authentication, you must edit apps/<app name>/metadata/local.meta and add export = system, either for that specific configuration or as the default.
- local.meta
    export = system
- Some scripted inputs from Splunkbase apps or manually configured apps fail to run in SH Pooling
This might happen if the scripted input stanza uses an absolute path, such as $SPLUNK_HOME/etc/apps/<app name>/bin. Make sure you use a relative path instead.
- inputs.conf
    [script://./bin/<script>]
    ...
Sample error messages in splunkd.log
    09-14-2012 07:55:53.810 -0700 ERROR ExecProcessor - Ignoring: "/opt/splunkes/etc/apps/myTestApp/bin/sysmon.sh"
    09-14-2012 07:55:53.810 -0700 ERROR FrameworkUtils - Incorrect path to script: /opt/splunkes/etc/apps/myTestApp/bin/sysmon.sh. Script must be in a bin subdirectory in $SPLUNK_HOME.
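To find affected stanzas before moving an app into the pool, a small script can scan inputs.conf for [script://...] stanzas that do not use a relative path. The following is a minimal sketch. The function name is hypothetical, and treating paths that start with "/", "$", or a Windows drive letter as absolute is an assumption of this example.

```python
import re

SCRIPT_STANZA_RE = re.compile(r"^\[script://(.+)\]\s*$")


def absolute_script_stanzas(inputs_conf_text):
    """Return script paths from [script://...] stanzas that are not relative."""
    flagged = []
    for line in inputs_conf_text.splitlines():
        match = SCRIPT_STANZA_RE.match(line.strip())
        if not match:
            continue
        path = match.group(1)
        # Assumption: "/...", "$SPLUNK_HOME/..." and "C:\..." style paths count as absolute.
        if path.startswith(("/", "$")) or re.match(r"^[A-Za-z]:[\\/]", path):
            flagged.append(path)
    return flagged
```

Any path it returns is a candidate for rewriting in the relative ./bin/<script> form shown above.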
- Summary Index with SH Pooling
Forward all data to the indexers instead of indexing events on each search head in the pool. This is especially important for summary indexing. Summary indexing is an action taken as the result of a scheduled search, and in general the summary index database lives on the search head where the summary-index-enabled scheduled search ran. In an SH Pooling topology, scheduled searches are shared by the search heads in the pool: one of them runs a given scheduled search and does the summary indexing, and there is no way to control which one. As a result, each search head in the pool keeps only part of the summary-indexed events. When a search head searches a summary index, the query goes only to its search peers and its own local index database, not to the other search heads. To avoid this situation, all indexed events on the search heads in the pool, including summary index events, should be forwarded to the indexers.
- Forwarding events to Search Peers (Indexers) from SHs in the SH Pooling
By default, Splunk does not forward events for the _internal index. This can make the _internal index unusable for searches, scheduled searches, alerts, summary indexing, and chart views. In particular, the Deployment Monitor and Splunk on Splunk (SoS) apps might not work well in SHP without forwarding _internal events, because they require the _internal index. If you want to forward all internal index data, not just the data in _audit, you can override the default forwardedindex filter attributes like this:
#
# Example
# Forward all internal index data
#
- outputs.conf
    # Forward everything
    [tcpout]
    forwardedindex.filter.disable = true
- System Clock Synchronization Between SHs and NFS server
Make sure the system clocks of the SHs and the NFS server are synchronized, and use an NTP server to keep them that way. When the clocks are off, a search might be canceled because it appears to hit a search time-out. Clock jitter or drift can also cause unexpected search cancellation, so keep it to a minimum by using reliable NTP servers.
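One rough way to spot a clock offset from a search head is to create a file on the shared mount and compare its modification time with the local clock. The sketch below assumes the NFS server stamps newly created files with its own clock, which is common but depends on the NFS setup. It is only a sanity check, not a substitute for monitoring NTP, and the function name is hypothetical.

```python
import os
import tempfile
import time


def estimate_clock_skew(shared_dir):
    """Return (file mtime - local time) in seconds for a file just created in shared_dir."""
    before = time.time()
    fd, path = tempfile.mkstemp(prefix="clock_probe_", dir=shared_dir)
    try:
        os.close(fd)
        mtime = os.stat(path).st_mtime
    finally:
        os.remove(path)  # always clean up the probe file
    after = time.time()
    # Compare against the midpoint of the local interval to average out call latency.
    return mtime - (before + after) / 2.0
```

For example, estimate_clock_skew("/mnt/SH_pool") could be run on each search head; a result far from zero suggests the clocks need attention.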
- Bundle Replication filters
The bundle that a SH replicates to its search peers contains configuration files and other files from the system, apps, and users directories. By default, the distsearch.conf in the $SPLUNK_HOME/etc/system/default directory defines which files are included in a bundle. You might need to exclude additional configuration files or undesired large files to reduce the size of the bundle. You should not re-include configuration or other files that are excluded by default, because this might cause unexpected behavior at search time. We have seen a case where a user accidentally included all files in the bundle by overwriting distsearch.conf.
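When deciding which large files to exclude, it helps to see what is actually taking up space. The following sketch simply walks a directory tree and reports the largest files. It does not read distsearch.conf or apply any Splunk filter rules, and the function name is hypothetical.

```python
import os


def largest_files(root, top_n=10):
    """Return the top_n (size_in_bytes, relative_path) pairs under root, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(full)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            sizes.append((size, os.path.relpath(full, root)))
    sizes.sort(reverse=True)
    return sizes[:top_n]
```

Pointing it at the etc directory of a search head (or of the pool) lists candidates, such as large lookup files, to review for exclusion.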
- Upgrading SH Pooling Distributed Search topology
Please check the Splunk online manual for the proper steps. It is very difficult to keep a Search Head running during the upgrade process, so you should be prepared for some maintenance downtime. Backing up all the configuration and testing in a non-production environment are important. Failing to follow the proper steps may introduce unexpected issues, and you will end up starting the upgrade process over.
- One way to upgrade SHs in SHP is as follows
Assuming there are three SHs, SH01, SH02 and SH03, in SHP
1. @SH01, stop it and disable SHPooling
2. @SH01, copy the shared apps in the pooling directory to SH01's $SPLUNK_HOME/etc/apps directory
3. @SH01, upgrade SH01 using the normal upgrade process
   => @SH01, make sure you can log in, run searches, visit "settings", etc.
4. @SH02/03, stop them, and disable SHPooling for SH02 and SH03
5. @SH01, stop SH01 and copy SH01's apps from $SPLUNK_HOME/etc/apps to the shared directory
6. @SH01, enable SHPooling and start it
   => Make sure you can log in, run searches, visit "settings", job activity, etc.
7. @SH02/03, upgrade SH02 and SH03 using the general upgrade process (no SHP enabled)
8. @SH02/03, stop them, enable SHPooling, and start them
   => Make sure you can log in, run searches, visit "settings", job activity, etc.
- Search Performance Consideration
Search performance in SH Pooling is slower than without SH Pooling, simply because NFS disk access is a lot slower than local disk access. You should therefore consider an enterprise-level, dedicated storage network with a SAN device. With an NFS server on a general shared network, we've seen terrible performance issues when a ping from a search head to the NFS server takes > 1ms. (In 4.3.3, performance for searches with sub-searches was improved. In 4.3.5, search performance and WebGUI slowness on the Manager page were improved.)
- More about Search performance
The NFS server's performance is very important with SH Pooling and Mounted Bundles, and especially with SH Pooling, because the SH Pooling location holds the search jobs (search status, results, and logs). The more heavy searches you run, the better disk I/O performance you need from the NFS server. Some NFS parameters, such as the rsize and wsize mount options, could help throughput. They might also increase latency, but that's a trade-off.
- Slower GUI experience
With SHP, loading views and opening Manager may be slow. This shows up when clicking Manager, the logo, or Apps. Performance should be improved in 4.3.6 or 5.0.3.
- Do not use the same serverName in SHs
As a good practice, do not configure a SH to also act as a search peer. Keeping roles separated is good for maintenance and for performance. In SH Pooling, if the SHs have the same serverName and are search peers of the other SHs in the same pool, they fail to get results from the other SHs. Usually, you should see a warning message when you add a new search peer through WebGUI in this kind of situation.
How To Set Up Search Head Pooling
This example sets up both features in a distributed search environment. Please remember that there is more than one way to configure them.
#--------------
# SH Pooling
#--------------
#
# @NFS Server  :/home/shared/searchpooling
# @Search Head :/mnt/SH_pool
#
# Subdirectories in the directory of the SH Pooling location
# Dirs:
#   etc/apps/<apps>
#   etc/users/<usernames>
#   var/run/splunk/dispatch/<search jobs>
#
# -server.conf
#   [pooling]
#   state = enabled
#   storage = /mnt/SH_pool
#
1. @NFS Server: Create the Search Head pool directory and share/export it
   => NFSserver:/home/shared/searchpooling in this example
2. @Search Head: Mount the shared directory from the NFS server to the search head's mount point
   => The mount point is /mnt/SH_pool in this example
   # mount -t nfs NFSServer:/home/shared/searchpooling /mnt/SH_pool
3. @Search Head: Create the etc directory under the mount point /mnt/SH_pool
   # mkdir /mnt/SH_pool/etc
4. @Search Head: Configure server.conf
   -server.conf
     [pooling]
     state = enabled
     storage = /mnt/SH_pool
5. @Search Head: Stop Splunk
   # splunk stop
6. @Search Head: Copy the existing apps and users directories from the Search Head's Splunk dir to the mount point
   # cp -r $SPLUNK_HOME/etc/apps /mnt/SH_pool/etc/
   # cp -r $SPLUNK_HOME/etc/users /mnt/SH_pool/etc/
7. @Search Head: Start Splunk
   => This will take a minute or more in order to refer to the pool for apps, users and var/run/splunk (Splunk creates this).
   # splunk start
8. @Other Search Head: Repeat Steps 2, 3, 4, 5, and 7 (skip Step 6)
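Before starting Splunk in step 7, it can be worth confirming that the pool mount has the layout the steps above produce. The following is a minimal sketch. The function name is hypothetical, and the list of expected directories is taken only from this example (var/run/splunk is left out because Splunk creates it on start).

```python
import os

# Subdirectories this example expects under the pool mount point before Splunk starts.
EXPECTED_SUBDIRS = ("etc/apps", "etc/users")


def missing_pool_dirs(pool_root, expected=EXPECTED_SUBDIRS):
    """Return the expected subdirectories that do not exist under pool_root."""
    missing = []
    for rel in expected:
        if not os.path.isdir(os.path.join(pool_root, *rel.split("/"))):
            missing.append(rel)
    return missing
```

For the layout in this example, missing_pool_dirs("/mnt/SH_pool") should return an empty list once step 6 has been completed on the first search head.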
#
#--------------
# Shared Bundles
#--------------
#
# Required subdirectories for the shared bundle
# Dirs:
#   apps
#   users
#   system
#
#
# A simple way is
#  1. Mount the same point as SH Pooling,
#  2. Copy $SPLUNK_HOME/etc/system under the etc in the mounted point
#  3. Configure distsearch.conf to point to the etc directory
#
# @NFS Server  :/home/shared/searchpooling
# @Search Head :/mnt/SH_pool
#
1. Assuming the SH Pooling was configured with the procedure above
2. @Search Head: Copy the existing system directory from the Search Head to the mount point
   => Note that this copied system directory will be static and will not be changed dynamically through WebGUI
   # cp -r $SPLUNK_HOME/etc/system /mnt/SH_pool/etc/
3. @Search Head: Disable sending the bundle
   - distsearch.conf
     [distributedSearch]
     shareBundles = false
4. @Indexer/searchpeers: Enable the shared bundle
   - distsearch.conf
     [searchhead:SearchHead01-root]
     mounted_bundles = true
     bundles_location = /mnt/SH_pool/etc
   => Note that the host name is <hostname>-<splunk user> by default. It is defined as serverName in server.conf at the Search Head
   => To find out the proper searchhead name, check "remote" jobs in the dispatch directory of the indexer, or check serverName in server.conf at the Search Head.
   # cd $SPLUNK_HOME/var/run/splunk/dispatch/
   # ls | grep remote | head -5
   remote_SearchHead01-root_scheduler__admin__search_VGVzdCBBQUE_at_1315712400_2f4dfd650934e0d9
   remote_SearchHead01-root_scheduler__admin__search_VGVzdCBBQUE_at_1315712700_4a05440d41403c2a
   remote_SearchHead01-root_scheduler__nobody__search_SW5kZXhpbmcgd29ya2xvYWQ_at_1315712700_1c5ec20a8ec47ccb
   remote_SearchHead01-root_scheduler__nobody__search_VG9wIGZpdmUgc291cmNldHlwZXM_at_1315712400_69a94efac4644c64
   remote_SearchHead01-root_scheduler__nobody__SplunkDeploymentMonitor_RE0gbWlzc2luZyBmb3J3YXJkZXJz_at_1315712700_ab02c550189854cb
5. @Indexer/Searchpeers: Repeat Step 4 for each search head stanza in each indexer
6. Restart the Search Heads, and then the Indexers
#
# Alternative way
# (More complicated. Not sure when you need this. But, this works, too.)
#
#
# @NFSServer   :/home/shared/sharedbundle
# @searchpeers :/mnt/SH_bundle
#
1. @Search Head: Create a "system" directory symlink in /mnt/SH_pool/etc/
   # ln -s $SPLUNK_HOME/etc/system /mnt/SH_pool/etc/system
2. @NFS Server: Create a share directory for the Shared bundle
   => NFSServer:/home/shared/sharedbundle in this example
3. @NFS Server: Create an apps directory symlink to /home/shared/searchpooling/etc/apps
   # ln -s /home/shared/searchpooling/etc/apps /home/shared/sharedbundle/apps
4. @NFS Server: Create a users directory symlink to /home/shared/searchpooling/etc/users
   # ln -s /home/shared/searchpooling/etc/users /home/shared/sharedbundle/users
5. @NFS Server: Create a system directory symlink to /home/shared/searchpooling/etc/system
   # ln -s /home/shared/searchpooling/etc/system /home/shared/sharedbundle/system
6. @NFS Server: Share/Export the shared bundle directory
7. @Indexer/searchpeers: Mount the shared bundle of the NFS server to the mount point
   # mount -t nfs NFSServer:/home/shared/sharedbundle /mnt/SH_bundle
8. @Search Head: Disable sending the bundle
   - distsearch.conf
     [distributedSearch]
     shareBundles = false
9. @Indexer/searchpeers: Enable the shared bundle
   - distsearch.conf
     [searchhead:SearchHead01-root]
     mounted_bundles = true
     bundles_location = /mnt/SH_bundle
   => Note that the host name is <hostname>-<splunk user> by default. It is defined as serverName in server.conf at the Search Head
   => To find out the proper searchhead name, check "remote" jobs in the dispatch directory of the indexer, or check serverName in server.conf at the Search Head.
   # cd $SPLUNK_HOME/var/run/splunk/dispatch/
   # ls | grep remote | head -5
   remote_SearchHead01-root_scheduler__admin__search_VGVzdCBBQUE_at_1315712400_2f4dfd650934e0d9
   remote_SearchHead01-root_scheduler__admin__search_VGVzdCBBQUE_at_1315712700_4a05440d41403c2a
   remote_SearchHead01-root_scheduler__nobody__search_SW5kZXhpbmcgd29ya2xvYWQ_at_1315712700_1c5ec20a8ec47ccb
   remote_SearchHead01-root_scheduler__nobody__search_VG9wIGZpdmUgc291cmNldHlwZXM_at_1315712400_69a94efac4644c64
   remote_SearchHead01-root_scheduler__nobody__SplunkDeploymentMonitor_RE0gbWlzc2luZyBmb3J3YXJkZXJz_at_1315712700_ab02c550189854cb
10. @Indexer/Searchpeers: Repeat Step 9 for each search head stanza in each indexer
11. Restart the Search Heads, and then the Indexers
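If an indexer serves several search heads, the names needed for the [searchhead:<name>] stanzas can be collected from the "remote" dispatch directory names instead of reading them by eye. The sketch below is based only on the sample listing in this page. It assumes names look like remote_<serverName>_<search id>, with the search id starting with "scheduler__", "rt_", or an epoch-style number, so treat the pattern as a guess and confirm the result against serverName in server.conf. The function name is hypothetical.

```python
import re

# Assumed layout: remote_<serverName>_<search id>, where the search id starts with
# "scheduler__", "rt_", or an epoch-style number such as 1315712400.12.
REMOTE_DIR_RE = re.compile(r"^remote_(.+?)_(?:scheduler__|rt_|\d+\.\d+)")


def search_head_names(dispatch_entries):
    """Return the sorted, de-duplicated search head names found in dispatch entries."""
    names = set()
    for entry in dispatch_entries:
        match = REMOTE_DIR_RE.match(entry)
        if match:
            names.add(match.group(1))
    return sorted(names)
```

On an indexer, the input could come from os.listdir() on the dispatch directory; each returned name then gets its own [searchhead:<name>] stanza in distsearch.conf.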