Deploy:DeploymentServer
From Splunk Wiki
Recommendation For Sizing
Briefly,
- A small deployment server (30 or fewer clients) can co-reside with a Splunk instance that has other duties, such as a search head or indexer.
- At moderate to large sizes (30-300 clients), the deployment server should reside on its own Splunk instance that has no other duties.
- Deployment server traffic can interfere with other management port activity, such as search, management, UI functionality, and distributed search.
- At moderate sizes, phoneHomeIntervalInSecs should be increased from its default of 30 seconds to a larger value that still meets your business goals. Can deployment clients wait 10 minutes to receive updates? If so, 600 may be more appropriate.
- In very large deployments, multiple deployment servers should be used; a ratio of 300 clients per server is known to perform well. At least one customer is running 1,000 clients on a single deployment server. There are likely scalability issues at larger sizes that have not yet been identified.
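As a rough illustration of the sizing guidance above, the following Python sketch estimates the aggregate phone-home request rate and the number of deployment servers at the 300-clients-per-server ratio. The function names are hypothetical, and the model assumes that every client phones home exactly once per interval, which ignores retries and bundle downloads.

```python
import math

CLIENTS_PER_SERVER = 300  # ratio the text above reports as performing well


def phone_home_rate(clients: int, interval_secs: int) -> float:
    """Average phone-home requests per second reaching one deployment server."""
    return clients / interval_secs


def servers_needed(clients: int, per_server: int = CLIENTS_PER_SERVER) -> int:
    """Number of deployment servers at the given clients-per-server ratio."""
    return math.ceil(clients / per_server)


if __name__ == "__main__":
    for interval in (30, 600):
        rate = phone_home_rate(300, interval)
        print(f"300 clients, interval {interval}s: {rate:.1f} requests/sec")
    print("servers for 1000 clients:", servers_needed(1000))
```

Under this simple model, raising the interval from 30 to 600 seconds reduces the steady-state load from 300 clients from 10 requests per second to 0.5 requests per second.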
Issues to be aware of
- Older clients (before around 4.1.4) would retry over HTTP if the HTTPS connection timed out. An overloaded deployment server would therefore receive non-SSL connections on its SSL port, which produces errors in the server's log and may trigger further overload problems on the management port.
- A deployment server can be in a state where it is not servicing all phone-home requests but succeeds often enough that the clients eventually receive their data. To determine whether this is happening, search splunkd.log on the deployment clients for reset messages to find how soon the clients reset after receiving updated bundles from the deployment server.
Example: How To Set Up Deployment Server and Client
################################################
#
# Deployment Server deployment
#
################################################
1. @Server: Configure 'serverclass.conf'
Note: Whenever you edit serverclass.conf, you must restart Splunk!
Note: Once an app has been deployed by the Server, you cannot modify its configuration at the Client.
Note: You MUST edit the app at the Server and redeploy it.
Note: serverclass.conf is associated with tenants.conf. Do not delete tenants.conf.
############################################
# It is IMPORTANT to have a general filter here, and a more specific filter at the app level.
# An app is matched _only_ if the server class it is contained in was successfully matched!
#
# - Deploy general apps: default, local directories, inputs.conf
#
# [global]
# whitelist.0=*
# restartSplunkd=true
# stateOnClient = enabled
# [serverClass:<Class Name>]
# whitelist.0=<host or ip>
# [serverClass:<Class Name>:app:<App Name>]
# whitelist.0=<host or ip>
# blacklist.0=*
############################################
# Example
#- serverclass.conf
#-------------------------------------------
[global]
whitelist.0=*
restartSplunkd=true
stateOnClient = enabled
[serverClass:FWD2Local]
whitelist.0=*
[serverClass:FWD2Local:app:LWFoutputs]
#-------------------------------------------
2. Place the apps you want to deploy under the $SPLUNK_HOME/etc/deployment-apps directory
Example of a custom app, LWFoutputs, in that directory:
etc/deployment-apps
etc/deployment-apps/LWFoutputs
etc/deployment-apps/LWFoutputs/local
etc/deployment-apps/LWFoutputs/local/app.conf
etc/deployment-apps/LWFoutputs/local/outputs.conf
- etc/deployment-apps/LWFoutputs/local/app.conf
[install]
state = enabled
build = 10001
- etc/deployment-apps/LWFoutputs/local/outputs.conf
# Forward to local 55153
[tcpout]
defaultGroup = local_55153
[tcpout:local_55153]
server = 127.0.0.1:55153
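The directory layout and the two .conf files from step 2 can also be created by a script. The following Python sketch is one hypothetical way to do that; the function name scaffold_app is invented for illustration, and a temporary directory stands in for $SPLUNK_HOME/etc/deployment-apps so that the example does not touch a real installation.

```python
from pathlib import Path
import tempfile

# File contents copied from the LWFoutputs example above.
APP_CONF = "[install]\nstate = enabled\nbuild = 10001\n"
OUTPUTS_CONF = (
    "# Forward to local 55153\n"
    "[tcpout]\n"
    "defaultGroup = local_55153\n"
    "[tcpout:local_55153]\n"
    "server = 127.0.0.1:55153\n"
)


def scaffold_app(deployment_apps: Path, app_name: str = "LWFoutputs") -> Path:
    """Create <deployment_apps>/<app_name>/local with app.conf and outputs.conf."""
    local_dir = deployment_apps / app_name / "local"
    local_dir.mkdir(parents=True, exist_ok=True)
    (local_dir / "app.conf").write_text(APP_CONF)
    (local_dir / "outputs.conf").write_text(OUTPUTS_CONF)
    return local_dir


if __name__ == "__main__":
    # A temporary directory stands in for $SPLUNK_HOME/etc/deployment-apps.
    with tempfile.TemporaryDirectory() as tmp:
        created = scaffold_app(Path(tmp))
        for path in sorted(created.iterdir()):
            print(path.relative_to(tmp))
```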
3. @Server #./splunk enable deploy-server -auth admin:changeme1
4. @Server #./splunk restart
(Whenever serverclass.conf is modified, Splunk needs to be restarted.)
5. @Client #./splunk set deploy-poll <deployment-server>:<mgmtPort> -auth admin:changeme1
=> This will generate 'deploymentclient.conf'
=> In 1-5 minutes, the app should appear in etc/apps at the Client
=> This might fail on a Universal Forwarder or a license slave. In that case, edit deploymentclient.conf directly:
- deploymentclient.conf
[target-broker:deploymentServer]
targetUri = <deployment-server host or ip>:<mgmtPort>
=> Option:
- @Client #./splunk enable deploy-client -auth admin:changeme1
- @Client #./splunk disable deploy-client -auth admin:changeme1
6. @Client #./splunk restart
------>>>> More things you can do <<<-------
7. @Server #./splunk list deploy-clients -auth admin:changeme1
- This is a point-in-time snapshot; the list changes dynamically
- To see the full list of Clients:
=> Go to WebGUI -> Manager -> Deployment Server -> Your Class -> Status
Option:
-----------------------------------------------------------------
$ $SPLUNK_HOME/bin/splunk list deploy-clients -auth admin:changeme1
Deployment client: ip=10.1.8.28, dns=myana-mbp15.splunk.com, hostname=myana-mbp15.splunk.com, mgmt=8089, build=80534, name=depClient_myana_mbp15_LWF, id=connection_10.1.8.28_8089_myana-mbp15.splunk.com_myana-mbp15.splunk.com_depClient_myana_mbp15_LWF, utsname=darwin-i386
utsname: darwin-i386
name: depClient_myana_mbp15_LWF
ip: 10.1.8.28
hostname: myana-mbp15.splunk.com
build: 80534
dns: myana-mbp15.splunk.com
mgmt: 8089
phoneHomeTime: Fri Jan 7 09:28:30 2011
id: connection_10.1.8.28_8089_myana-mbp15.splunk.com_myana-mbp15.splunk.com_depClient_myana_mbp15_LWF
Deployment client: ip=10.1.8.40, dns=sup-vmbox.splunk.com, hostname=sup-vmbox, mgmt=8089, build=89596, name=deploymentClient, id=connection_10.1.8.40_8089_sup-vmbox.splunk.com_sup-vmbox_deploymentClient, utsname=windows-intel
utsname: windows-intel
name: deploymentClient
ip: 10.1.8.40
hostname: sup-vmbox
build: 89596
dns: sup-vmbox.splunk.com
mgmt: 8089
phoneHomeTime: Fri Jan 7 09:28:57 2011
id: connection_10.1.8.40_8089_sup-vmbox.splunk.com_sup-vmbox_deploymentClient
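When comparing many clients, it can help to turn the summary lines of this output into structured records. The Python sketch below parses one "Deployment client:" line into a dictionary. It assumes the exact "key=value, key=value" layout shown in the sample output above; the function name is hypothetical, and other Splunk versions may print the list differently.

```python
def parse_client_summary(line: str) -> dict:
    """Parse one 'Deployment client: k=v, k=v, ...' summary line into a dict."""
    prefix = "Deployment client: "
    if not line.startswith(prefix):
        raise ValueError("not a deployment client summary line")
    fields = {}
    for pair in line[len(prefix):].split(", "):
        key, _, value = pair.partition("=")
        fields[key.strip()] = value.strip()
    return fields


# Summary line copied from the sample output above.
SAMPLE = (
    "Deployment client: ip=10.1.8.40, dns=sup-vmbox.splunk.com, "
    "hostname=sup-vmbox, mgmt=8089, build=89596, name=deploymentClient, "
    "id=connection_10.1.8.40_8089_sup-vmbox.splunk.com_sup-vmbox_deploymentClient, "
    "utsname=windows-intel"
)

if __name__ == "__main__":
    client = parse_client_summary(SAMPLE)
    print(client["name"], client["ip"], client["hostname"], client["dns"])
```

The name, ip, hostname, and dns fields are the values to compare against your whitelist and blacklist entries.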
Option:
-----------------------------------------------------------------
@Server To check the bundle for each app
-------
$ ls -l $SPLUNK_HOME/var/run/tmp
total 0
drwx------ 2 myana staff 68 Jan 6 10:22 FWD2Local
$ ls -l $SPLUNK_HOME/var/run/tmp/FWD2Local
total 24
-rw------- 1 myana staff 10240 Jan 6 14:30 LWFoutputs-1294353032.bundle
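The numeric suffix in the bundle file name looks like a Unix epoch timestamp, although this page does not say so. Under that assumption, the Python sketch below splits a bundle name into the app name and a UTC time, which can help when checking whether the Server has rebuilt a bundle after a change. The function name is hypothetical.

```python
from datetime import datetime, timezone


def parse_bundle_name(filename: str):
    """Split '<app>-<number>.bundle' into (app, UTC datetime).

    Assumption: the trailing number is a Unix epoch timestamp in seconds.
    """
    stem, dot, ext = filename.rpartition(".")
    if dot != "." or ext != "bundle":
        raise ValueError("not a .bundle file name")
    app, dash, stamp = stem.rpartition("-")
    if dash != "-" or not stamp.isdigit():
        raise ValueError("no numeric suffix in bundle name")
    return app, datetime.fromtimestamp(int(stamp), tz=timezone.utc)


if __name__ == "__main__":
    app, built = parse_bundle_name("LWFoutputs-1294353032.bundle")
    print(app, built.isoformat())
```

For the file in the listing above this gives 2011-01-06 22:30:32 UTC, which is consistent with the "Jan 6 14:30" modification time if the Server clock was eight hours behind UTC.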
@Client to check if the app was "being" deployed
(This bundle might be gone quickly after the app is deployed)
-------
bash-3.2$ ls -l /Applications/splunk_413/var/run
total 8
drwx------ 5 myana staff 170 Mar 21 18:48 FWD2Local <======= Here you can find the downloaded class!
drwx------ 2 myana staff 68 Jul 15 2010 searchpeers
-rw------- 1 myana staff 824 Apr 4 14:29 serverclass.xml
drwx--x--x 11 myana staff 374 Apr 4 14:30 splunk
8. After you change any app configuration, reload the deployment server:
@Server #./splunk reload deploy-server -class <className> -auth admin:changeme1
(DEBUG: ./splunk reload deploy-server -auth admin:changeme1 -debug )
9. You must restart Splunk whenever serverclass.conf is edited
Troubleshooting Splunk Deployment Server and Client
######################################################
#
# Troubleshooting Deployment Server and Client
#
######################################################
#
# Check if Deployment Server/Client was enabled or disabled
#
- From command line
@Server # ./splunk display deploy-server
@Client # ./splunk display deploy-client
# On the Deployment Server, tenants.conf controls whether it is enabled or disabled.
Here is an example of tenants.conf that disables the default serverclass.conf.
- tenants.conf
[tenant:default]
whitelist.0 = *
disabled = true
#
# Logging: $SPLUNK_HOME/etc/log-local.cfg
# Enable DEBUG settings on both instances in log.cfg
# (can be done via the UI: Manager -> System Settings -> System logging )
#
- From command line
@Server # ./splunk set log-level DeploymentServer -level DEBUG
@Client # ./splunk set log-level DeploymentClient -level DEBUG
- From log.cfg (Must restart Splunk after changing this log.cfg)
[splunkd]
category.DeploymentServer = DEBUG
Or,
category.DeploymentClient=DEBUG
And, possibly
category.HTTPClient = DEBUG
category.TcpInputProc = DEBUG
category.TcpOutputProc = DEBUG
# Issues to be aware of
- Whitelist and blacklist filters are applied to the clientName, the IP address, the host name in the DNS record, and the Splunk host name
=> Check the output of "splunk list deploy-clients -auth admin:changeme"
- Copied from $SPLUNK_HOME/etc/system/README/serverclass.conf.spec
-------------------------------------------------------------------
whitelist.<n> = <clientName> | <ip address> | <hostname>
blacklist.<n> = <clientName> | <ip address> | <hostname>
    * 'n' is a number starting at 0, and increasing by 1. Stop looking at the filter when 'n' breaks.
    * The value of this attribute is matched against several things in order:
        * Any clientName specified by the client in its deploymentclient.conf file
        * The ip address of the connected client
        * The hostname of the connected client as provided by reverse DNS lookup
        * The hostname of the client as provided by the client
    * All of these can be used with wildcards. * will match any sequence of characters. For example:
        * Match an network range: 10.1.1.*
        * Match a domain: *.splunk.com
    * These patterns are PCRE regular expressions with the additional mappings:
        * '.' is mapped to '\.'
        * '*' is mapped to '.*'
    * Can be overridden at the serverClass level, and the serverClass:app level.
    * There are no whitelist or blacklist entries by default.
-------------------------------------------------------------------
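To see how a filter value behaves before putting it into serverclass.conf, the two mappings quoted above can be approximated in a few lines of Python. This is only a sketch: it uses Python's re module rather than PCRE, it assumes that a pattern must match the whole attribute value (the spec excerpt does not say whether matching is anchored), and the function names are hypothetical.

```python
import re


def filter_to_regex(pattern: str) -> re.Pattern:
    """Translate a whitelist/blacklist value using the two mappings in the spec."""
    parts = []
    for ch in pattern:
        if ch == ".":
            parts.append(r"\.")  # '.' is mapped to '\.'
        elif ch == "*":
            parts.append(".*")   # '*' is mapped to '.*'
        else:
            parts.append(ch)     # anything else is left as a regex character
    return re.compile("".join(parts))


def client_matches(pattern, client_name, ip, dns_hostname, client_hostname):
    """True if the pattern matches any of the four client attributes.

    Assumption: the whole attribute value must match (fullmatch).
    """
    regex = filter_to_regex(pattern)
    candidates = (client_name, ip, dns_hostname, client_hostname)
    return any(regex.fullmatch(value) for value in candidates if value)


if __name__ == "__main__":
    # Attribute values taken from the second client in the sample listing.
    client = ("deploymentClient", "10.1.8.40", "sup-vmbox.splunk.com", "sup-vmbox")
    for pattern in ("*.splunk.com", "10.1.8.*", "10.1.1.*"):
        print(pattern, client_matches(pattern, *client))
```

The four attribute values can be read from the output of "splunk list deploy-clients" (name, ip, dns, hostname).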
#
# Deployment Server/Client do not start
#
- Make sure DeploymentNG pipeline is enabled
splunkd.log shows whether the module is enabled when Splunk starts
=> It is enabled by default and is disabled only if the setting was changed manually.
- default-mode.conf
[pipeline:distributedDeploymentNG]
disabled = false
#
# Deployment server problems
#
My apps are not appearing on my client instances. Things to check:
- Is the client trying to contact the correct server?
- Is it getting a connection?
- Is the server matching it correctly in serverclass.conf?
- Are whitelists/blacklists in use? Are their regexes correct?
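For the "Is it getting a connection?" check, a plain TCP test from the client machine to the deployment server's management port can rule out basic network problems. The Python sketch below does only that: it does not speak SSL and does not prove that the deployment server is answering phone-home requests. The host and port in the example call are placeholders, and the function name is hypothetical.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder values: substitute the host and mgmtPort from targetUri
    # in deploymentclient.conf.
    print(can_connect("127.0.0.1", 8089))
```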
#
# Deployment Client error
# => Cannot connect to the Server
# ==> Probably a Deployment Server side issue: no port is available, or sockets are used up
#
For comparison, this splunkd.log at the Client (DEBUG mode) shows normal phone-home activity
------------------------------------
08-16-2011 17:55:09.258 -0700 DEBUG DeploymentClient - PhoneHomeThread woke up
08-16-2011 17:55:09.258 -0700 DEBUG DeploymentClient - Send phone home
08-16-2011 17:55:09.267 -0700 DEBUG DeploymentClient - Current state: 3, new state: 4
08-16-2011 17:55:09.267 -0700 DEBUG DeploymentClient - Phone home recvd reply: <?xml version="1.0" encoding="UTF-8"?>
08-16-2011 17:55:09.267 -0700 DEBUG DeploymentClient - DeploymentClient is about to reload a new manifest....
08-16-2011 17:55:09.267 -0700 DEBUG DeploymentClient - stateOnClient=enabled
08-16-2011 17:55:09.267 -0700 DEBUG DeploymentClient - DeploymentClient is done reloading...
08-16-2011 17:55:09.267 -0700 DEBUG DeploymentClient - Current state: 4, new state: 3
08-16-2011 17:55:09.267 -0700 DEBUG DeploymentClient - Sent phonehome to deployment server on topic: deploymentServer/phoneHome/default
08-16-2011 17:55:09.267 -0700 DEBUG DeploymentClient - Phonehome thread waiting for :15000 mecs
08-16-2011 17:55:24.268 -0700 DEBUG DeploymentClient - PhoneHomeThread woke up
08-16-2011 17:55:24.268 -0700 DEBUG DeploymentClient - Send phone home
08-16-2011 17:55:24.277 -0700 DEBUG DeploymentClient - Current state: 3, new state: 4
08-16-2011 17:55:24.277 -0700 DEBUG DeploymentClient - Phone home recvd reply: <?xml version="1.0" encoding="UTF-8"?>
08-16-2011 17:55:24.277 -0700 DEBUG DeploymentClient - DeploymentClient is about to reload a new manifest....
08-16-2011 17:55:24.277 -0700 DEBUG DeploymentClient - stateOnClient=enabled
08-16-2011 17:55:24.278 -0700 DEBUG DeploymentClient - DeploymentClient is done reloading...
08-16-2011 17:55:24.278 -0700 DEBUG DeploymentClient - Current state: 4, new state: 3
08-16-2011 17:55:24.278 -0700 DEBUG DeploymentClient - Sent phonehome to deployment server on topic: deploymentServer/phoneHome/default
08-16-2011 17:55:24.278 -0700 DEBUG DeploymentClient - Phonehome thread waiting for :15000 mecs
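To check how regularly a client is phoning home, the gaps between successive "PhoneHomeThread woke up" lines can be measured. The Python sketch below does this for lines in the format shown above (date, time with milliseconds, UTC offset). The function name is hypothetical, and log lines without the UTC offset, such as the 2010 sample further down this page, would need a different timestamp format.

```python
from datetime import datetime

TS_FORMAT = "%m-%d-%Y %H:%M:%S.%f %z"
MARKER = "DeploymentClient - PhoneHomeThread woke up"


def phone_home_intervals(lines):
    """Seconds between consecutive 'PhoneHomeThread woke up' log lines."""
    stamps = []
    for line in lines:
        if MARKER in line:
            # The timestamp is the first three space-separated fields.
            stamp_text = " ".join(line.split(" ", 3)[:3])
            stamps.append(datetime.strptime(stamp_text, TS_FORMAT))
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


# Lines copied from the sample log above.
SAMPLE = [
    "08-16-2011 17:55:09.258 -0700 DEBUG DeploymentClient - PhoneHomeThread woke up",
    "08-16-2011 17:55:09.258 -0700 DEBUG DeploymentClient - Send phone home",
    "08-16-2011 17:55:24.268 -0700 DEBUG DeploymentClient - PhoneHomeThread woke up",
]

if __name__ == "__main__":
    print(phone_home_intervals(SAMPLE))
```

For the sample log the gap is about 15 seconds, which agrees with the "waiting for :15000" value in the log if that value is in milliseconds.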
#
# Deployment Client
# => Permission issue in the directory the app is expanded into, or a problem with a file in the archived app (in the log below, a file name that Windows rejects)
#
05-27-2010 15:18:07.106 WARN DeployedApplication - Installing app: windows to location: C:\Program Files\Splunk\etc\apps\windows
05-27-2010 15:18:07.106 ERROR DeployedApplication - There was a problem unarchiving file to: C:\Program Files\Splunk\etc\apps\windows\local\service?WSDL due to The filename, directory name, or volume label syntax is incorrect
#
# Deployment Client
# => Applied app's configuration is not available in WebGUI
#
This is not a Deployment Server/Client issue. If you deploy a configuration file that holds system-wide configuration, such as email settings or authentication, you must
edit apps/<app name>/metadata/local.meta and add export = system for that configuration or as the default ([] stanza).
- local.meta
[]
export = system
Example: How to propagate apps from Primary to Secondary Deployment Server
###################################################################
#
# How to propagate apps from Primary to Secondary Deployment Server
# (Assuming you already have apps in the default repository, the "deployment-apps" dir.)
###################################################################
#
# Primary Deployment Server
# - splunkd-port 55041
# - address: 10.1.1.10
- serverclass.conf
[global]
whitelist.0=*
#blacklist.0 = *
restartSplunkd = true
stateOnClient = enabled
[serverClass:UF]
[serverClass:UF:app:LWFoutputs]
#
# Secondary Deployment Server
# - splunkd-port 55051
# - address: 10.1.1.10
#
- deploymentclient.conf
[deployment-client]
disabled = false
repositoryLocation = $SPLUNK_HOME/etc/deployment-apps
serverRepositoryLocationPolicy = rejectAlways
reloadDSOnAppInstall = true
[target-broker:deploymentServer]
targetUri = 10.1.1.10:55041
- serverclass.conf
[global]
whitelist.0=*
#blacklist.0 = *
restartSplunkd = true
stateOnClient = enabled
# This works
[serverClass:UF]
[serverClass:UF:app:LWFoutputs]
#
# End Deployment Client
# - splunkd-port 55001
# - address: 10.1.1.10
#
- deploymentclient.conf
[target-broker:deploymentServer]
targetUri = 10.1.1.10:55051
[deployment-client]
disabled = false
There are plenty of other possibilities; DEBUG logging will highlight any errors on either instance.