Community:Multi-tenant scenario

From Splunk Wiki

Deployment scenario: multi-tenant Splunk deployment with minimal hardware

Many Splunk customers wish to compartmentalize their IT data. One common reason is the desire to limit access to a business unit's data to employees of that unit.

Along with this requirement for data segregation, there is often a companion requirement for the central business unit to have the ability to search the IT data in each sub-unit.

Most customers also wish to deploy the minimum amount of hardware that is feasible given the volume of IT data they are indexing and searching per day.


Requirements

Create a multi-tenant IT search environment

One option is to create a unique index for each business unit on a single Splunk instance and restrict each business unit to searching only its own index.

Users in each business unit can only search their own IT data

Creating a unique instance of Splunk per business unit is another way to compartmentalize data, and it is the approach used in this deployment. Because each instance is completely independent of the others, users of one instance have no access to another unit's data.

Allow corporate the ability to search across all IT data

This deployment approach also meets the requirement for the central business unit to perform a search across all corporate IT data.

This capability is made possible by leveraging Splunk's distributed search feature. A corporate instance of Splunk is set up that distributes search requests to each business unit's Splunk instance.

Since distributed search is configured per instance and is not transitive, the sub-units cannot distribute search requests to the corporate instance or to other sub-unit instances.

Maximize the utilization of hardware

Finally, in many cases each business unit's instance will only be indexing a few GB per day.

Assuming that network bandwidth is not severely limited, there is no reason not to install several Splunk instances on the same hardware to keep costs down while maximizing hardware utilization.


Deployment details

Hardware

The following hardware was used in this deployment:

  • (2x) 4-core 2.4 GHz Intel Xeon CPU
  • 12 GB RAM
  • 146 GB 10k RPM SAS HDD
  • QLogic qla2400 FC-HBA
  • 1 TB SAN storage

As a soft limit, it is not recommended that more than ten instances of Splunk are installed on a given physical machine. However, given the disparity between low and high-end commodity hardware, it is recommended that each deployment be staged and tested to determine limits to scalability.

Data sources, input methods, and indexing volumes

The data sources and their respective input methods for this deployment were as follows:

  • Windows event logs
    • Tailed via a Splunk lightweight forwarder and output to TCP port 999x on the business unit's Splunk instance (x corresponds to the instance's specific listening port)
    • Estimated volume per day, per site: 2 GB
  • Network syslog
    • Each business unit's network devices output syslog to their designated Splunk instance on UDP port 51x (x corresponds to the instance's specific listening port)
    • Estimated volume per day, per site: 500 MB
  • Mainframe reports
    • Each business unit's mainframe outputs nightly reports to unique files on an exported IFS volume which the business unit's Splunk instance tails
    • Estimated volume per day, per site: 100 MB

To get these estimates, we analyzed the available archive data and calculated daily averages. In some cases, there was approximately one week of archive data available, while in other cases more thorough archive data was available.

Licensing

Using the estimates above, each instance is projected to index approximately 2.6 GB per day.

Since smaller sites are likely to index slightly less than this while larger sites might index slightly more, a fair estimate of total indexing volume across the seven instances is 7 × 2.6 GB = 18.2 GB per day. This would dictate a 20 GB license divided into six 3 GB licenses and one 2 GB license.
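The arithmetic behind these figures can be checked with a short script. This is only a sketch: it assumes 1 GB = 1000 MB (which the 2.6 GB figure implies) and seven instances, matching the port table later in this document. The function names are illustrative.

```shell
#!/bin/bash
# Per-site daily volume estimates from the data source list, in MB.
# Assumption: 1 GB = 1000 MB.
WINDOWS_EVENTLOG_MB=2000
SYSLOG_MB=500
MAINFRAME_MB=100
INSTANCES=7   # assumption: seven business-unit instances

# Daily volume for a single instance, in MB.
per_instance_mb() {
  echo $(( WINDOWS_EVENTLOG_MB + SYSLOG_MB + MAINFRAME_MB ))
}

# Daily volume across all instances, in MB.
total_mb() {
  echo $(( $(per_instance_mb) * INSTANCES ))
}

# License pool described in the text: six 3 GB licenses plus one 2 GB license.
license_pool_gb() {
  echo $(( 6 * 3 + 1 * 2 ))
}

echo "per instance: $(per_instance_mb) MB/day"
echo "total:        $(total_mb) MB/day"
echo "license pool: $(license_pool_gb) GB/day"
```

The 18,200 MB total leaves roughly 1.8 GB of headroom under the 20 GB pool.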

The licensing and input assumptions should be validated during the first 30 days in production. This is a low-risk exercise: Splunk's licensing model never stops forwarding or indexing, and it disables search only if the license is violated 7 times within a rolling 30-day window.
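The rolling-window rule can be illustrated with a small sketch. It is not Splunk's own implementation: days are plain integers, the window is assumed to cover today and the 29 days before it, and both function names are hypothetical.

```shell
#!/bin/bash
# Count how many violation days fall inside the 30-day window ending today.
# Usage: violations_in_window TODAY DAY [DAY ...]
violations_in_window() {
  local today="$1"
  shift
  local count=0
  local day
  for day in "$@"; do
    # Assumption: the window covers today and the 29 days before it.
    if [ "$day" -le "$today" ] && [ "$day" -gt $(( today - 30 )) ]; then
      count=$(( count + 1 ))
    fi
  done
  echo "$count"
}

# Succeeds (exit status 0) when search would be disabled, i.e. 7 or more
# violations inside the window.
search_disabled() {
  [ "$(violations_in_window "$@")" -ge 7 ]
}

if search_disabled 40 11 12 20 25 30 35 38; then
  echo "search disabled"
else
  echo "search still enabled"
fi
```

Tracking the count during the first 30 days shows how close the deployment is to the limit before search is affected.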

Server Installation

Server installation was performed using the standard Linux tarball downloaded from http://www.splunk.com/download. The Splunk binaries and logs go under /opt, while the data store goes on the SAN storage, which is mounted at /mnt/i01. The exported IFS volume from each mainframe is mounted under /mnt/ifs/site_name. Many of the steps below can be automated using simple shell, Python, or Perl scripts.

Note: In many cases, it is not advisable to index directly to a SAN; however, in this case performance was benchmarked and tested under load before implementing in production.

  • Change directory to /mnt/i01
  • Create a directory named after each instance of Splunk that will be installed on the server (splunk_n)
  • Change directory to /opt
  • Unzip the compressed tarball
  • Extract the uncompressed tarball
  • Recursively copy the extracted /opt/splunk directory to /opt/splunk_n for each instance
  • Manually edit each ./splunk_*/etc/splunk-launch.conf file and make the following changes:
    • Uncomment the SPLUNK_HOME line and change it to SPLUNK_HOME=/opt/splunk_n (where n corresponds to the particular instance)
    • Uncomment the SPLUNK_DB line and change it to SPLUNK_DB=/mnt/i01/splunk_n (where n corresponds to the particular instance)
  • Install the license for each instance
    • Open /opt/splunk_n/etc/splunk.license and add the license key specific to the instance
  • Remove the /opt/splunk directory

The bash script below achieves the same end:

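#!/bin/bash
# Clone the extracted /opt/splunk tree into seven independent instances.
set -e

# Index 0 is a placeholder so that license i belongs to instance i.
LIC=( "empty0" "<license1>" "<license2>" "<license3>" "<license4>" "<license5>" "<license6>" "<license7>" )

cd /opt

for (( i=1; i<=7; i+=1 ))
do
  # Data store on the SAN; binaries and logs under /opt
  mkdir -p "/mnt/i01/splunk_$i"
  cp -R ./splunk "./splunk_$i"
  # Append the per-instance settings; the shipped file has these lines commented out
  printf 'SPLUNK_HOME=/opt/splunk_%s\nSPLUNK_DB=/mnt/i01/splunk_%s\n' "$i" "$i" >> "./splunk_$i/etc/splunk-launch.conf"
  echo "${LIC[$i]}" > "./splunk_$i/etc/splunk.license"
done

# Remove the template tree only after every copy has been made
rm -rf ./splunk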

Authentication & Authorization Configuration

To ensure that each business unit's Splunk users can only log in to their intended Splunk instance, we configured Splunk to authenticate to Active Directory (AD) and mapped security groups from AD to roles within Splunk.

  • For each instance, we created an authentication configuration that bound to the directory server corpdc01 with the bind account "svc_splunk". The default LDAP port 389 was used since secure LDAP was not required by the corporate security policy.
  • "ou=splunk,ou=security_groups,dc=corp,dc=local" off of the directory root was configured as the group base, from which Splunk will enumerate security groups to be mapped to internal Splunk roles.
  • "ou=users,dc=corp,dc=local" was configured as the user base, from which Splunk will enumerate users that belong to the aforementioned security groups mapped to Splunk roles.
  • A security group was created for corporate administrators and mapped to the "admin" role within Splunk. Site-specific security groups were created for power users and standard users and mapped to the "power" and "user" roles within Splunk.

The configuration below was taken from the CHI Splunk instance's /opt/splunk/system/local/authentication.conf:

[CHI]
Admin = CORP_splunk_admin;
Power = CHI_splunk_power, CORP_splunk_power;
User = CHI_splunk_user, CORP_splunk_user;
SSLEnabled = 0
bindDN = cn=svc_splunk,ou=service_accounts,ou=users,dc=corp,dc=local
bindDNpassword = 
failsafeLogin = admin
failsafePassword = 
groupBaseDN = ou=splunk,ou=security_groups,dc=corp,dc=local;
groupBaseFilter = (objectclass=*)
groupMappingAttribute = dn
groupMemberAttribute = member
groupNameAttribute = cn
host = corpdc01
pageSize = 800
port = 389
realNameAttribute = name
userBaseDN = ou=users,dc=corp,dc=local
userBaseFilter = (objectclass=*)
userNameAttribute = sAMAccountName
_actions = new,edit,delete

[auth]
authSettings = CHI
authType = LDAP

Receiving Configuration

Each instance must have at least three unique TCP ports and one unique UDP port to bind to. In this deployment, the following ports were decided upon:

Instance name   SplunkWeb port   Splunkd port   Splunk data port   Syslog port
splunk_1        8001             8091           9991               511
splunk_2        8002             8092           9992               512
splunk_3        8003             8093           9993               513
splunk_4        8004             8094           9994               514
splunk_5        8005             8095           9995               515
splunk_6        8006             8096           9996               516
splunk_7        8007             8097           9997               517
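The port assignments follow a simple pattern: each base port plus the instance number. The sketch below derives them for a given instance. The function name is illustrative, and the single-digit restriction is an assumption that keeps the derived ports inside the ranges shown above.

```shell
#!/bin/bash
# Print "web splunkd data syslog" ports for instance number n (1-9).
instance_ports() {
  local n="$1"
  case "$n" in
    [1-9]) ;;
    *) echo "instance number must be a single digit from 1 to 9" >&2; return 1 ;;
  esac
  echo "$(( 8000 + n )) $(( 8090 + n )) $(( 9990 + n )) $(( 510 + n ))"
}

for n in 1 2 3 4 5 6 7; do
  printf 'splunk_%s %s\n' "$n" "$(instance_ports "$n")"
done
```

Deriving the ports from the instance number keeps the scheme predictable when instances are added, and makes collisions between instances easy to rule out.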

To force each instance of Splunk to listen on the selected ports on startup, /opt/splunk_n/system/local/web.conf and /opt/splunk_n/system/local/inputs.conf were created before the initial startup. While editing inputs.conf, we also added the tailed IFS directory.

For example, /opt/splunk_1/system/local/web.conf contained the following:

[settings]
mgmtHostPort = localhost:8091
httpport = 8001

/opt/splunk_1/system/local/inputs.conf contained the following:

[splunktcp://9991]
disabled = false
queue = parsingQueue
sourcetype = tcp-9991

[udp://511]
disabled = false
sourcetype = syslog

[tail:///mnt/ifs/chi/]
disabled=false
followTail=1
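Since the two files differ between instances only by port digit and site directory, they can be generated. The sketch below prints the same settings shown above for a given instance number and site name. The helper names are hypothetical, and "nyc" in the usage note is an invented site name used only for illustration.

```shell
#!/bin/bash
# Print web.conf for instance n (single digit), mirroring the splunk_1 example.
render_web_conf() {
  local n="$1"
  cat <<EOF
[settings]
mgmtHostPort = localhost:809${n}
httpport = 800${n}
EOF
}

# Print inputs.conf for instance n, tailing /mnt/ifs/<site>/.
render_inputs_conf() {
  local n="$1"
  local site="$2"
  cat <<EOF
[splunktcp://999${n}]
disabled = false
queue = parsingQueue
sourcetype = tcp-999${n}

[udp://51${n}]
disabled = false
sourcetype = syslog

[tail:///mnt/ifs/${site}/]
disabled=false
followTail=1
EOF
}

render_web_conf 1
render_inputs_conf 1 chi
```

Redirecting the output, for example `render_inputs_conf 3 nyc > /opt/splunk_3/system/local/inputs.conf`, would write the file for another instance.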

After configuring each instance of Splunk in this way, start the instance and accept the license agreement:

/opt/splunk_1/bin/splunk start --accept-license
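The same command applies to every instance, with only the directory changing. As a dry-run sketch, the hypothetical helper below prints the start command for each of the first N instances rather than running it, so the list can be reviewed first.

```shell
#!/bin/bash
# Print the start command for instances 1..count (does not execute them).
start_commands() {
  local count="$1"
  local n=1
  while [ "$n" -le "$count" ]; do
    echo "/opt/splunk_${n}/bin/splunk start --accept-license"
    n=$(( n + 1 ))
  done
}

start_commands 7
```

Piping the output to a shell (`start_commands 7 | sh`) would start the instances one after another.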

Forwarder Installation & Configuration

After changing the network devices of each business unit to direct its syslog output to the correct UDP port, we installed Splunk in a lightweight forwarder configuration on the Windows servers as follows (the examples below assume the target instance is splunk_1 on server Splunk1):

  • Download the Windows binary from http://www.splunk.com/download
  • Install interactively using the MSI, accepting all defaults (i.e. run as local system and index all event logs)
  • Start Splunk if it does not start automatically, and ensure that the service is set to start automatically at boot
  • Open c:\program files\splunk\etc\splunk.license with a text editor and paste in your forwarder license
  • Issue the following commands from a command prompt:
    c:\program files\splunk\bin> splunk.exe disable webserver
    c:\program files\splunk\bin> splunk.exe set server-type forwarder
  • Modify c:\program files\splunk\system\local\inputs.conf by placing the following at the top of the file:
    queue=indexQueue
  • Create c:\program files\splunk\system\local\outputs.conf as follows:
    [tcpout:splunk_1]
    server=Splunk1:9991
  • Restart the forwarder:
    c:\program files\splunk\bin> splunk.exe restart
  • Change the default admin password:
    c:\program files\splunk\bin> splunk.exe edit user admin -password fflanda
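Each business unit's forwarders need an outputs.conf that points at that unit's data port. The sketch below prints the two lines for a given instance number and server name. The helper name is hypothetical, and the script is meant to run on an administrator's workstation to prepare the files, not on the Windows forwarders themselves.

```shell
#!/bin/bash
# Print outputs.conf for forwarders that send to instance n on the given host.
render_outputs_conf() {
  local n="$1"
  local host="$2"
  printf '[tcpout:splunk_%s]\n' "$n"
  printf 'server=%s:999%s\n' "$host" "$n"
}

render_outputs_conf 1 Splunk1
```

Preparing one file per business unit in advance reduces the chance of a forwarder being pointed at another unit's instance.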

After configuring all Windows forwarders as detailed above, we verified that each Splunk index server received new events from the domain controller event logs, network devices, and (the next day) the mainframe reports from the IFS share.

Conclusion

Besides meeting all the project requirements, this deployment model offers substantial maintenance benefits:

  • If one business unit experiences performance problems with search or indexing, the impact of the problems is limited to their particular instance.
  • Routine maintenance such as upgrades, restarts, or configuration changes can be made on a per-instance basis, thereby not affecting other business units.

In a multi-tenant environment such as a managed service provider (MSP), there are the following additional benefits:

  • Higher tiers of hardware with low tenant-to-resource ratios and highly redundant configurations are cost-effective and can be made available as a premium service.
  • Each instance's data store can be kept on different partitions, allowing higher volume instances to reside on faster (more expensive) storage while lower tier storage can be leveraged for lower volume instances.