From Splunk Wiki

How do I troubleshoot Splunk?

splunkd.log is your best friend

Everything Splunk does or fails to do is recorded in $SPLUNK_HOME/var/log/splunk/splunkd.log. When Splunk Support asks you to upload a diag, more often than not we go to this file first to see what ERROR/WARN messages are being produced. Very often we can quickly spot the problem, and even identify other problems you didn't know were hurting you.

This file is one of many internal log files that are indexed in the _internal index. That means you don't even have to look at the raw log file; you can search it directly from your Splunk instance. Run index=_internal ERROR OR WARN as a search and see what comes back.

Every internal processor writes to splunkd.log. If you're having trouble with the deployment server feature, search for events containing "deployment"; if you're having trouble with TCP connections, search for "TCP"; if you're having trouble with inputs, search for "input". Are you seeing a pattern yet?
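If the Splunk instance itself won't come up and you can't run a search, the same keyword-and-severity filtering can be done against the raw file. The following is a minimal sketch, not an official tool. It assumes each splunkd.log line looks like "MM-DD-YYYY HH:MM:SS.mmm +ZZZZ LEVEL Component - message"; that layout and the function name summarize are assumptions for illustration, so adjust the pattern if your lines differ.

```python
import re
from collections import Counter

# Assumed line layout (not taken from the article):
#   MM-DD-YYYY HH:MM:SS.mmm +ZZZZ LEVEL Component - message
LINE_RE = re.compile(
    r"^\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4} "
    r"(?P<level>[A-Z]+)\s+(?P<component>\S+) - (?P<message>.*)$"
)


def summarize(lines, levels=("ERROR", "WARN"), keyword=None):
    """Count events per (level, component), optionally filtered by keyword.

    `lines` is any iterable of log lines, for example an open file handle.
    The keyword match is case-insensitive and applies to the whole line.
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None or match.group("level") not in levels:
            continue  # skip unparseable lines and other severities
        if keyword and keyword.lower() not in line.lower():
            continue
        counts[(match.group("level"), match.group("component"))] += 1
    return counts
```

To use it, open $SPLUNK_HOME/var/log/splunk/splunkd.log, pass the file handle to summarize (for example with keyword="deployment" or keyword="TCP"), and print the result of most_common() to see which components are complaining the most.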

If you can't spot anything easily, try DEBUG logs. Starting Splunk in DEBUG mode (./splunk start --debug) writes a ton of extra information to splunkd.log. Extra information can be a good thing.

Configuration file check

Check, check, and double check any config files that either you or Splunk has updated. The *.conf files are case-sensitive, and a stanza in one app may be overriding a stanza in another app. A file like inputs.conf might exist in 10 different locations, and the copies may not be getting layered the way you expect. Check all settings and values against the online configuration file reference in the Admin manual.
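One quick way to see where layering could be biting you is to list every stanza that is defined in more than one copy of the same file. The sketch below assumes stanza headers are lines of the form [name] and compares names exactly, since the files are case-sensitive. The function names stanzas_in and find_overlaps are made up for this example, and the script only reports overlaps; it does not work out which copy wins.

```python
import os
from collections import defaultdict


def stanzas_in(path):
    """Return the stanza names ([name] header lines) found in one .conf file."""
    names = []
    with open(path, encoding="utf-8", errors="replace") as fh:
        for raw in fh:
            line = raw.strip()
            if line.startswith("[") and line.endswith("]"):
                names.append(line[1:-1])
    return names


def find_overlaps(etc_dir, conf_name="inputs.conf"):
    """Map stanza name -> paths, for stanzas defined in more than one copy.

    Walks etc_dir (for example $SPLUNK_HOME/etc) looking for files named
    conf_name. Names are compared exactly, so [tcp://9997] and [TCP://9997]
    are treated as different stanzas.
    """
    seen = defaultdict(list)
    for root, _dirs, files in os.walk(etc_dir):
        if conf_name in files:
            path = os.path.join(root, conf_name)
            for name in stanzas_in(path):
                if path not in seen[name]:
                    seen[name].append(path)
    return {name: sorted(paths) for name, paths in seen.items() if len(paths) > 1}
```

Run find_overlaps against your etc directory and review each stanza it returns. For the merged settings that Splunk actually applies, the btool utility that ships with Splunk (for example ./splunk cmd btool inputs list --debug) is the better source; this script is only a first pass.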

There are also a lot of settings in the .conf files that aren't exposed in the UI. DO NOT USE THESE UNLESS YOU KNOW WHAT THEY WILL DO. Making arbitrary updates in the hope that it will improve/fix something is a surefire way to break your Splunk instance.

What changed?

Your Splunk instance was working yesterday and now it's not. What changed? Something had to change, whether it was within Splunk or in your environment. Are you the only one with permissions to access Splunk? Someone else may have inadvertently made a bad configuration change. Was there a firewall change? How about the server itself? New software? An OS update? If possible, change it back to the working config, and we can go back and see what effect the change had on Splunk. Remember, it's all recorded in splunkd.log.
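If nobody owns up to a change, file modification times are a cheap first clue. Here is a minimal sketch that lists .conf files touched within a recent window. The function name recently_changed and the 24-hour default are assumptions for illustration, and modification times only tell you that a file was written, not what was changed or by whom.

```python
import os
import time


def recently_changed(etc_dir, hours=24, suffix=".conf", now=None):
    """Return (mtime, path) pairs for files modified in the last `hours`.

    Results are sorted newest first. `now` can be supplied for testing;
    by default the current time is used.
    """
    now = time.time() if now is None else now
    cutoff = now - hours * 3600
    hits = []
    for root, _dirs, files in os.walk(etc_dir):
        for name in files:
            if name.endswith(suffix):
                path = os.path.join(root, name)
                mtime = os.path.getmtime(path)
                if mtime >= cutoff:
                    hits.append((mtime, path))
    return sorted(hits, reverse=True)
```

Point it at your Splunk etc directory, widen the window with hours=72 or so if the breakage is older, and then compare any files it reports against a known-good copy.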

I found an error but I don't know what it means

Just ask us! File a support case online and attach your diag output. ALWAYS attach your diag output. Not only will it save us time in asking you for it and allow us to investigate straight away, but it will make us like you even more. We love our customers and want to help all of them, but there are lots of you out there and we need you to give us as much information as possible - diag output, screenshots, sample data, network diagrams - anything that will help us recreate your problem will make us love you even more.
