Community:HandlingEventsTimeStampedInTheFuture

How to handle events that are timestamped in the future

Introduction

Because Splunk stores your data in a time-series index, time stamps matter for every event that comes in for indexing. Splunk looks for a time stamp on each event as it processes it.

In some situations, systems have the wrong clock time when they time stamp an event. Splunk highly recommends that you use NTP to synchronize all your servers and workstations (including the Splunk server) against the same time source. This may not always be possible, and other factors can also contribute to incorrect event time stamping.

The worst-case scenario is a system that constantly interleaves events carrying current time stamps with events carrying older or future time stamps. The further out of sync those events are, the worse the problem: 5 or 10 minutes in the future is not a real problem, and 5 or 10 days is not ideal, but won't cause as big a problem as, say, 5 or 10 years.

Side Effects

  • Performance - This is the primary symptom of incorrect time stamping. You'll notice that search slows down considerably if you have time stamping issues. Search performance suffers because Splunk has to look at more data (across more buckets) before returning results to you when some of your data is time stamped in the future.
  • Retention - Events incorrectly time stamped as occurring in the future can cause problems with your retention settings, because the affected data is no longer considered older than, say, the 6 months you asked for when you set your frozen period. We will come back to why a bit later.

Prevention

  • Use NTP everywhere
    • If you can't use NTP then look at using MAX_DAYS_AGO and MAX_DAYS_HENCE in props.conf
  • Make sure every event has a time stamp
    • Define your TIME_PREFIX and TIME_FORMAT in props.conf to optimize processing and ensure correct results
    • Setting MAX_TIMESTAMP_LOOKAHEAD in props.conf will also speed up time stamp processing
    • Define TZ in props.conf if required
    • Use DATETIME_CONFIG = CURRENT in props.conf if events have no time stamp or if you prefer the system to add its current time instead
    • If you have very long lines (>10K bytes by default), look at setting TRUNCATE appropriately in props.conf (see the example stanza after this list)
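
As a rough sketch only (the sourcetype name, time format and limit values below are hypothetical and need to be adapted to your data), a props.conf stanza combining these settings might look like this:

  # props.conf -- hypothetical sourcetype; adjust names and values to your data
  [my_app_log]
  # Anchor and describe the time stamp explicitly so Splunk never has to guess
  TIME_PREFIX = ^\[
  TIME_FORMAT = %Y-%m-%d %H:%M:%S
  # Only scan the first 25 characters after TIME_PREFIX for the time stamp
  MAX_TIMESTAMP_LOOKAHEAD = 25
  # Declare the time zone if it is not part of the event itself
  TZ = UTC
  # Treat extracted dates more than 10 days old or more than 2 days ahead as invalid
  MAX_DAYS_AGO = 10
  MAX_DAYS_HENCE = 2
  # Raise the 10K default if your events have very long lines
  TRUNCATE = 20000
  # Alternatively, if the events carry no usable time stamp at all:
  # DATETIME_CONFIG = CURRENT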

That is all you need to know about preventing future time stamps in your system; read on, however, if you are interested in further details.

Detection

To detect whether your event time stamps are affecting system performance and retention settings, the following search and command will help you with this process.

  • From the UI, simply copy and paste the following search and run it like any other search; you might want to save this search for later use.
    • | idxprobe bucket | search dir=defaultdb* | convert ctime(earliest),ctime(latest) | eval period_days = period / 60 / 60 / 24 | eval total_size_mb = total_size / 1024 | fields + period_days,dir,earliest,latest,total_size_mb | sort period_days desc
period_days=77  | dir=defaultdb/db/db_1219429744_1212702742_3 | earliest=06/05/2008 22:52:22 | latest=08/22/2008 19:29:04 | total_size_mb=32      
period_days=11  | dir=defaultdb/db/db_1220101710_1219096801_4 | earliest=08/18/2008 23:00:01 | latest=08/30/2008 14:08:30 | total_size_mb=13      
period_days=0   | dir=defaultdb/db/db_1220101714_1220101714_5 | earliest=08/30/2008 14:08:34 | latest=08/30/2008 14:08:34 | total_size_mb=0      
period_days=363 | dir=defaultdb/db/db_1221205342_1189763469_6 | earliest=09/14/2007 10:51:09 | latest=09/12/2008 08:42:22 | total_size_mb=32      
period_days=253 | dir=defaultdb/db/db_1221602322_1199657519_7 | earliest=01/06/2008 22:11:59 | latest=09/16/2008 22:58:42 | total_size_mb=32      
  • From a Unix command line, use the following command sequence as an example to list the directory that contains your database buckets and "translate" the bucket names via Perl into a more human-readable format (bucket directory names encode the latest and earliest event times in epoch seconds: db_<latest>_<earliest>_<id>).
    • ls $SPLUNK_HOME/var/lib/splunk/defaultdb/db/ | perl -ne 'if ( /(^db_|tsidx$)/ ) { chop; @a = /[^\d]*(\d+)(?:-|_)(\d+)/; $span = $a[0] - $a[1]; $days = $span / 86400; $time_earliest = localtime($a[1]); $time_latest = localtime($a[0]); print "$_ -> $time_earliest to $time_latest / Span: $days\n";}'
db_1219429744_1212702742_3 -> Thu Jun  5 22:52:22 2008 to Fri Aug 22 19:29:04 2008 / Span: 77.8588194444444
db_1220101710_1219096801_4 -> Mon Aug 18 23:00:01 2008 to Sat Aug 30 14:08:30 2008 / Span: 11.6308912037037
db_1220101714_1220101714_5 -> Sat Aug 30 14:08:34 2008 to Sat Aug 30 14:08:34 2008 / Span: 0
db_1221205342_1189763469_6 -> Fri Sep 14 10:51:09 2007 to Fri Sep 12 08:42:22 2008 / Span: 363.91056712963
db_1221602322_1199657519_7 -> Sun Jan  6 22:11:59 2008 to Tue Sep 16 22:58:42 2008 / Span: 253.990775462963

NOTE: None of these methods will work correctly if your buckets haven't rolled yet. In that case, a possible workaround is to add "db-hot/" to the path in the Unix ls command. Another option is to force a manual roll with the following search command: | oldsearch !++cmd++::roll

Results Analysis

Search Performance

Looking at the above data, the good news is that there doesn't seem to be any data in the future (today is Fri September 19 2008); however, my data is not behaving correctly. There are several large overlaps of data across my buckets, so if I were to search for data from August 20 2008, the system would have to look into multiple buckets: db_1219429744_1212702742_3, db_1220101710_1219096801_4, db_1221205342_1189763469_6 and db_1221602322_1199657519_7. That's 4 times more buckets than if my search were for data from September 13 2008 onwards. This will obviously slow the system down, since it has to search more buckets before it can deliver my results.

Fortunately there's nothing wrong with my system; I'm just in the habit of importing older data at random times for testing and other purposes, and I want to preserve the original dates.

Imagine now for a moment that all of my latest dates were in, say, September 2010 instead. In that case every search, even one over just the last 15 minutes, would force the system to look in all of these buckets before it could return results.

Retention

Buckets roll based on size. The buckets in this example are 32MB; buckets db_1220101710_1219096801_4 and db_1220101714_1220101714_5 were manually forced to roll sooner. Buckets are frozen based on age, however, so incorrect time stamps will cause your system to use more storage than initially planned.

Looking at the above example, there is data in my buckets that is clearly older than 180 days but still hasn't been frozen, because the latest time stamp in each bucket (it could come from just a single event) is still younger than 180 days, so the whole bucket really has to be kept around for a while longer.

Looking again at the scenario where buckets all contained data in September 2010, none of these buckets would be frozen before some time in February 2011. This could cause the system to quickly run out of storage space for my cold database.
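
For reference, the size and age limits discussed here are configured in indexes.conf. The following is a minimal sketch, assuming a 180-day (6-month) frozen period and the 32MB bucket size used in this example; adjust both to your environment:

  # indexes.conf -- illustrative values for the default index
  [main]
  # Freeze (age out) a bucket once its newest event is older than 180 days
  frozenTimePeriodInSecs = 15552000
  # Roll a hot bucket once it reaches roughly 32MB
  maxDataSize = 32

Note that the age check looks at the newest event in each bucket, which is exactly why a single future-dated event can keep an otherwise old bucket from being frozen.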

Hiding incorrectly time stamped events from search results

Currently it is not possible to selectively remove events after they have been indexed. You can, however, use the delete:: command to hide them from searches, although it is better to avoid indexing them in the first place. This should only be done from the CLI.

If this search shows the offending events:

./splunk search 'starttime::10/01/08:00:00:00' -auth admin:changeme

then this will hide them in the index:

./splunk search '| oldsearch starttime::10/01/08:00:00:00 delete::sourcetype::apache_error' -auth admin:changeme

You can specify a sourcetype or host. Deleting by source is less reliable.
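
Following the same pattern, hiding the offending events by host instead of by sourcetype (the host name below is hypothetical) would look something like this:

./splunk search '| oldsearch starttime::10/01/08:00:00:00 delete::host::webserver01' -auth admin:changeme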
