# Community:Plotting a linear trendline

Did you know you can do a linear trendline (regression) within Splunk? While Splunk already comes with a built-in search command for doing trendlines based on moving averages (see the "trendline" command), you can also do more complex computations, such as a linear regression using search commands such as 'eventstats' and 'eval'.

Additionally, you can use search macros to mask the complexity of the underlying search.

### Step 1: Creating the macro

1. Create a search macro named "lineartrend(2)" via the Splunk Manager user interface. Official documentation on search macros are here:

2. The macro definition should be:

```eventstats count as numevents sum(\$x\$) as sumX sum(\$y\$) as sumY sum(eval(\$x\$*\$y\$)) as sumXY sum(eval(\$x\$*\$x\$)) as sumX2 sum(eval(\$y\$*\$y\$)) as sumY2
| eval slope=((numevents*sumXY)-(sumX*sumY))/((numevents*sumX2)-(sumX*sumX))
| eval yintercept= (sumY-(slope*sumX))/numevents
| eval newY=(yintercept + (slope*\$x\$))
| eval R=((numevents*sumXY) - (sumX*sumY))/sqrt(((numevents*sumX2)-(sumX*sumX))* ((numevents*sumY2)-(sumY*sumY)))
| eval R2=R*R
```

Note: When copy/pasting the macro definition, make sure you paste it into Splunk as a single line. You will get errors otherwise.

3. Macro arguments should be:

`x, y`

### Step 2: Plotting the trendline

The macro `lineartrend` gets passed two arguments, x and y values from each event, and creates the following fields for each event:

• slope - the slope of the trendline,
• yintercept - the y-intercept of the trendline
• R - coefficient of correlation
• R2- R-squared, the coefficent of determination
• newY - the trendline values

For interested readers, the calculation was borrowed from this web site: http://easycalculation.com/statistics/learn-regression.php

You can quickly test this on time series data with the Y-value of the chart being simple event counts. Run the following search from the Search App, and then click "Build Report":

<search>sourcetype=my_data | timechart count as yvalue | `lineartrend(_time,yvalue)` | timechart sum(yvalue) sum(newY)</search>

The first timechart generates the sample source data (a plot of event counts over time) which is then piped to the macro 'lineartrend' which generates the new y axis values of the trend. The second timechart stanza plots both the original values and the trendline on a chart, which you can visualize by clicking "Build report"

### Step 3: Getting the correlation coefficient (R) or coefficient of determination (R-squared)

The above macro also calculates R and R-square for the above chart automatically. To display R-squared only (for example, to put it below your chart on a dashboard), run the following search:

`sourcetype=my_data | timechart count as yvalue | `lineartrend(_time,yvalue)` | stats first(R2)`

### Sample dashboard

Enjoy!

Gaurav 19:41, 7 December 2009 (PST)