Community:Credit card masking regex

From Splunk Wiki


TA using Luhn algorithm

Update 9/1/2016: The TA-Luhn add-on (https://splunkbase.splunk.com/app/2753/) will use the Luhn algorithm to detect credit card numbers in your data. It's a more robust method than the regex below. This works for Splunk 6.
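
For readers unfamiliar with the Luhn check mentioned above, here is a minimal Python sketch of the checksum itself. It is not taken from the TA-Luhn add-on, and the function name luhn_valid is made up for illustration. The idea: starting from the rightmost digit, double every second digit, subtract 9 from any doubled value above 9, and sum everything; the number passes if the sum is divisible by 10.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum.

    Illustrative helper only (hypothetical name); non-digit characters
    such as spaces or hyphens are ignored.
    """
    digits = [int(c) for c in number if c.isdigit()]
    if not digits:
        return False

    total = 0
    # Walk the digits right to left; index 0 is the check digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:       # every second digit from the right
            d *= 2
            if d > 9:        # same as adding the two digits of the product
                d -= 9
        total += d
    return total % 10 == 0


if __name__ == "__main__":
    print(luhn_valid("4111111111111111"))  # well-known Visa test number
    print(luhn_valid("4111111111111112"))  # same number with a wrong check digit
```

A checksum like this filters out most random 13 to 16 digit strings (order IDs, timestamps) that a pure pattern match would mask by mistake.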

Useful regex for masking credit card numbers in your data

Courtesy of pde23 from the Splunk Forums: http://www.splunk.com/support/forum:SplunkAdministration/3352/10921


I've seen more than one question about how to mask credit card numbers at index time. Most of the answers seem to be pegged to specific patterns found in specific logs that folks have.

Not trusting developers to log things according to spec, I wanted to have a more comprehensive solution in hand that would catch CC numbers wherever they might appear. This transform seems to do the trick, catching VISA, MasterCard, AmEx, Diners Club, Discover, and JCB in all their variants, leaving the last four digits unmasked for clarity.

This catches CC numbers wherever they may appear (as long as they start on a word boundary...) in XXXXXXXXXXXXXXXX format. Catching XXXX-XXXX-XXXX-XXXX or XXXX XXXX XXXX XXXX while still preserving the last four digits is a substantially more complicated regex. Haven't quite worked out all the kinks in that one yet...
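
As a rough illustration of the separated formats mentioned above, here is a Python sketch of one possible approach. It is not the author's unfinished regex: the pattern and the names SEPARATED_CARD and mask_separated are assumptions made for this example. It only handles 16-digit numbers in 4-4-4-4 grouping with an optional single space or hyphen between groups, and it does not check card brand prefixes, so it will also mask other 16-digit numbers.

```python
import re

# Hypothetical pattern: three groups of four digits, each optionally
# followed by one space or hyphen, then a captured final group of four.
SEPARATED_CARD = re.compile(r"\b(?:\d{4}[ -]?){3}(\d{4})\b")


def mask_separated(text: str) -> str:
    """Replace all but the last four digits with the scrub marker."""
    return SEPARATED_CARD.sub(r"###SCRUBBED###\1", text)


if __name__ == "__main__":
    print(mask_separated("card 4111-1111-1111-1111 ok"))
    print(mask_separated("card 4111 1111 1111 1111 ok"))
```

AmEx numbers, which are usually written in 4-6-5 grouping, would need their own alternative; that is part of what makes a complete version substantially longer.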

Hope this is helpful to someone.

[creditcard-anonymizer]
REGEX = (?ms)(.*)\b(?:4[0-9]{8}(?:[0-9]{3})?|5[1-5][0-9]{10}|6(?:011|5[0-9]{2})[0-9]{8}|3[47][0-9]{9}|3(?:0[0-5]|[68][0-9])[0-9]{7}|(?:2131|1800|35\d{3})\d{7})(\d{4}\b.*)
FORMAT = $1###SCRUBBED###$2
DEST_KEY = _raw

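Refer to transforms.conf.spec for details on how to implement this. The stanza goes in transforms.conf and only takes effect once a props.conf stanza for the relevant sourcetype, source, or host references it with a TRANSFORMS-<class> setting.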
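
To see what the transform does to an event before deploying it, the same regex can be tried in Python. This is a sketch only: it uses Python's re module rather than Splunk's regex engine, the names CARD_PREFIX, TRANSFORM_RE, and mask_event are invented for the example, and Splunk's $1/$2 are written as \g<1>/\g<2>.

```python
import re

# Card-brand prefixes with the last four digits left off, copied from the
# REGEX in the [creditcard-anonymizer] stanza (all groups non-capturing).
CARD_PREFIX = (
    r"(?:4[0-9]{8}(?:[0-9]{3})?"        # Visa (13 or 16 digits)
    r"|5[1-5][0-9]{10}"                 # MasterCard
    r"|6(?:011|5[0-9]{2})[0-9]{8}"      # Discover
    r"|3[47][0-9]{9}"                   # AmEx
    r"|3(?:0[0-5]|[68][0-9])[0-9]{7}"   # Diners Club
    r"|(?:2131|1800|35\d{3})\d{7})"     # JCB
)

# Group 1 = everything before the card, group 2 = last four digits + the rest.
TRANSFORM_RE = re.compile(r"(?ms)(.*)\b" + CARD_PREFIX + r"(\d{4}\b.*)")


def mask_event(raw: str) -> str:
    """Apply the transform once, mirroring FORMAT = $1###SCRUBBED###$2."""
    return TRANSFORM_RE.sub(r"\g<1>###SCRUBBED###\g<2>", raw, count=1)


if __name__ == "__main__":
    print(mask_event("user=bob card=4111111111111111 status=ok"))
    print(mask_event("amex 378282246310005"))
```

In this Python emulation, an event containing two card numbers has only the last one masked in a single pass, because the leading (.*) is greedy. Test against sample events that resemble your own data before relying on it.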


In Splunk 4

In Splunk 4 it is easier to make this configuration using the SEDCMD directive in props.conf. To scrub a particular sourcetype, add the following to props.conf:

[sourcetype-to-scrub]
SEDCMD-ccmask = s/\b(?:4[0-9]{8}(?:[0-9]{3})?|5[1-5][0-9]{10}|6(?:011|5[0-9]{2})[0-9]{8}|3[47][0-9]{9}|3(?:0[0-5]|[68][0-9])[0-9]{7}|(?:2131|1800|35\d{3})\d{7})(\d{4}\b.*)/###SCRUBBED###\1/g

If all data should be scrubbed, replace "sourcetype-to-scrub" with "default" and this configuration will apply globally.
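
The sed-style substitution can also be approximated in Python to check its behaviour on sample lines. This is a sketch under assumptions: Python's re stands in for Splunk's sed engine, and the names CARD_PREFIX, SED_AS_WRITTEN, SED_NO_TAIL, and scrub are invented here. SED_NO_TAIL is a variant that is not in the original configuration; it drops the trailing .* from the capture group so that a global replace can reach more than one number per line.

```python
import re

# Card-brand prefixes without the last four digits, as in the SEDCMD above.
CARD_PREFIX = (
    r"(?:4[0-9]{8}(?:[0-9]{3})?"        # Visa
    r"|5[1-5][0-9]{10}"                 # MasterCard
    r"|6(?:011|5[0-9]{2})[0-9]{8}"      # Discover
    r"|3[47][0-9]{9}"                   # AmEx
    r"|3(?:0[0-5]|[68][0-9])[0-9]{7}"   # Diners Club
    r"|(?:2131|1800|35\d{3})\d{7})"     # JCB
)

# Same pattern as the SEDCMD: the group holds the last four digits and the tail.
SED_AS_WRITTEN = re.compile(r"\b" + CARD_PREFIX + r"(\d{4}\b.*)")

# Assumed variant (not in the original): capture only the last four digits.
SED_NO_TAIL = re.compile(r"\b" + CARD_PREFIX + r"(\d{4}\b)")


def scrub(pattern: re.Pattern, line: str) -> str:
    """Global replace, like s/.../###SCRUBBED###\\1/g."""
    return pattern.sub(r"###SCRUBBED###\1", line)


if __name__ == "__main__":
    line = "a=4111111111111111 b=5500000000000004"
    print(scrub(SED_AS_WRITTEN, line))
    print(scrub(SED_NO_TAIL, line))
```

In this emulation the pattern as written masks only the first number on a line, since the captured .* carries the rest of the line back into the output untouched, while the variant masks both. Whether the variant is appropriate for a given deployment should be verified on a test index first.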
