Community:How to mask strings in JSON events at indexing time when using INDEXED_EXTRACTIONS

From Splunk Wiki


Splunk can mask strings in events at indexing time by using the SEDCMD and/or TRANSFORMS attributes in props.conf. Events processed with INDEXED_EXTRACTIONS need additional consideration, because _raw and _meta (the indexed keys) must be masked separately. This page tests how to mask a string, using a sample JSON event that contains a password field.

Sample Event
{ "username" : "my_username", "password" : "my_password - password", "validation-factors" : { "validationFactors" : [ { "name" : "remote_address", "value" : "127.0.0.1" } ] }, "@timestamp": "2018-01-05T14:56:29.000Z", "attributes": { "field_1": "value_1", "field_2": "value_2", "field_3": "value_3" } }

(Sample event, pretty-printed)

----------------------------------------------
{
   "username" : "my_username",
   "password" : "my_password - password",
   "validation-factors" : {
      "validationFactors" : [
         {
            "name" : "remote_address",
            "value" : "127.0.0.1"
         }
      ]
   },
   "@timestamp": "2018-01-05T14:56:29.000Z",
   "attributes": {
      "field_1": "value_1",
      "field_2": "value_2",
      "field_3": "value_3"
   }
}
----------------------------------------------
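To make the distinction between _raw and the indexed keys more concrete, the following Python sketch flattens the sample event into key::value pairs. It is only an illustration under stated assumptions: the dotted names with {} for arrays follow Splunk's usual JSON field naming, but the exact layout of _meta (space-separated pairs, values containing spaces wrapped in double quotes) is an assumption inferred from the regexes on this page, and the helper names flatten and to_meta are hypothetical.

```python
import json

# Sample event from this page (single-line form).
raw = ('{ "username" : "my_username", "password" : "my_password - password", '
       '"validation-factors" : { "validationFactors" : [ { "name" : "remote_address", '
       '"value" : "127.0.0.1" } ] }, "@timestamp": "2018-01-05T14:56:29.000Z", '
       '"attributes": { "field_1": "value_1", "field_2": "value_2", "field_3": "value_3" } }')


def flatten(value, prefix=""):
    """Yield (field_name, scalar) pairs: dotted names, '{}' appended for arrays."""
    if isinstance(value, dict):
        for key, child in value.items():
            name = prefix + "." + key if prefix else key
            yield from flatten(child, name)
    elif isinstance(value, list):
        for child in value:
            yield from flatten(child, prefix + "{}")
    else:
        yield prefix, value


def to_meta(pairs):
    """Join pairs as key::value tokens; quote values that contain spaces (assumed layout)."""
    tokens = []
    for name, val in pairs:
        text = str(val)
        if " " in text:
            text = '"' + text + '"'
        tokens.append(name + "::" + text)
    return " ".join(tokens)


meta = to_meta(flatten(json.loads(raw)))
print(meta)
```

Under these assumptions the password appears twice: once inside the JSON text of _raw and once as a password:: pair among the indexed keys, which is why the tests below treat the two separately.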

Default Configuration without masking

- props.conf
[test_json_password]
INDEXED_EXTRACTIONS = json
TIMESTAMP_FIELDS = @timestamp

[Image: Community wiki masking password INDEXED EXTRACTION original.png]



Test 01: SEDCMD or TRANSFORMS for masking raw events

- props.conf
[test_json_password]
INDEXED_EXTRACTIONS = json
TIMESTAMP_FIELDS = @timestamp
SEDCMD-mask_password_raw = s/\S+( - password)/"######\1/

[Image: Community wiki masking password INDEXED EXTRACTION only raw.png]
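The effect of the SEDCMD in Test 01 on _raw can be approximated outside Splunk. The following is a minimal Python sketch, not Splunk's implementation: it assumes that re.sub with count=1 behaves like the sed-style s/// expression without the g flag for this pattern, and the variable names are illustrative.

```python
import json
import re

# Sample event from this page (single-line form).
raw = ('{ "username" : "my_username", "password" : "my_password - password", '
       '"validation-factors" : { "validationFactors" : [ { "name" : "remote_address", '
       '"value" : "127.0.0.1" } ] }, "@timestamp": "2018-01-05T14:56:29.000Z", '
       '"attributes": { "field_1": "value_1", "field_2": "value_2", "field_3": "value_3" } }')

# Python equivalent of: s/\S+( - password)/"######\1/
# \S+ consumes the opening quote plus the secret; the replacement re-emits the quote.
masked_raw = re.sub(r'\S+( - password)', r'"######\1', raw, count=1)
print(masked_raw)

# The masked text should still parse as JSON.
masked_event = json.loads(masked_raw)
```

Because the leading double quote is matched by \S+ and written back by the replacement, the masked text remains valid JSON. The substitution only touches the raw text, so it says nothing about the indexed keys.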



Test 02: SEDCMD for _raw and TRANSFORMS for _meta (this achieved the desired result)

- props.conf
[test_json_password]
INDEXED_EXTRACTIONS = json
TIMESTAMP_FIELDS = @timestamp
SEDCMD-mask_password_raw = s/\S+( - password)/"######\1/
TRANSFORMS-mask_json_password = mask_json_password_meta
KV_MODE = none

- transforms.conf
[mask_json_password_meta]
SOURCE_KEY = _meta
DEST_KEY = _meta
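# Note: this regex looks for an indexed field named messageText (or messagetext). The sample event above uses the key password, so for that event the prefix would presumably be password:: (compare Test 03-02).
REGEX = ^(.*message[tT]ext::)\S+ - password" (.*)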
FORMAT = $1"###### - password" $2
WRITE_META = false

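# The following stanza does the same job as the SEDCMD in props.conf. It is not referenced by the props.conf above; to use it, list it in a TRANSFORMS-<class> setting in place of the SEDCMD.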
[mask_json_password_raw]
DEST_KEY = _raw
REGEX = ^(.*messageText": ")\S+( - password.*)$
FORMAT = $1######$2

[Image: Community wiki masking password INDEXED EXTRACTION both raw and meta.png]
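To see what the _meta transform in Test 02 does, the substitution can be replayed on a hand-written string. This is only a sketch: the layout of _meta (space-separated key::value pairs, with values containing spaces wrapped in double quotes) is an assumption inferred from the regexes on this page, the sample string is shortened, and the field prefix is adapted from messageText to the sample event's password key.

```python
import re

# Assumed, shortened _meta content for the sample event (hypothetical layout).
meta = ('username::my_username password::"my_password - password" '
        'attributes.field_1::value_1')

# REGEX from [mask_json_password_meta], with the prefix adapted to password::
regex = r'^(.*password::)\S+ - password" (.*)'
# FORMAT = $1"###### - password" $2, written with Python group references.
fmt = r'\1"###### - password" \2'

masked_meta = re.sub(regex, fmt, meta, count=1)
print(masked_meta)
```

The regex consumes the closing double quote and the following space, and FORMAT writes both back, so the quoting stays balanced. Note that the pattern requires a space after the closing quote; under the assumed layout it would not match if the password pair were the last one in _meta.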



Test 03: Further testing

Test 03-01: Confusing result: SEDCMD for _raw and TRANSFORMS for _meta with KV_MODE=json

- props.conf
[test_json_password]
INDEXED_EXTRACTIONS = json
TIMESTAMP_FIELDS = @timestamp
SEDCMD-mask_password_raw = s/\S+( - password)/"######\1/
TRANSFORMS-mask_json_password = mask_json_password_meta
#KV_MODE = none

- transforms.conf
[mask_json_password_meta]
SOURCE_KEY = _meta
DEST_KEY = _meta
REGEX = ^(.*message[tT]ext::)\S+ - password" (.*)
FORMAT = $1"###### - password" $2
WRITE_META = false

[Image: Community wiki masking password INDEXED EXTRACTION duplicate field extraction.png]



Test 03-02: Confusing result: fields after the password field are missing because of misformatting

- props.conf
[test_json_password]
INDEXED_EXTRACTIONS = json
TIMESTAMP_FIELDS = @timestamp
TRANSFORMS-mask_json_password = mask_password_missing_double_quote
KV_MODE = none

- transforms.conf
[mask_password_missing_double_quote]
SOURCE_KEY = _meta
DEST_KEY = _meta
REGEX = (.*password::)\S+ - password(.*)
FORMAT = $1"###### - password" $2
WRITE_META = false

[Image: Community wiki masking password INDEXED EXTRACTION missing some field extractions.png]
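The misformatting in Test 03-02 can be reproduced by replaying the transform on a hand-written string. As before, this is a sketch under assumptions rather than Splunk's own processing: the _meta layout (space-separated key::value pairs, values with spaces wrapped in double quotes) is inferred from the regexes on this page, and the sample string is shortened.

```python
import re

# Assumed, shortened _meta content for the sample event (hypothetical layout).
meta = ('username::my_username password::"my_password - password" '
        'attributes.field_1::value_1')

# REGEX from [mask_password_missing_double_quote]: the closing quote and the
# following space are NOT consumed, so they end up inside the second group.
regex = r'(.*password::)\S+ - password(.*)'
# FORMAT = $1"###### - password" $2, written with Python group references.
fmt = r'\1"###### - password" \2'

broken_meta = re.sub(regex, fmt, meta, count=1)
print(broken_meta)

# An odd number of double quotes means the quoting is unbalanced.
quote_count = broken_meta.count('"')
```

The original closing quote is carried over in $2 while FORMAT adds its own, leaving a stray double quote after the masked value. An unbalanced quote at that position would be consistent with the fields after password going missing.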
