Community:How to mask strings in JSON events at index time when using INDEXED_EXTRACTIONS
From Splunk Wiki
Splunk can mask strings in events at index time by using the SEDCMD and/or TRANSFORMS attributes in props.conf. Events processed with INDEXED_EXTRACTIONS need additional consideration, because both _raw and _meta (the indexed keys) must be masked separately. On this page, we test how to mask a string using a sample JSON event that contains a password field.
Sample Event
{ "username" : "my_username", "password" : "my_password - password", "validation-factors" : { "validationFactors" : [ { "name" : "remote_address", "value" : "127.0.0.1" } ] }, "@timestamp": "2018-01-05T14:56:29.000Z", "attributes": { "field_1": "value_1", "field_2": "value_2", "field_3": "value_3" } }
(Sample Event with pretty print format)
----------------------------------------------
{
  "username" : "my_username",
  "password" : "my_password - password",
  "validation-factors" : {
    "validationFactors" : [
      {
        "name" : "remote_address",
        "value" : "127.0.0.1"
      }
    ]
  },
  "@timestamp": "2018-01-05T14:56:29.000Z",
  "attributes": {
    "field_1": "value_1",
    "field_2": "value_2",
    "field_3": "value_3"
  }
}
----------------------------------------------
Default Configuration without masking
- props.conf
[test_json_password]
INDEXED_EXTRACTIONS = json
TIMESTAMP_FIELDS = @timestamp
Test 01: SEDCMD or TRANSFORMS for masking raw events
- props.conf
[test_json_password]
INDEXED_EXTRACTIONS = json
TIMESTAMP_FIELDS = @timestamp
SEDCMD-mask_password_raw = s/\S+( - password)/"######\1/
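To see what the SEDCMD expression above does to _raw, the substitution can be replayed outside Splunk. The following is a minimal Python sketch, not Splunk's own implementation: it assumes that Python's `re` module treats this particular pattern the same way as Splunk's regex engine, and the names `RAW_EVENT` and `mask_raw` are made up for illustration. It only models _raw; the indexed fields in _meta are not touched by this step.

```python
import json
import re

# Raw sample event from this page (compact form).
RAW_EVENT = (
    '{ "username" : "my_username", "password" : "my_password - password", '
    '"validation-factors" : { "validationFactors" : [ { "name" : "remote_address", '
    '"value" : "127.0.0.1" } ] }, "@timestamp": "2018-01-05T14:56:29.000Z", '
    '"attributes": { "field_1": "value_1", "field_2": "value_2", "field_3": "value_3" } }'
)


def mask_raw(raw: str) -> str:
    """Replay s/\\S+( - password)/"######\\1/ on a raw event string.

    \\S+ also consumes the opening double quote of the value, which is why
    the replacement text starts with a double quote of its own.
    A sed expression without the g flag replaces only the first match,
    hence count=1.
    """
    return re.sub(r'\S+( - password)', r'"######\1', raw, count=1)


masked = mask_raw(RAW_EVENT)
print(masked)

# The masked text is still valid JSON, so it can be parsed for inspection.
print(json.loads(masked)["password"])
```

Because the opening quote is re-added by the replacement, the masked _raw remains well-formed JSON.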
Test 02: SEDCMD for _raw and TRANSFORMS for _meta (successfully achieved what we wanted here)
- props.conf
[test_json_password]
INDEXED_EXTRACTIONS = json
TIMESTAMP_FIELDS = @timestamp
SEDCMD-mask_password_raw = s/\S+( - password)/"######\1/
TRANSFORMS-mask_json_password = mask_json_password_meta
KV_MODE = none
- transforms.conf
[mask_json_password_meta]
SOURCE_KEY = _meta
DEST_KEY = _meta
REGEX = ^(.*message[tT]ext::)\S+ - password" (.*)
FORMAT = $1"###### - password" $2
WRITE_META = false

# The following stanza does the same job as the SEDCMD in props.conf
[mask_json_password_raw]
DEST_KEY = _raw
REGEX = ^(.*messageText": ")\S+( - password.*)$
FORMAT = $1######$2
Test 03: Further testing
Test 03-01: Confusing Result: SEDCMD for _raw and TRANSFORMS for _meta with KV_MODE=json
- props.conf
[test_json_password]
INDEXED_EXTRACTIONS = json
TIMESTAMP_FIELDS = @timestamp
SEDCMD-mask_password_raw = s/\S+( - password)/"######\1/
TRANSFORMS-mask_json_password = mask_json_password_meta
#KV_MODE = none
- transforms.conf
[mask_json_password_meta]
SOURCE_KEY = _meta
DEST_KEY = _meta
REGEX = ^(.*message[tT]ext::)\S+ - password" (.*)
FORMAT = $1"###### - password" $2
WRITE_META = false
Test 03-02: Confusing Result: Missing fields after the password field because of misformatting
- props.conf
[test_json_password]
INDEXED_EXTRACTIONS = json
TIMESTAMP_FIELDS = @timestamp
TRANSFORMS-mask_json_password = mask_password_missing_double_quote
KV_MODE = none
- transforms.conf
[mask_password_missing_double_quote]
SOURCE_KEY = _meta
DEST_KEY = _meta
REGEX = (.*password::)\S+ - password(.*)
FORMAT = $1"###### - password" $2
WRITE_META = false
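One plausible reading of the missing fields in Test 03-02 is that the REGEX does not consume the closing double quote of the original value, so the rewritten _meta ends up with a stray quote. The Python sketch below illustrates this under stated assumptions: the `META` string is a simplified, hypothetical _meta layout (space-separated `key::value` pairs, with values containing spaces quoted), `apply_transform` is an invented helper that only approximates a transform with `DEST_KEY = _meta`, and the "fixed" pattern is an illustrative adaptation of the Test 02 regex shape to the `password` key rather than a configuration taken from this page.

```python
import re

# ASSUMPTION: simplified, hypothetical _meta content for the sample event.
META = 'username::my_username password::"my_password - password" attributes.field_1::value_1'

# REGEX / FORMAT from Test 03-02 ($1, $2 written as \1, \2 for Python).
BROKEN = (r'(.*password::)\S+ - password(.*)', r'\1"###### - password" \2')

# Hypothetical variant that also consumes the closing quote and the
# separating space, following the shape of the Test 02 regex.
FIXED = (r'^(.*password::)\S+ - password" (.*)', r'\1"###### - password" \2')


def apply_transform(meta: str, regex: str, fmt: str) -> str:
    """Approximate a transform whose DEST_KEY is _meta.

    When the regex matches, the whole key is replaced by the expanded
    FORMAT string; otherwise the key is left unchanged.
    """
    match = re.search(regex, meta)
    return match.expand(fmt) if match else meta


broken_meta = apply_transform(META, *BROKEN)
fixed_meta = apply_transform(META, *FIXED)

print(broken_meta)  # contains a stray double quote before the next key
print(fixed_meta)   # quotes stay balanced
```

In the broken result the leftover quote is captured into `$2` and written back in front of the next key, which would explain why fields after `password` are no longer parsed as expected.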