Kubernetes textPayload Split

Hi

Within the Kubernetes Node parser, I am trying to split the textPayload into separate fields. The textPayload field contains long text, and we're trying to split it into key/value pairs, with each key name mapped to its value. An example of a raw log (data nullified):

"textPayload": "time\u003d\"0000-00-00T00:00:00.0000000Z\" type\u003d\"container_app_firewall_audit\" container_name\u003d\"container-name-here\" image_name\u003d\"image-name/here:latest\"

How can I assign, for example, time and type each to its own separate UDM field, OR automatically add an array of key/value pairs? I've configured the extension parser as shown below; however, it's not splitting the values by spaces.

 

filter{
    mutate {
        replace => {
            "textPayload" => ""
        }
    }

if [textPayload] != "" 
{

  mutate {
    split => {
      source => "textPayload"
      separator => " "
      target => "textPayload_array"
    }
  }
  mutate {
    merge => {
      "event.idm.read_only_udm.target.description" => "textPayload_array"
    }
  }
  mutate {
  merge => {
    "@output" => "event"
 }
}
}
statedump {
  label => "foo"
}
}

 

 


Hi @ad9001 

There is no target.description field in the UDM schema; you may use metadata.description instead.

The split function should give you its output in textPayload_array. Since no keys are given, consider accessing the values using fields like textPayload_array.0, textPayload_array.1, and so on.
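
Because the split entries carry no key names, one alternative is to capture specific values by name with grok rather than by position. The following is a minimal sketch, assuming the payload always starts with time and type in that order, as in the sample log. The names tp_time, tp_type, and textpayload_grok_failed are placeholders, and metadata.product_event_type is only an illustrative destination, not something confirmed in this thread:

```
# Capture the first two key="value" pairs by name (assumes this fixed order).
grok {
  match => {
    "textPayload" => "^time=\"%{DATA:tp_time}\" type=\"%{DATA:tp_type}\""
  }
  overwrite => ["tp_time", "tp_type"]
  on_error => "textpayload_grok_failed"
}

# Map the captured value only when the pattern matched.
if ![textpayload_grok_failed] {
  mutate {
    replace => {
      "event.idm.read_only_udm.metadata.product_event_type" => "%{tp_type}"
    }
  }
}
```

This keeps each value as a plain string, which is what a non-repeated UDM field expects. Whether the escaped quotes in the pattern need different escaping in your environment is worth confirming with a statedump.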

Hi @s_shubh 

Thank you for the suggestion. I tried adding the path but had no luck. I also tried adding a json filter with the split_columns array function, but this time I get an error message:

generic::unknown: pipeline.ParseLogEntry failed: LOG_PARSING_CBN_ERROR: "generic::invalid_argument: failed to convert raw output to events: failed to convert raw message 0: field \"idm\": index 0: recursive rawDataToProto failed: field \"read_only_udm\": index 0: recursive rawDataToProto failed: field \"metadata\": index 0: recursive rawDataToProto failed: panic encountered: non-string given for backstory.Metadata.description: []interface {} []interface {}{\"type=\\\"container_app_firewall_audit\"}"

 

CBN Snippet:

filter{
    mutate {
        replace => {
            "textPayload" => ""
        }
    }
    json {
        source => "message"
        array_function => "split_columns"
        on_error => "_not_json"
    }

if [textPayload] != "" 
{

  mutate {
    split => {
      source => "textPayload"
      separator => "\" "
      target => "textPayload_array"
    }
  }a
  mutate {
    merge => {
      "event.idm.read_only_udm.metadata.description" => "textPayload_array.1"
    }
  }
  mutate {
  merge => {
    "@output" => "event"
 }
}
}
statedump {
  label => "foo"
}
}

 Based on the statedump, I do see the following:

 "textPayload_array": {
    "0": "time=\"0000-00-00T00:00:00.0000Z",
    "1": "type=\"container_app_firewall_audit",
    "10": "source_ip=\"0.0.0.0",
    "11": "request_method=\"GET",
    "12": "request_user_agents=\"Go-http-client/1.1",
    "13": "request_host=\"0.0.0.0:0000",

In the first mutate block under the if condition, I noticed a stray letter "a". If it's not just a typo in the post, could you please remove it and run the parser again?

Also, try using replace instead of merge for description in the mutate.

 

 

Hi @s_shubh 

Using replace instead of merge no longer produces the error message; however, the output only shows the field name rather than its value.

UDM Output:

 

metadata.description: "textPayload_array.1"
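
The literal output above is consistent with replace treating its right-hand side as a string constant, whereas merge takes a field name. A minimal sketch of the same mapping with the value interpolated through the %{...} token syntax is shown here; the on_error flag name description_missing is a placeholder, and the index-style path textPayload_array.1 is assumed to resolve the same way it did in the merge attempt:

```
# replace expects a string; %{...} substitutes the field's value into it.
mutate {
  replace => {
    "event.idm.read_only_udm.metadata.description" => "%{textPayload_array.1}"
  }
  # Set a flag instead of failing if the element does not exist.
  on_error => "description_missing"
}
```

With this form, metadata.description would receive a single string, which avoids the "non-string given" error that the merge produced.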

 



Is there a better way to split the text and add all the array elements into their own fields, or into a field of arrays? I've included more info which may help: the UDM Output Error, Statedump, and Raw log (sanitized).

UDM Output Error:

 

generic::unknown: pipeline.ParseLogEntry failed: LOG_PARSING_CBN_ERROR: "generic::invalid_argument: failed to convert raw output to events: failed to convert raw message 0: field \"idm\": index 0: recursive rawDataToProto failed: field \"read_only_udm\": index 0: recursive rawDataToProto failed: field \"metadata\": index 0: recursive rawDataToProto failed: panic encountered: non-string given for backstory.Metadata.description: []interface {} []interface {}{\"type=\\\"container_app_firewall_audit\"}"

 

 State-dump (sanitized):

 

Internal State (label=foo):

{
  "@createTimestamp": {
    "nanos": 0,
    "seconds": 1715722907
  },
  "@enableCbnForLoop": true,
  "@onErrorCount": 0,
  "@output": [
    {
      "idm": {
        "read_only_udm": {
          "metadata": {
            "description": [
              "type=\"container_app_firewall_audit"
            ]
          }
        }
      }
    }
  ],
  "@timezone": "",
  "_not_json": false,
  "event": {
    "idm": {
      "read_only_udm": {
        "metadata": {
          "description": [
            "type=\"container_app_firewall_audit"
          ]
        }
      }
    }
  },
  "insertId": "id-here",
  "labels": {
    "compute": {
      "googleapis": {
        "com/resource_name": "resource-name-here"
      }
    },
    "k8s-pod/app": "app-here",
    "k8s-pod/controller-revision-hash": "hash-here",
    "k8s-pod/pod-template-generation": "1"
  },
  "logName": "log-name",
  "message": "{\n  \"textPayload\": \"time\\u003d\\\"2024-05-04Z\\\" type\\u003d\\\"container_app_firewall_audit\\\" container_id\\u003d\\\"container-id-here\\\" container_name\\u003d\\\"container-name-here\\\" image_name\\u003d\\\"image-name-here\\\" hostname\\u003d\\\"hostname-here\\\" effect\\u003d\\\"prevent\\\" msg\\u003d\\\"Detected Code Injection attack in request body parameter \\\\\\\"bsh.script\\\\\\\", match exec(\\\\\\\"ipconfig\\\\\\\"), value exec(\\\\\\\"ipconfig\\\\\\\"), injection language: php\\\" log_type\\u003d\\\"codeInjection\\\" source_ip\\u003d\\\"0.0.0.0\\\" source_country\\u003d\\\"country\\\" connecting_ips\\u003d\\\"0.0.0.0\\\" request_method\\u003d\\\"POST\\\" request_user_agents\\u003d\\\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.2.7 (KHTML, like Gecko) Version/9.1.2 Safari/601.2.7\\\" request_host\\u003d\\\"request-host-here.com\\\" request_url\\u003d\\\"request.url.here.com\\\" request_path\\u003d\\\"/path/is/here\\\" request_header_names\\u003d\\\"Accept-Encoding,Content-Length,Content-Type,User-Agent,Via,X-Cloud-Trace-Context,X-app-info-here-Parent-Id,X-app-info-here-Sampling-Priority,X-app-info-here-Trace-Id,X-Envoy-Attempt-Count,X-Envoy-External-Address,X-Envoy-Original-Path,X-Forwarded-For,X-Forwarded-Port,X-Forwarded-Proto,X-Request-Id\\\" cluster\\u003d\\\"cluster-name-here\\\" attack_techniques\\u003d\\\"exploit-here\\\" rule_name\\u003d\\\"rule-name\\\" rule_app_id\\u003d\\\"name-of-rule-app\\\" protection\\u003d\\\"firewall\\\" attack_field_type\\u003d\\\"formBody\\\" attack_field_key\\u003d\\\"bsh.script\\\" attack_field_value\\u003d\\\"exec(\\\"ipconfig\\\")\\\" event_id\\u003d\\\"event-id\\\"\",\n  \"insertId\": \"id-is-here\",\n  \"resource\": {\n    \"type\": \"k8s_container\",\n    \"labels\": {\n      \"container_name\": \"container name\",\n      \"project_id\": \"project_id\",\n      \"namespace_name\": \"namespacehere\",\n      \"pod_name\": \"pod-name-here\",\n      \"cluster_name\": 
\"cluster-name-here\",\n      \"location\": \"location-here\"\n    }\n  },\n  \"timestamp\": \"2024-04-04\",\n  \"severity\": \"INFO\",\n  \"labels\": {\n    \"api-here/resource_name\": \"resource-name\",\n    \"k8s-pod/controller-revision-hash\": \"controller-here\",\n    \"k8s-pod/pod-here\": \"1\",\n    \"k8s-pod/app\": \"podapp\"\n  },\n  \"logName\": \"projectshere\",\n  \"receiveTimestamp\": \"2024-05-04\"\n}",
  "receiveTimestamp": "2024-05-04Z",
  "resource": {
    "labels": {
      "cluster_name": "cluster_name",
      "container_name": "conatiner-name",
      "location": "thelocation",
      "namespace_name": "namespacehere",
      "pod_name": "podnamehere",
      "project_id": "projectidt"
    },
    "type": "k8s_container"
  },
  "severity": "INFO",
  "textPayload": "time=\"2024-05-04TZ\" type=\"container_app_firewall_audit\" container_id=\"idhere\" container_name=\"namehere\" image_name=\"uimagename\" hostname=\"hostnamehere\" effect=\"prevent\" msg=\"Detected Code Injection attack in request body parameter \\\"bsh.script\\\", match exec(\\\"ipconfig\\\"), value exec(\\\"ipconfig\\\"), injection language: php\" log_type=\"codeInjection\" source_ip=\"0.0.0.0\" source_country=\"countryhere\" connecting_ips=\"0.0.0.0\" request_method=\"POST\" request_user_agents=\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.2.7 (KHTML, like Gecko) Version/9.1.2 Safari/601.2.7\" request_host=\"hosturlhere\" request_url=\"requesturlhere.com/path/here\" request_path=\"/path/here\" request_header_names=\"Accept-Encoding,Content-Length,Content-Type,User-Agent,Via,X-Cloud-Trace-Context,X-appnamehere-Parent-Id,X-appnamehere-Sampling-Priority,X-appnamehere-Trace-Id,X-Envoy-Attempt-Count,X-Envoy-External-Address,X-Envoy-Original-Path,X-Forwarded-For,X-Forwarded-Port,X-Forwarded-Proto,X-Request-Id\" cluster=\"clusternamehere\" attack_techniques=\"exploit\" rule_name=\"rulenamehere\" rule_app_id=\"appnamehere\" protection=\"firewall\" attack_field_type=\"formBody\" attack_field_key=\"bsh.script\" attack_field_value=\"exec(\"ipconfig\")\" event_id=\"event-id8\"",
  "textPayload_array": {
    "0": "time=\"2024-05-04",
    "1": "type=\"container_app_firewall_audit",
    "10": "source_country=\"country",
    "11": "connecting_ips=\"0.0.0.0,1.2.3.4",
    "12": "request_method=\"POST",
    "13": "request_user_agents=\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.2.7 (KHTML, like Gecko) Version/9.1.2 Safari/601.2.7",
    "14": "request_host=\"request.host.here.com",
    "15": "request_url=\"request.url.here.com/path/here",
    "16": "request_path=\"/path/here",
    "17": "request_header_names=\"Accept-Encoding,Content-Length,Content-Type,User-Agent,Via,X-Cloud-Trace-Context,X-app-here-Parent-Id,X-app-here-Sampling-Priority,X-app-here-Trace-Id,X-Envoy-Attempt-Count,X-Envoy-External-Address,X-Envoy-Original-Path,X-Forwarded-For,X-Forwarded-Port,X-Forwarded-Proto,X-Request-Id",
    "18": "cluster=\"cluster-namehere",
    "19": "attack_techniques=\"exploit-here",
    "2": "container_id=\"container-id-here-123456789",
    "20": "rule_name=\"rule-name-here",
    "21": "rule_app_id=\"app-id-here",
    "22": "protection=\"firewall",
    "23": "attack_field_type=\"formBody",
    "24": "attack_field_key=\"bsh.script",
    "25": "attack_field_value=\"exec(\"ipconfig\")",
    "26": "event_id=\"123456-event-id-here\"",
    "3": "container_name=\"container-name",
    "4": "image_name=\"imagename-here",
    "5": "hostname=\"hostname-here",
    "6": "effect=\"prevent",
    "7": "msg=\"Detected Code Injection attack in request body parameter \\\"bsh.script\\\", match exec(\\\"ipconfig\\\"), value exec(\\\"ipconfig\\\"), injection language: php",
    "8": "log_type=\"codeInjection",
    "9": "source_ip=\"0.0.0.0"
  },
  "timestamp": "2024-05-04"
}

 

Raw Log (Sanitized):

 

{
  "textPayload": "time\u003d\"2024-05-04TT00:00:00.00000Z\" type\u003d\"container_app_firewall_audit\" container_id\u003d\"container_idhere\" container_name\u003d\"containernamehere\" image_name\u003d\"imagenamehere\" hostname\u003d\"hostname\" effect\u003d\"prevent\" msg\u003d\"Detected Code Injection attack in request body parameter \\\"bsh.script\\\", match exec(\\\"ipconfig\\\"), value exec(\\\"ipconfig\\\"), injection language: php\" log_type\u003d\"codeInjection\" source_ip\u003d\"0.0.0.0\" source_country\u003d\"country\" connecting_ips\u003d\"0.0.0.0\" request_method\u003d\"POST\" request_user_agents\u003d\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.2.7 (KHTML, like Gecko) Version/9.1.2 Safari/601.2.7\" request_host\u003d\"requiresturlhere.com\" request_url\u003d\"requesthrul.com/path/here\" request_path\u003d\"/path/here\" request_header_names\u003d\"Accept-Encoding,Content-Length,Content-Type,User-Agent,Via,X-Cloud-Trace-Context,X-app-name-here-Parent-Id,X-app-name-here-Sampling-Priority,X-app-name-here-Trace-Id,X-Envoy-Attempt-Count,X-Envoy-External-Address,X-Envoy-Original-Path,X-Forwarded-For,X-Forwarded-Port,X-Forwarded-Proto,X-Request-Id\" cluster\u003d\"projectnamehere\" attack_techniques\u003d\"exploit\" rule_name\u003d\"rulename\" rule_app_id\u003d\"app-name\" protection\u003d\"firewall\" attack_field_type\u003d\"formBody\" attack_field_key\u003d\"scriptt\" attack_field_value\u003d\"exec(\"ipconfig\")\" event_id\u003d\"eventidhere\"",
  "insertId": "id_here",
  "resource": {
    "type": "k8s_container",
    "labels": {
      "container_name": "twistlock-defender",
      "project_id": "projectnamehere",
      "namespace_name": "prismacloud",
      "pod_name": "podnamehere",
      "cluster_name": "clusternamehere",
      "location": "location"
    }
  },
  "timestamp": "2024-05-04TT00:00:00.00000Z",
  "severity": "INFO",
  "labels": {
    "compute.googleapis.com/resource_name": "resourcenamehere",
    "k8s-pod/controller-revision-hash": "hash_id",
    "k8s-pod/pod-template-generation": "1",
    "k8s-pod/app": "podname"
  },
  "logName": "projects/projectnamehere/logs/stdout",
  "receiveTimestamp": "2024-05-04TT00:00:00.00000Z"
}

 

Parser Extension:

 

filter{
    mutate {
        replace => {
            "textPayload" => ""
        }
    }
    json {
        source => "message"
        array_function => "split_columns"
        on_error => "_not_json"
    }

if [textPayload] != "" 
{

  mutate {
    split => {
      source => "textPayload"
      separator => "\" "
      target => "textPayload_array"
    }
  }
  mutate {
    merge => {
      "event.idm.read_only_udm.metadata.description" => "textPayload_array.1"
    }
  }
  mutate {
  merge => {
    "@output" => "event"
 }
}
}
statedump {
  label => "foo"
}
}

 



Hi,

Based on the previous comments, I can see that "textPayload" contains multiple values; after the split, it's a list. If you'd like to map all the contents of the list, it needs to be mapped to a repeated UDM field, and you must use a for loop to do so. I would suggest adding this data to the "additional" UDM field.
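
The for-loop approach could look roughly like the sketch below. It assumes the list is textPayload_array as produced by the split on "\" " shown earlier, so each entry looks like key="value with the closing quote already consumed by the separator. The names kv_item, kv_key, kv_value, kv_no_match, and additional_field are placeholders chosen for illustration:

```
for index, kv_item in textPayload_array {
  # Reset the temporary label so values do not carry over between iterations.
  mutate {
    replace => {
      "additional_field" => ""
    }
  }

  # Split each entry into a key and a value at the first ="
  grok {
    match => {
      "kv_item" => "^%{DATA:kv_key}=\"%{GREEDYDATA:kv_value}$"
    }
    overwrite => ["kv_key", "kv_value"]
    on_error => "kv_no_match"
  }

  if ![kv_no_match] {
    # The last entry keeps its closing quote, so strip a trailing one if present.
    mutate {
      gsub => ["kv_value", "\"$", ""]
    }
    mutate {
      replace => {
        "additional_field.key" => "%{kv_key}"
        "additional_field.value.string_value" => "%{kv_value}"
      }
    }
    # additional.fields is repeated, so merge appends one label per entry.
    mutate {
      merge => {
        "event.idm.read_only_udm.additional.fields" => "additional_field"
      }
    }
  }
}
```

The existing merge of "event" into "@output" would stay after the loop. It is worth checking the result with statedump, since the exact escaping of quotes inside grok and gsub patterns may need adjusting.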