New to Google SecOps: Using Metrics in YARA-L Rules (Part 2)

jstoner · 04-03-2024 08:00 AM

Last time, we introduced the concept of metrics at a high level and looked at the composition of a metric function in the outcome section of a YARA-L rule. We wrapped up with a very light rule that showed how the placeholder variable in the events section is used in the metric function, and how to calculate the maximum byte count over a 30 day period for a specific IP address. If you haven’t read that blog yet, please take a few minutes to do so, because it lays the foundation for this one. I’ll wait.

We are going to continue exploring the additional aggregation options available within the metric functions. Suppose we wanted to understand the following statistical measures for outbound byte count over the 30 day window:

- Average
- Standard Deviation
- Minimum
- Maximum
- Total Sum (for the entire window)
- Number of Periods within the Window with a Non-Zero value
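
To make these measures concrete outside of YARA-L, here is a minimal Python sketch that computes the same six summaries over a list of daily totals. The data is hypothetical, and two details are assumptions because this post does not state them: that days with no traffic count toward the average, and that the standard deviation is the population form. The dictionary keys simply borrow the agg names used in the rule that follows.

```python
import statistics

# Hypothetical daily outbound byte totals, one value per 1d period.
# A real 30 day window would hold up to 30 entries; 0 marks a day with no traffic.
daily_sums = [1200, 0, 3400, 800, 0, 5600, 2500]


def window_stats(values):
    """Summarize per-period sums, one entry per aggregation in the list above."""
    return {
        "avg": statistics.mean(values),       # assumption: zero days are included
        "stddev": statistics.pstdev(values),  # assumption: population std dev
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
        # Periods within the window that carry a non-zero value
        "num_metric_periods": sum(1 for v in values if v != 0),
    }


stats = window_stats(daily_sums)
for name, value in stats.items():
    print(f"{name}: {value}")
```

The point of the sketch is only that every measure is derived from the same set of per-day sums; the metric function does that bookkeeping for us across the full window.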

Let’s build on the rule we used in our previous example and see how easy it is to capture these additional statistical measures. Our events, match and condition sections in our YARA-L rule haven’t changed. The only modification is that we added metric functions in the outcome section for the aggregations in the bulleted list above.

rule metric_examples_network {

 meta:
   author = "Google Cloud Security"

 events:
   $net.metadata.event_type = "NETWORK_CONNECTION"
   $net.network.sent_bytes > 0
   $net.principal.ip = $ip
   $net.principal.ip = "10.128.0.21"

 match:
   $ip over 1d

 outcome:
   $max_byte_count_window = max(metrics.network_bytes_outbound(
       period:1d, window:30d,
       metric:value_sum,
       agg:max,
       principal.asset.ip: $ip
   ))
   $min_byte_count_window = max(metrics.network_bytes_outbound(
       period:1d, window:30d,
       metric:value_sum,
       agg:min,
       principal.asset.ip: $ip
   ))
   $avg_byte_count = max(metrics.network_bytes_outbound(
       period:1d, window:30d,
       metric:value_sum,
       agg:avg,
       principal.asset.ip: $ip
   ))
   $stddev_byte_count = max(metrics.network_bytes_outbound(
       period:1d, window:30d,
       metric:value_sum,
       agg:stddev,
       principal.asset.ip: $ip
   ))
   $sum_byte_count_window = max(metrics.network_bytes_outbound(
       period:1d, window:30d,
       metric:value_sum,
       agg:sum,
       principal.asset.ip: $ip
   ))
   $byte_count_days_seen = max(metrics.network_bytes_outbound(
       period:1d, window:30d,
       metric:value_sum,
       agg:num_metric_periods,
       principal.asset.ip: $ip
   ))

 condition:
   $net
}

When we test our rule we can see the metrics for each day for our IP address. Notice that each day, the 30 day window is continually rolling forward, which is why we see different values for average, standard deviation and the sum.
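
The rolling behavior is easier to see with a tiny worked example. The Python sketch below uses hypothetical daily totals and a 3 day window as a stand-in for window:30d. Whether the real window includes the current day is not something this post specifies, so the trailing-window choice here is an assumption.

```python
from statistics import mean

# Hypothetical daily outbound byte totals, oldest first.
daily_sums = [100, 200, 300, 400, 500, 600]
WINDOW = 3  # stand-in for window:30d, shortened to keep the example readable


def rolling_windows(values, size):
    """Yield (index of last day, window contents) for each full trailing window."""
    for end in range(size, len(values) + 1):
        yield end - 1, values[end - size:end]


rows = [
    (day, sum(window), mean(window))
    for day, window in rolling_windows(daily_sums, WINDOW)
]
for day, total, average in rows:
    print(f"day {day}: sum={total} avg={average}")
```

Each step forward drops the oldest day and picks up a new one, so the sum and average shift from row to row even though the underlying data has not changed.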

While we’ve done a good deal of data calculation, we haven’t really detected anything. Let’s change that now. We are going to use metrics to build an outlier detection based on outbound network bytes.

I’ll be the first one to admit that my lab environment and its data set have a wide variance in outbound bytes, which results in a very high standard deviation. For the purpose of this example, we’re going to bypass that metric and instead detect when the daily sum of sent bytes is greater than five times the average of the 30 day window. In our example above, our maximum bytes sent per day was 28+ GB and our daily average was around 2.65 GB, so basically we want to identify any day when outbound traffic exceeded roughly 13 GB, a little under half of the 30 day maximum.

For demonstration purposes, we are going to continue to isolate on a single IP address.

rule metric_examples_network {

 meta:
   author = "Google Cloud Security"

 events:
   $net.metadata.event_type = "NETWORK_CONNECTION"
   $net.network.sent_bytes > 0
   $net.principal.ip = $ip
   // Isolate a single IP address for demonstration purposes
   $net.principal.ip = "10.128.0.21"

 match:
   $ip over 1d

 outcome:
   // Largest daily sum of outbound bytes in the 30 day window
   $max_byte_count_window = max(metrics.network_bytes_outbound(
       period:1d, window:30d,
       metric:value_sum,
       agg:max,
       principal.asset.ip: $ip
   ))
   // Average daily sum of outbound bytes in the 30 day window
   $avg_byte_count = max(metrics.network_bytes_outbound(
       period:1d, window:30d,
       metric:value_sum,
       agg:avg,
       principal.asset.ip: $ip
   ))
   // Detection threshold: five times the 30 day average
   $avg_byte_count_x5 = $avg_byte_count * 5
   // Bytes sent by this IP within the one day match window
   $daily_byte_count = sum($net.network.sent_bytes)

 condition:
   $net and $daily_byte_count > $avg_byte_count_x5
}

We are using the metric function to calculate the average and maximum. The two additional outcome variables are $avg_byte_count_x5, which is five times the average byte count calculated by our metric, and $daily_byte_count, which is the sum of all network.sent_bytes values within the match window of one day. This last outcome variable is the same kind of aggregation function we’ve been using in outcome variables all along, so that piece is nothing new.

Once we have these values, we are going to add an additional condition to our, well, condition section. This condition is that we want the outcome variable $daily_byte_count to be greater than our outcome variable of $avg_byte_count_x5.
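
Stripped of the YARA-L syntax, the comparison is simple arithmetic. The Python sketch below mirrors it under stated assumptions: is_outlier is a hypothetical helper, and both the per-event byte values and the average are made-up numbers rather than values from the lab data.

```python
def is_outlier(sent_bytes_today, avg_byte_count, multiplier=5):
    """Return (daily_total, threshold, flagged) for one host on one day."""
    daily_byte_count = sum(sent_bytes_today)  # mirrors sum($net.network.sent_bytes)
    threshold = avg_byte_count * multiplier   # mirrors $avg_byte_count * 5
    return daily_byte_count, threshold, daily_byte_count > threshold


# Hypothetical per-event sent_bytes values for one day, plus a hypothetical 30 day average.
events = [4_000, 2_500, 9_000]
total, threshold, flagged = is_outlier(events, avg_byte_count=3_000)
print(total, threshold, flagged)
```

Keeping the multiplier as a parameter makes it easy to see how the number of flagged days would change if we tuned the threshold up or down.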

When we test our rule, one day, March 6, matches this condition. The values are output in our detection, but large numbers are easier to read with thousands separators, so I’ve added commas here for reference.

Daily Byte Count: 26,963,260,689
5x Average Byte Count: 13,262,014,413 (rounded to the nearest byte)
Average 30 Day Byte Count: 2,652,402,883 (rounded to the nearest byte)
Max Byte Count for 30 Day Period: 28,461,718,812

Here we can see that on March 6, the daily outbound bytes (sent) were nearly 27 GB, which is just under the 30 day maximum of roughly 28.5 GB and about ten times the 30 day average of 2.65 GB, well past our 5x threshold. Based on this, we might want to look into why this specific host is sending so much data outbound.
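
As a quick sanity check on those figures, the short Python sketch below recomputes the ratios from the values reported in the detection. The variable names are mine, and the small tolerance on the 5x comparison allows for the rounding to the nearest byte noted above.

```python
# Values as reported in the detection
daily = 26_963_260_689      # Daily Byte Count
threshold = 13_262_014_413  # 5x Average Byte Count (rounded)
avg = 2_652_402_883         # Average 30 Day Byte Count (rounded)
maximum = 28_461_718_812    # Max Byte Count for 30 Day Period

ratio_to_avg = daily / avg                   # how many times the average
ratio_to_max = daily / maximum               # share of the 30 day maximum
threshold_share_of_max = threshold / maximum

# The rounded average times five should land within a few bytes of the threshold.
rounding_gap = abs(avg * 5 - threshold)

print(f"daily / average: {ratio_to_avg:.2f}")
print(f"daily / maximum: {ratio_to_max:.2%}")
print(f"threshold / maximum: {threshold_share_of_max:.2%}")
print(f"rule fires: {daily > threshold}")
```

Running it shows the day sitting at roughly ten times the average and close to the window maximum, while the threshold itself is a bit under half of that maximum.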

Today we’ve broadened our use of the metrics capability by introducing additional aggregations to our metric function. We’ve also taken UDM fields and compared them to the 30 day aggregations to make decisions on which days our outbound network traffic was anomalous. Keep in mind that while we have been focused on just the outbound network byte metric, there are also inbound and total network byte metrics to work with and that doesn’t even address all of the other metrics available to us for DNS, authentication, file execution and more.

In our next blog on metrics, we will dig deeper into another portion of the metric function, one that lets us do more than tally sums of bytes!