The OWASP CRS Sampling Mode


The OWASP ModSecurity Core Rule Set Sampling Mode has been around for several years, but it’s rarely used, probably because it’s not advertised enough. Let’s change that!

When you deploy ModSecurity and CRS for the first time on an existing but poorly defended service, you do not really know what is going to happen. A well-developed test environment can help, but rare is the installation where you can reproduce real-world traffic 1:1 in the lab.

So it’s like a jump into murky water: Potentially very painful.

That’s bad and that’s why CRS3 introduced the Sampling Mode.

With the Sampling Mode you can run CRS on a limited percentage of the traffic. The remainder of the traffic bypasses the rule set. So if ModSecurity is out to kill you or if the rules are too heavy for your server, then with a sampling rate of 1% only about 1% of the traffic is affected. That is unlikely to hurt very much, especially when you keep an eye on the logs and roll back the deployment as soon as the alerts pile up.

Obviously, a low sampling rate provides little security, but the idea is to ramp up from 1% to 2, 5, 10, 20, 50 and ultimately 100%, where the rules are applied to all traffic. Needless to say, the default sampling rate of CRS is 100%, so you do not need to worry unless you really want to use sampling.

Applying the Sampling Mode

The Sampling Mode, or the sampling rate to be exact, is defined in the crs-setup.conf configuration file. Look for the rule with ID 900400.

Uncomment this rule and set the variable tx.sampling_percentage to the desired value. For an initial test in the lab, 50 is a useful value.

Then reload your server and do a few requests, ideally with a payload resembling an exploit. Here is what I usually run:

$ curl -v "http://localhost/index.html?test=/etc/passwd"

I actually have an alias “curlattack” that runs this for me.

If the rule set is applied (and the anomaly threshold is 10 or lower), you will get a 403 Forbidden, since the request triggers two critical rules by default. If the Sampling Mode kicks in and the rule set is bypassed, then you will get a proper response with a status 200.

If you look at the error log for the latter case, you will find an alert like the following one:

[2021-05-20 09:17:29.861751] [-:error] 127.0.0.1:34804 YKYNCTvxCfzoId8rh4WdtAAAAAE [client 127.0.0.1] ModSecurity: Warning. Match of "lt %{tx.sampling_percentage}" against "TX:sampling_rnd100" required. [file "/home/dune73/data/git/crs-official/rules/REQUEST-901-INITIALIZATION.conf"] [line "443"] [id "901450"] [msg "Sampling: Disable the rule engine based on sampling_percentage 50 and random number 78"] [ver "OWASP_CRS/3.3.0"] [hostname "localhost"] [uri "/index.html"] [unique_id "YKYNCTvxCfzoId8rh4WdtAAAAAE"]

So CRS reports that it disables the rule engine because the random number was not below the sampling percentage. It’s simple: You set the sampling percentage, the rule set generates a random integer in the 0-99 range, and if that number is equal to or above the percentage, the WAF is disabled for the remainder of the request. So if you have other rule sets installed, keep in mind that they will be bypassed too.
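The decision can be sketched in a few lines of Python. This is an illustration only, not CRS code: the function names are made up for this example, and Python’s random module stands in for the random number that CRS derives on its own. The comparison follows the "lt" operator shown in the log entry above.

```python
import random


def rule_engine_enabled(sampling_percentage: int, rnd100: int) -> bool:
    """Return True if the rule set is applied to this request.

    The engine stays on only while the random number (0-99) is lower
    than the sampling percentage; otherwise the request is bypassed.
    """
    return rnd100 < sampling_percentage


def simulate(sampling_percentage: int, requests: int = 100_000, seed: int = 1) -> float:
    """Estimate the share of requests that pass through the rule set."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    inspected = sum(
        rule_engine_enabled(sampling_percentage, rng.randrange(100))
        for _ in range(requests)
    )
    return inspected / requests
```

With the values from the log entry, rule_engine_enabled(50, 78) returns False, so the request is bypassed. A percentage of 100 never bypasses a request, because the random number never exceeds 99.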

For requests where the rule set is bypassed, you get a log entry from 901450. For the other requests, those without a 901450 entry, the rule set is applied normally.
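If you want to count the bypassed requests, a small script can pick the 901450 entries out of the error log. The following Python sketch assumes the message format of the log entry shown above; the function names are hypothetical and the pattern may need adjusting for other log formats.

```python
import re

# Pattern based on the msg text of rule 901450 in the log entry above.
SAMPLING_MSG = re.compile(
    r'\[id "901450"\].*?sampling_percentage (\d+) and random number (\d+)'
)


def parse_sampling_entry(line: str):
    """Return (sampling_percentage, random_number), or None for other lines."""
    match = SAMPLING_MSG.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def count_bypassed(lines) -> int:
    """Count the log lines that report a bypassed request."""
    return sum(1 for line in lines if parse_sampling_entry(line) is not None)
```

Pass count_bypassed() any iterable of log lines, for example an open file handle of your error log.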

If you think 901450 is spamming you or you think this is already too heavy on your logs, then you can apply the following directive after the CRS include statement or in RESPONSE-999-EXCLUSION-RULES-AFTER-CRS.conf in order to silence the rule.

SecRuleUpdateActionById 901450 "nolog"

Rollback

What if you deploy CRS with the Sampling Mode on a large service, define a low sampling rate, but then the logs pile up and you want to disable CRS completely? You can roll back the deployment, but there is an even quicker method: Set tx.sampling_percentage to 0 and reload the web server configuration. Every request will then skip the WAF. ModSecurity and CRS remain installed and ready to fly, but they are disabled completely.

That’s all there is to know to run and apply the Sampling Mode.

I do not know how many people are actually using sampling from time to time. If you do, then I’d be very happy to get a note, and maybe some feedback. Drop me a mail or find me on Twitter at @ChrFolini, DMs are open.

If you are curious how this is done internally, then read on.

How to Get the Random Numbers to Do the Sampling?

This would be easy if ModSecurity gave us random numbers, perhaps in a variable $RANDOM like the bash shell does. Unfortunately, that is wishful thinking, so CRS has to fish for entropy itself. When I created the Sampling Mode functionality, I thought about this problem for a long time. Then it hit me: The UNIQUE_ID that identifies a request with an absolutely unique primary key has a random element in it! This is the entropy we need. So rule 901410 hashes the unique ID and encodes the result in hex. Then we take the first two decimal digits out of this hex string to get a random number from 0 to 99. In the extremely rare case where the hex-encoded hash does not contain a digit, there is a fallback routine that takes the last digits of the DURATION variable to compensate. But maybe we can just skip that step since it’s so incredibly rare. That discussion is currently going on.

Admittedly, this is not cryptography-level randomness. But it’s good enough for our sampling needs.
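To make the idea more tangible, here is a rough Python sketch of the derivation. It is not the CRS implementation: SHA-1 as the hash function, the way the DURATION digits are appended, and the function and parameter names are all assumptions made for this illustration.

```python
import hashlib
import re


def sampling_rnd100(unique_id: str, duration: int) -> int:
    """Derive a number in the range 0-99 from a request's unique ID."""
    # Hash the unique ID and hex-encode the digest (SHA-1 is an assumption).
    hex_digest = hashlib.sha1(unique_id.encode("utf-8")).hexdigest()
    # Fallback material: append the last two digits of the duration so that
    # two decimal digits exist even if the digest holds too few of them.
    candidate = hex_digest + f"{duration % 100:02d}"
    # Drop the hex letters a-f and use the first two remaining digits.
    digits = re.sub(r"[^0-9]", "", candidate)
    return int(digits[:2])
```

Calling sampling_rnd100("YKYNCTvxCfzoId8rh4WdtAAAAAE", 1234) always yields the same value between 0 and 99 for that ID. The value will not necessarily match the one CRS computes for the same request, since the details above are guesses.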