Add Cyrillic characters to unicode.mapping
When ModSecurity protects sites written in a non-English language, a unicode mapping is required to translate characters to their ASCII (Latin) equivalents. The mapping is used in several places.
Most commonly (at least in my scenario) it breaks down when strings are decoded and evaluated for SQL injection and similar nastiness.
If the mapping is incomplete (as it currently is), the input is decoded to garbage, which triggers an SQL injection alert.
Sample:
name1=%D0%B4%D0%B8%D0%BC%D0%B8%D1%82%D1%80%D0%BE%D0%B2 is decoded as name1: \\\\\\\\x135>@3852\ which, strangely enough, matches the 1ov fingerprint in libinjection (libinjection itself is not the point of discussion here).
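For reference, the sample value is ordinary UTF-8 encoded Cyrillic text. The short Python sketch below is not part of ModSecurity (the function name `cyrillic_code_points` is made up for illustration). It percent-decodes the value and checks that every character falls inside the 0x0410 - 0x044f range this pull request covers.

```python
from urllib.parse import unquote


def cyrillic_code_points(encoded):
    """Percent-decode a UTF-8 value and return (character, code point) pairs."""
    text = unquote(encoded, encoding="utf-8")
    return [(ch, ord(ch)) for ch in text]


# The sample value from this description.
sample = "%D0%B4%D0%B8%D0%BC%D0%B8%D1%82%D1%80%D0%BE%D0%B2"

pairs = cyrillic_code_points(sample)
decoded = "".join(ch for ch, _ in pairs)

# True when every character lies in the range covered by this pull request.
in_range = all(0x0410 <= cp <= 0x044F for _, cp in pairs)

for ch, cp in pairs:
    print(f"{ch} U+{cp:04X}")
print(decoded, in_range)
```

The input is therefore a harmless name, and the alert comes purely from how the unmapped code points are translated.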
The issue is exactly the same as:
- https://github.com/SpiderLabs/owasp-modsecurity-crs/issues/794
- https://github.com/SpiderLabs/ModSecurity/issues/348
- https://github.com/SpiderLabs/ModSecurity/issues/1601
This pull request adds mappings for some Cyrillic characters (at least enough for my use case), specifically the unicode range 0x0410 - 0x044f. I tried to follow the transliteration rules as closely as possible, but because some glyphs transliterate to multiple characters (e.g. Щ == SHT), some judgment calls had to be made.
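To illustrate why multi-character transliterations force a decision, here is a rough Python sketch. It assumes that unicode.mapping stores hexadecimal `codepoint:byte` pairs, so each code point can map to only one byte. The table `EXAMPLE_TRANSLIT` and the helper `mapping_entries` are hypothetical and are not the mapping added by this pull request.

```python
# Hypothetical single-character transliterations for illustration only;
# this is NOT the table added by this pull request.
EXAMPLE_TRANSLIT = {"А": "A", "Б": "B", "В": "V", "Г": "G", "Д": "D"}


def mapping_entries(table):
    """Format a transliteration table as 'codepoint:byte' hex pairs.

    Assumes the mapping file allows exactly one ASCII byte per code point.
    """
    entries = []
    for ch, latin in sorted(table.items()):
        if len(latin) != 1 or ord(latin) > 0x7F:
            # A glyph such as Щ (SHT) cannot be expressed as one byte.
            raise ValueError(f"{ch!r} needs a single ASCII character, got {latin!r}")
        entries.append(f"{ord(ch):04x}:{ord(latin):02x}")
    return entries


print(" ".join(mapping_entries(EXAMPLE_TRANSLIT)))
```

Passing `{"Щ": "SHT"}` to this helper raises `ValueError`, which is the situation where a single replacement character has to be picked by hand.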
This has been tested and is currently being used in production.
Versions:
- ModSecurity: 2.9.2 (Ubuntu 18.04 package)
- CRS: 3.0.2