Broken unquoted regular expressions

Open • cnrat opened this issue 8 months ago • 1 comment

Problem Description

The expression parser in mitmweb incorrectly interprets certain unquoted regular expressions when used with the ~u filter (and possibly others). For example, the filter:


~u get(Info|Routers)

is not treated as a Python-style regular expression, but instead appears to be parsed as a logical expression:


~u get AND (Info OR Routers)

This leads to incorrect matching behavior: URLs such as /getAttachment or /getInfographic may match unexpectedly. The quoted form of the same expression, however, works as expected:


~u "get(Info|Routers)"

This behavior contradicts the documentation, which states that regexes are "Python-style" and that quoting is optional ("can be specified as quoted strings").
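
The difference between the two readings can be sketched with Python's standard `re` module. This is only an illustration and does not call mitmproxy's parser: the helper names are hypothetical, and the logical reading assumes each bare word (get, Info, Routers) is applied as its own unanchored URL regex, as the parse shown above suggests.

```python
import re

# The pattern from the report, compiled as one Python-style regex
# (the behavior of the quoted form).
PATTERN = re.compile(r"get(Info|Routers)")


def matches_as_regex(url: str) -> bool:
    """Single-regex reading: ~u "get(Info|Routers)"."""
    return PATTERN.search(url) is not None


def matches_as_logical(url: str) -> bool:
    """Assumed logical reading: ~u get AND (Info OR Routers)."""
    has_get = re.search("get", url) is not None
    has_info = re.search("Info", url) is not None
    has_routers = re.search("Routers", url) is not None
    return has_get and (has_info or has_routers)


if __name__ == "__main__":
    urls = ["/getInfo", "/getRouters", "/getAttachment?ref=Routers", "/Info/get"]
    for url in urls:
        print(url, matches_as_regex(url), matches_as_logical(url))
```

Under these assumptions the two readings agree on /getInfo and /getRouters but diverge on URLs where "get" and "Info" or "Routers" appear separately, such as /getAttachment?ref=Routers or /Info/get, which only the logical reading accepts.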


Steps to reproduce the behavior

  1. Launch mitmweb.
  2. In the filter bar, enter:

~u get(Info|Routers)

  3. Observe that unrelated URLs such as /getAttachment or /getInfoExtra may match.
  4. Change the filter expression to:

~u "get(Info|Routers)"

  5. Observe that only /getInfo or /getRouters match, as expected.

System Information

  • OS: Windows 11 (also reproducible on Linux/macOS)
  • mitmproxy version: 11.1.3
  • Installation method: pip
  • Interface: mitmweb (web interface)

Expected behavior

Given the official documentation states:

  • Regexes are Python-style

  • Regexes can be specified as quoted strings

…it is reasonable to expect that an unquoted expression like get(Info|Routers) behaves the same as its quoted form and follows Python regex syntax.

Instead, unquoted patterns containing parentheses and | are parsed as logical expressions, which contradicts both the documentation and user expectations.


Additional context

This causes confusion and unreliable filtering in mitmweb. The user must know to quote expressions containing certain regex syntax, despite the documentation implying that quoting is optional.


Suggested resolution

  • Option 1: Improve the parser to treat unquoted regex expressions (like get(Info|Routers)) as single regex patterns instead of logical expressions.
  • Option 2: Clarify in the documentation that if a regex contains grouping () or alternation |, it must be quoted to avoid being misinterpreted as a logical filter expression.

This would reduce confusion and make filter behavior more predictable and consistent with the documentation.

System Information

Mitmproxy: 11.1.3 binary
Python:    3.13.2
OpenSSL:   OpenSSL 3.4.1 11 Feb 2025
Platform:  Windows-11-10.0.22631-SP0

cnrat • May 19 '25 01:05

refs https://github.com/mitmproxy/mitmproxy/pull/7710

mhils • May 19 '25 05:05