Behaviour of pattern (regex) with url encoding

Open atward opened this issue 6 years ago • 1 comments

Q: Is pattern regex checking applied before or after URL encoding (RFC 3986)?

The OpenAPI spec allows pattern on properties, which uses JSON Schema validation. Applied to a JSON data structure, this validation works fine. The spec also provides allowReserved to permit the reserved characters of RFC 3986 section 2.2. The logic becomes confusing, however, when the parameter is in the query and needs to be URL-encoded.

My question is: does pattern need to list all possible raw user inputs (i.e. unreserved and percent-encoded characters)? E.g. is [0-4] considered the same as ([0-4]|%30|%31|%32|%33|%34)?

Example: Take the following spec which provides /search?name=My Name:

paths:
  /search:
    get:
      parameters:
        - name: name
          in: query
          required: true
          schema:
            type: string
            pattern: '^[A-Za-z]+\s[A-Za-z]+$'

Here the spec expects a whitespace-delimited full name. Because the parameter is a GET query parameter, ' ' (space) is encoded as %20, giving GET /search?name=My%20Name. An ECMA regular expression fails to match My%20Name, but matches My Name after URL decoding.
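
To make the mismatch concrete, here is a minimal sketch in Python. It uses Python's re module as a stand-in for an ECMA regex engine (the two agree for this simple pattern, but that equivalence is an assumption, not something the spec states), and the variable names are illustrative only.

```python
import re
from urllib.parse import unquote

# The pattern from the spec fragment above.
PATTERN = re.compile(r'^[A-Za-z]+\s[A-Za-z]+$')

raw = "My%20Name"        # the value as it appears on the wire
decoded = unquote(raw)   # the value after percent-decoding

# '%', '2' and '0' are neither letters nor whitespace, so the raw form fails.
raw_matches = PATTERN.search(raw) is not None
# The decoded form contains a real space, so it passes.
decoded_matches = PATTERN.search(decoded) is not None

print(raw_matches, decoded_matches)
```

So whether the request is accepted depends entirely on which of the two strings the validator is given.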

Can someone please clarify the behaviour? How should the documentation be updated to articulate it?

atward avatar Oct 28 '19 00:10 atward

From a purely JSON Schema perspective, JSON Schema is defined over a data model, meaning it works on the parsed data. So not JSON or YAML text, and not encoded URLs. OpenAPI defines a way to map parameters to JSON Schemas, so I would guess that it is the parsed values to which the JSON Schema applies.

I am not an expert on the OAS parameter rules, though, so take this with a grain of salt until someone more qualified can answer :-)
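
A rough sketch of what that reading would mean on the receiving side, assuming the server parses the query string first and only then applies the schema's pattern to the parsed value. The helper validate_name is hypothetical and not part of any OpenAPI tooling; it relies only on Python's standard urllib.parse, whose parse_qs percent-decodes values (and also treats '+' as a space).

```python
import re
from urllib.parse import urlsplit, parse_qs

def validate_name(url: str, pattern: str) -> bool:
    """Hypothetical helper: parse the query, then test the decoded value."""
    query = urlsplit(url).query
    params = parse_qs(query)          # percent-decodes each value
    values = params.get("name", [])   # missing parameter -> empty list
    # Require exactly one value and check it against the schema pattern.
    return len(values) == 1 and re.search(pattern, values[0]) is not None

NAME_PATTERN = r'^[A-Za-z]+\s[A-Za-z]+$'
print(validate_name("/search?name=My%20Name", NAME_PATTERN))
print(validate_name("/search?name=MyName", NAME_PATTERN))
```

Under this interpretation the pattern author never has to enumerate percent-encoded alternatives, because the pattern only ever sees decoded data.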

handrews avatar Jan 21 '20 01:01 handrews

Schema validation definitely happens prior to Parameter/Encoding-Object-driven serialization. As noted in PR #3840, an API can require some sort of pre-serialization to have more control over the string representation, but schema validation is the first step governed directly by this specification in the serialization process.
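
Seen from the sending side, that ordering could be sketched as follows: validate the plain value against the schema pattern first, then percent-encode it. The function serialize_query_param is a hypothetical illustration, not an API defined by the specification, and it only covers a simple string parameter.

```python
import re
from urllib.parse import quote

def serialize_query_param(name: str, value: str, pattern: str) -> str:
    """Hypothetical helper: schema check first, serialization second."""
    # Step 1: validate the unencoded value against the schema pattern.
    if re.search(pattern, value) is None:
        raise ValueError(f"{value!r} does not match {pattern!r}")
    # Step 2: percent-encode name and value for use in a query string.
    return f"{quote(name, safe='')}={quote(value, safe='')}"

NAME_PATTERN = r'^[A-Za-z]+\s[A-Za-z]+$'
print(serialize_query_param("name", "My Name", NAME_PATTERN))
```

Passing an already encoded value such as My%20Name raises ValueError here, which mirrors the point that the pattern describes the data, not its encoded form.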

handrews avatar May 23 '24 00:05 handrews

PR merged for 3.0.4 and ported to 3.1.1 via PR #3921! This is addressed by the new Appendix B.

handrews avatar Jun 21 '24 14:06 handrews