Plus sign in timezone designator
This PR adds some tests for a timezone with positive UTC offset. It also makes a real code change, by relaxing the ISO datetime regex to allow a space in addition to the + or - already allowed to specify UTC offset.
This resolves an issue where an ISO datetime passed as application/x-www-form-urlencoded has its + replaced by a space (see this SO issue).
That is: 2018-03-03T00:00:00.000000+00:50 sent as a url parameter can be received as 2018-03-03T00:00:00.000000 00:50.
From a quick check, I don't think a space instead of a + is valid ISO8601, so I'd rather not merge this.
The SO issue you link to seems to be related to URL encoding. Arguments should be URL encoded.
No problem, and thanks for checking, I should have discussed this first. Indeed this is a URL encoding issue, and a suggested fix that deviates from the ISO8601 specs. Sometimes it pays to be flexible, for example, Django accepting a minus sign in ISO8601 durations, but this is probably not the right hammer here.
For the record, I ran into this when validating the url search parameter ?start=2021-04-01T00:00:00+02:00, using use_kwargs from webargs.flaskparser.
?start=2021-04-01T00:00:00+02:00
Does this fail validation? It doesn't have a space.
It fails validation only because the + in the url search parameter is replaced by a space before it reaches Marshmallow validation. In the test, I mimicked this behaviour by using parse.unquote_plus.
Here's a simple example (which I probably should have used to open up an issue instead of immediately making a PR):
from datetime import datetime
from flask import Flask
from marshmallow import fields
from webargs.flaskparser import use_kwargs
app = Flask(__name__)
@app.route('/')
@use_kwargs(
{
"start": fields.AwareDateTime(format="iso", required=True),
},
location="query",
)
def hello_world(start: datetime):
return f'Hello World. The time is {start}.'
if __name__ == '__main__':
app.run()
- Navigating to http://localhost:5000/?start=2021-04-01T00:00:00-02:00 succeeds (negative offsets succeed in general)
- Navigating to http://localhost:5000/?start=2021-04-01T00:00:00+02:00 fails (positive offsets fail in general)
If you confirm the issue, should I take it up elsewhere (any hint as to where?), or would you consider a (this) fix in Marshmallow?
Should I open an Issue instead? Or can I otherwise provide more clarity on this problem?
The problem is clear. I understand the use case.
We could argue that serialization should be strict by default. But I think there are other places in the code where it is more permissive by default, so this argument is week.
I'd like to move to standard lib fromisoformat (see #2078). And this will reject datetimes with spaces (I just tested), so accepting them now means we'd be introducing a breaking change in the long run or we'd have to add a fix, which I'd rather avoid.
Ultimately, parameters should be URL-encoded. The user is free to customize the field to fix the encoding so that the fix will still stand when/if we use fromisoformat.
I'm not a regex expert but it should be possible to do the space -> plus replacement in a safe way before feeding the input to the deserialization function.
Good idea to rely on the standard library.
Afaik the parameter is actually correctly url-encoded when it reaches validation, in accordance with RFC 1866 section 8.2.1 subparagraph 1.
The space -> plus replacement you mention is precisely what I did to overcome this issue last year here:
value = value.replace(" ", "+")
This works because there should be no space in a valid ISO 8601 timestamp.
However, the RFC 3339 profile does allow a space to be used instead of the T, as does datetime.fromisoformat().
To avoid replacing the wrong space, we could use the following regex instead:
value = re.sub(r"(:\d{2})\s", r"\1+", value)
Whether this is needed depends on whether Marshmallow uses _iso8601_datetime_re.match(value) or value.fromisoformat() (after https://github.com/marshmallow-code/marshmallow/pull/2078). The former allows a space, but not a plus, as a substitute for T (so the re.sub would be needed), while the latter allows both the space and the plus as a substitute for T (and the more legible .replace() would work, too).
The substitution could happen in marshmallow.utils.from_iso_datetime just before the call to _iso8601_datetime_re.match(value) (currently) or value.fromisoformat() (after https://github.com/marshmallow-code/marshmallow/pull/2078).
Of course, I'd be happy to amend my PR.