sqlx query's output is different when the same query is run from postgresql cli
Bug Description
When running the query using sqlx, its output is different vs when the same query is run using postgersql CLI. Postgresql server, database are the same.
Minimal Reproduction
let out = sqlx::query!(
"
SELECT
date_trunc('day', '2024-05-09 01:00:00+03'::timestamptz) as foo,
date_trunc('day', '2024-05-09 16:22:19.553755+03'::timestamptz) as bar,
(
date_trunc('day', '2024-05-09 01:00:00+03'::timestamptz) =
date_trunc('day', '2024-05-09 16:22:19.553755+03'::timestamptz)
) as eq
"
)
.fetch_one(&state.db)
.await?;
dbg!(out);
Output:
out = Record {
foo: Some(
2024-05-08 0:00:00.0 +00:00:00,
),
bar: Some(
2024-05-09 0:00:00.0 +00:00:00,
),
eq: Some(
false,
),
}
If the same query is run against the same database on the same postgresql server using psql:
foo | bar | eq
------------------------+------------------------+----
2024-05-09 00:00:00+03 | 2024-05-09 00:00:00+03 | t
(1 row)
Interestingly enough, if AT TIME ZONE 'UTC' is applied to both timestamps before truncating, or UTC argument is supplied to date_trunc function, output will be the same between sqlx and psql.
Info
- SQLx version:
0.7.4 - SQLx features enabled:
postgres,time,uuid,runtime-tokio,rust_decimal - Database server and version:
Postgresql 16.2 - Operating system:
Arch linux 6.8.9-zen1-2-zen -
rustc --version:rustc 1.79.0-nightly (ab5bda1aa 2024-04-08)
This is not strictly a bug. The datetrunc function truncates with respect to the current timezone of the connection: https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-TRUNC
When the input value is of type
timestamp with time zone, the truncation is performed with respect to a particular time zone; for example, truncation to day produces a value that is midnight in that zone. By default, truncation is done with respect to the current TimeZone setting, but the optionaltime_zoneargument can be provided to specify a different time zone.
psql doesn't send a TimeZone parameter by default, so the TimeZone of the connection is the same as that of the server, which appears to be UTC+3 in this case.
Meanwhile, SQLx sets the TimeZone for the connection to UTC by default. This is to enforce consistent behavior across varying configurations, especially when database replicas are distributed across time zones. It's also for consistency with other database drivers, namely the MySQL driver, where we're forced to set UTC as the default timezone or else it's impossible to interpret timestamps correctly in the binary format.
Frankly, the default date_trunc() behavior here is rather insane. I'd imagine you were expecting it to truncate relative to the timezone of the timestamp itself. That's what I'd expect, anyway. As a workaround, you can explicitly specify what timezone should be used for "midnight" as a third parameter:
postgres=# SHOW TIME ZONE;
TimeZone
----------
Etc/UTC
(1 row)
postgres=# SELECT date_trunc('day', TIMESTAMP WITH TIME ZONE '2024-05-09 01:00:00+03') as foo;
foo
------------------------
2024-05-08 00:00:00+00
(1 row)
postgres=# SELECT date_trunc('day', TIMESTAMP WITH TIME ZONE '2024-05-09 01:00:00+03', 'Europe/Helsinki') as foo;
foo
------------------------
2024-05-08 21:00:00+00
(1 row)
postgres=# SELECT date_trunc('day', TIMESTAMP WITH TIME ZONE '2024-05-09 01:00:00+03', 'EEST') as foo;
foo
------------------------
2024-05-08 21:00:00+00
(1 row)
(2024-05-08 21:00:00+00 should be semantically equivalent to 2024-05-09 00:00:00+03.)
Note that trying to pass a numerical offset or a generic timezone name like UTC+3 or +03:00 gives weird results:
postgres=# SELECT date_trunc('day', TIMESTAMP WITH TIME ZONE '2024-05-09 01:00:00+03', 'UTC+3') as foo;
foo
------------------------
2024-05-08 03:00:00+00
(1 row)
postgres=# SELECT date_trunc('day', TIMESTAMP WITH TIME ZONE '2024-05-09 01:00:00+03', 'GMT+3') as foo;
foo
------------------------
2024-05-08 03:00:00+00
(1 row)
postgres=# SELECT date_trunc('day', TIMESTAMP WITH TIME ZONE '2024-05-09 01:00:00+03', '+03:00') as foo;
foo
------------------------
2024-05-08 03:00:00+00
(1 row)
I have no idea why this is, as it's supposed to be supported (UTC+3 should be a valid POSIX time zone specification).
The core takeaway is that timezones are weird. We recommend not handling them in SQL, and just using UTC everywhere.
True, timezones are weird, but I suppose it's a good idea to mention this quirk somewhere in documentation?
Sure, I'd accept a PR for that.