Make error handling for missing keys in objects consistent (lazy vs. eager)
Problem Statement
Some statements always raise an error if an object key is missing:
select obj['x'] from tbl
Others raise an error at runtime, but only if the table isn't empty:
cr> select o['x'] from (select {y=1} o from generate_series(0, 1) t) t1;
ColumnUnknownException[The object `{y=1}` does not contain the key `x`]
cr> select o['x'] from (select {y=1} o from generate_series(0, -1) t) t1;
+--------+
| o['x'] |
+--------+
+--------+
Or:
select obj['x'] from (select obj from tbl) as t
There are probably more such cases.
See also: https://github.com/crate/crate/pull/14593#pullrequestreview-1666602000
Possible Solutions
Make the behavior consistent so that it always raises an error, also on empty tables.
Considered Alternatives
- Keep it inconsistent: Might lead to users missing errors during dev
It seems like while on can define explicit casts for objects including sub-columns this information is dropped:
Object Literal
SELECT pg_typeof({x=1}['x']);
-- undefined
SELECT pg_typeof(({x=1}::OBJECT AS (x LONG))['x']);
-- undefined
Function returning an Object with defined sub-columns
CREATE FUNCTION my_func()
RETURNS OBJECT AS (x LONG)
LANGUAGE JAVASCRIPT
AS $$
function my_func(){
return {x:1};
};$$;
SELECT my_func();
+-----------------+
| {"x"=1::bigint} |
+-----------------+
| {"x": 1} |
+-----------------+
SELECT pg_typeof(my_func()['x']);
+-------------+
| 'undefined' |
+-------------+
| undefined |
+-------------+
A side-effect of this is that something like:
CREATE TABLE test1 AS
SELECT '{"field1":123}'::OBJECT (STRICT) AS (field1 BIGINT) ['field1'];
results on
UnsupportedFeatureException[Type `undefined` does not support storage]
Another case is the definition of object literals in views, such as:
CREATE TABLE data (value_1 BOOLEAN, value_2 INTEGER);
CREATE OR REPLACE VIEW data_view AS
SELECT { value_1 = value_1, value_2 = value_2 } AS o
FROM data;
SELECT column_name, data_type FROM information_schema.columns WHERE table_name = 'data_view';
+-------------+-----------+
| column_name | data_type |
+-------------+-----------+
| o | object |
+-------------+-----------+
o is returned as object (which is correct), but metadata on the subcolumns value_1 and value_2 are missing.
Lookup on object with cast to object(ignored) raises errors:
create table tbl (o object(ignored));
insert into tbl (o) values ({x=10, y=20}), ({z=30});
refresh table tbl;
select o['x'] from tbl; -- works
select o::object(ignored)['x'] from tbl; --raises ColumnUnknownException[The object `{z=30}` does not contain the key `x`]
This is due to o['x'] being a DynamicReference in the first case. In the second case because of the cast it results in a internal subscript(o, 'x'). Currently ObjectType doesn't carry the column policy, so we do not know it is ignored.
ColumnPolicy should probably move from being a property on Reference to being a property on ObjectType - if that doesn't cause other issues (e.g. with object arrays)
(As attempted in https://github.com/crate/crate/pull/14593)