track lexical scope in compiler
The BQL->SQL compiler currently cannot determine that in

    CREATE TABLE t(x, y);
    CREATE TABLE u(z, w);
    SELECT y, w FROM t, u WHERE x = z;

the x and y come from t and the z and w come from u. This has several consequences:
- We cannot produce helpful error messages ourselves, e.g. for a mistyped column name.
- Names are wrapped in double quotes, as in "foo", in the SQL output in case they contain special characters. But sqlite3 idiocy reinterprets "foo" as a string literal, like 'foo', when it doesn't make sense in context as a column name, so if you mistype a column name you get a column of constant strings containing your typo.
- INFER can't automatically tag the relevant column names in nested expressions with IFNULL(x, PREDICT x WITH CONFIDENCE c), so instead we reject nested expressions in INFER (without INFER EXPLICIT).
The compiler should be taught to track lexical environments so it can do all these things and more.
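
The double-quote fallback described above can be reproduced with Python's standard sqlite3 module. This is a minimal sketch, not bayeslite code: the helper name `select_typo` and the mistyped column `yy` are made up for illustration, and the outcome depends on how the underlying SQLite library was built (builds with double-quoted string literals disabled are expected to raise an error instead of returning strings), so the sketch handles both cases.

```python
import sqlite3


def select_typo():
    """Select a mistyped, double-quoted column name from table t.

    Hypothetical helper for illustration only.
    """
    db = sqlite3.connect(":memory:")
    try:
        db.execute("CREATE TABLE t(x, y)")
        db.executemany("INSERT INTO t VALUES (?, ?)", [(1, 2), (3, 4)])
        try:
            # "yy" names no column of t. A permissive SQLite build falls
            # back to reading it as the string literal 'yy'.
            return db.execute('SELECT "yy" FROM t').fetchall()
        except sqlite3.OperationalError as e:
            # A build with double-quoted string literals disabled
            # reports the unknown column instead.
            return str(e)
    finally:
        db.close()


result = select_typo()
print(result)
```

On a permissive build the result is one row per row of t, each holding the typo as a constant string, which is exactly the silent failure described in the list above.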
We'd like to see how often this bites people in traces to decide how soon we do this. (If we start building use cases that need this, it might also be accelerated.)
This has bitten us many times already. The most common example is that if you mistype a column name, bayeslite does not tell you; you simply get a string containing the column name instead. It's a serious problem. The only reason I haven't fixed it yet is that adding another pass to the compiler is a lot of tedious busywork.
This is not really hard: we already have a bit of a lexical environment in the badly named 'bql_compiler' objects. We should rename those 'environment', write a bit of code to determine from a query or table what columns it will produce, and pass the environment down into everything.
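
To make the idea concrete, here is a rough sketch of what such an environment could look like, using the t and u tables from the example at the top. Everything in it is an assumption for illustration: the `Environment` class, its `resolve` method, and the error messages are hypothetical and are not the existing 'bql_compiler' interface. It also compares column names as plain strings, ignoring case folding and table aliases.

```python
class Environment(object):
    """Hypothetical lexical environment: which table provides which column."""

    def __init__(self, tables):
        # tables: list of (table_name, [column_name, ...]) pairs,
        # in the order they appear in the FROM clause.
        self.tables = list(tables)

    def resolve(self, column):
        """Return the name of the unique table that provides `column`.

        Raises ValueError for unknown or ambiguous names, which is where
        a helpful error message would come from.
        """
        owners = [name for name, columns in self.tables if column in columns]
        if not owners:
            raise ValueError("no such column: %s" % (column,))
        if len(owners) > 1:
            raise ValueError(
                "ambiguous column %s: found in %s" % (column, ", ".join(owners)))
        return owners[0]


# SELECT y, w FROM t, u WHERE x = z
env = Environment([("t", ["x", "y"]), ("u", ["z", "w"])])
resolved = dict((c, env.resolve(c)) for c in ["y", "w", "x", "z"])
print(resolved)
```

With something like this passed down through the compiler, a mistyped name such as yy would fail in `resolve` before any SQL reaches sqlite3, instead of silently turning into a string.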