sql-surveyor icon indicating copy to clipboard operation
sql-surveyor copied to clipboard

More informative column alias?

Open mmkal opened this issue 5 years ago • 6 comments

Hi - love the library! I don't know if this is an issue per-se, but I got some output I wasn't expecting.

Given this is a "high-level" SQL parser, I was hoping to use it to tell me which columns of which tables queries referred to. It gets me closer, but further parsing is still needed. Example:

let sqlSurveyor = require("sql-surveyor")

let surveyor = new sqlSurveyor.SQLSurveyor(sqlSurveyor.SQLDialect.PLpgSQL)
let columns = surveyor.survey('select f.id from foo as f').getQueryAtLocation(0).outputColumns

console.log(columns)

This prints:

[
  OutputColumn {
    columnName: 'f.id',
    columnAlias: null,
    tableName: 'f',
    tableAlias: null
  }
]

Whereas I'd expect sql-surveyor to do the fiddly bits for me and figure out that f is an alias for the actual table foo, and that the implicit column alias (i.e. the column name that I'll get back from running this query on most clients) is id, so an output of

[
  OutputColumn {
    columnName: 'f.id',
    columnAlias: 'id',
    tableName: 'foo',
    tableAlias: 'f'
  }
]

I can figure that stuff out myself, by looking up 'f' in referencedTables, and parsing 'f.id' by splitting on '.', but I was hoping sql-surveyor would do some of that for me. Just wanted to check a) it isn't already done in a feature I haven't come across yet or b) whether you'd be open to adding that functionality.

mmkal avatar Dec 19 '20 19:12 mmkal

Some more notes:

On looking into how to use referencedTables to get the information I needed, there might be opportunity to make that more convenient too. This is what it looks like for the query above:

{
  f: ReferencedTable {
    tableName: 'f',
    schemaName: null,
    databaseName: null,
    aliases: Set {},
    locations: Set {
      TokenLocation {
        lineStart: 1,
        lineEnd: 1,
        startIndex: 7,
        stopIndex: 7
      }
    }
  },
  foo: ReferencedTable {
    tableName: 'foo',
    schemaName: null,
    databaseName: null,
    aliases: Set { 'as f' },
    locations: Set {
      TokenLocation {
        lineStart: 1,
        lineEnd: 1,
        startIndex: 17,
        stopIndex: 19
      }
    }
  }
}

Since the tableName from the column is f, I need to iterate through the entries in referencedTables, then iterate through each of their aliases to find one that looks like as ${column.tableName}, which might be a bit brittle.

I also noticed that the behaviour for sqlSurveyor.SQLDialect.MYSQL is closer to what I expected, so maybe this is a bug after all?

mmkal avatar Dec 19 '20 19:12 mmkal

Thanks for the detailed issue! As you figured out in #2, PL/pgSQL table aliases aren't being parsed correctly.

I want to address your desired output of:

  OutputColumn {
    columnName: 'f.id',
    columnAlias: 'id',
    tableName: 'foo',
    tableAlias: 'f'
  }

I'd like to keep the columnAlias attribute preserved for when there is an explicit alias used. Would a getDisplayName() method on the OutputColumn class work for your use case?

columnName should perhaps be renamed, since it can contain whatever is used in the SELECT list, but I'm not sure what to call it at the moment.

mtriff avatar Dec 19 '20 21:12 mtriff

Yes I think getDisplayName() would work great.

One other thing that'd be useful to capture (or maybe necessary for getDisplayName() to be correct) would be any type casts. e.g. select oid, typname::regtype from pg_type or select '{}'::json.

mmkal avatar Dec 19 '20 23:12 mmkal

Sounds good for getDisplayName(). I'm not sure when I'll have a chance to implement it, but it's a good idea.

How would you use the type/cast information? It would be pretty sparsely populated (assuming casts are pretty rare), since we can't really infer type information unless it's explicitly used.

mtriff avatar Dec 21 '20 22:12 mtriff

I'm working on a type-generator for postgres, so I could conceivably use the cast target itself, but either way it'd be important for the select oid, typname::regtype from pg_type example to return typname from getDisplayName() rather than typname::regtype since it's the former that'd be returned from clients.

mmkal avatar Dec 21 '20 23:12 mmkal

Good to know. Agreed, it will need to be tracked to get the display name anyway, so it shouldn't take too much effort to include it as an attribute.

mtriff avatar Dec 22 '20 03:12 mtriff