Support database linters
Our check command is currently only looking at files in the filesystem. To properly guard the user against issues, we also need to run checks directly against the database. There are a few tools out there that do exactly that, further validating the use-case:
- https://github.com/supabase/splinter: it comes as a single query. not a database extension. we copy the query and run it ourselves. even supabase just copies it. Although I would prefer to not manage it ourselves, especially since supabase will probably allow users to enable / disable individual rules. maybe they could add it as an extension too, and expose a simple rpc-based api? since the rules are geared towards supabase, we could at least check if its a supabase project via the existence of certain schemas and only run it then.
- https://github.com/pmpetit/pglinter: this one is basically the same as
splinter, but distributed as a postgres extension. we could detect its existence.
what I am not sure about yet is whether we should integrate it into our existing check command. we could introduce a new location notion for diagnostics that points to a schema object and simply append it to the output from the file-based checks.
public.mytable.test_id splinter/performance ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
⚠ Identifies foreign key constraints without a covering index, which can impact database performance.
ℹ Table public.mytable has a foreign key test_id without a covering index. This can lead to suboptimal query performance.
Whether or not db checks should be includes could be controlled by a flag, e.g. no-schema-lints.
or we add a new command, e.g. check-db to distinguish. this would have improve clarity and we would not mix up cli flags that target files (e.g. —since) with the ones that run directly against the db (potentially —schema or similar filters).
In both cases, I would prefer to not bring the actual rules into this project, but detect their existence in the database and expose it to the user. the goal is an unified lint interface and a better dx (one CI lint to rule them all). We also should not allow configuring the respective rules directly if not necessary, and rather ask the user to make the settings directly via the api the tool itself provides (e.g. SELECT pglinter.disable_rule('B001');).
we could provide api-level settings though (next to disabled: bool). pglinter has functions to just check specific categories, so we could allow the user to choose them in our config too. this could lead to confusion though, since a user could achieve exactly the same by disabling individual rules from the different categories.
-- Quick comprehensive check (output to prompt)
SELECT pglinter.check_all();
-- Individual category checks (output to prompt)
SELECT pglinter.check_base();
SELECT pglinter.check_cluster();
SELECT pglinter.check_table();
SELECT pglinter.check_schema();
I have a couple of thoughts:
- The
checkcommand was designed to lint a singl sql file, so I think your intuition to introduce another command that "lints the db" is good - Checking whether we have
pglinterinstalled or whether it's a supabase project before trying to run the lints is a great idea - I think it would be confusing if the
pglinterrules aren't configurable in our.jsoncfile. I'm probably imagining this as too easy, but if we have apglintermodule that outputs ourDiagnostictype, couldn't we just mutate the matching diagnostics depending on the users settings ("off", "warn", …)? We could show a warning to the user about any setting that has been disabled on thepglinterlevel but is enabled in our.jsonc. - for the
splinter.sqlquery, we could use ajustcommand that pulls a fresh copy from the public repo and copies it to a file in the lsp before building. The output types of the query seem to share the same properties,name, title, leveletc., so we could just use a sqlx type and map from there? It should be easy to use ignore rules by mutating the sqlx type, wdyt?
In addition:
- many of the supabase queries (e.g.
auth.users exposed,rls disabled in public) could be easily added to our existing logic by using theSchemaCache, and it would be helpful to show theses as warnings right in the SQL file (if the project is a supabase project)
- I think it would be confusing if the
pglinterrules aren't configurable in our.jsoncfile. I'm probably imagining this as too easy, but if we have apglintermodule that outputs ourDiagnostictype, couldn't we just mutate the matching diagnostics depending on the users settings ("off", "warn", …)? We could show a warning to the user about any setting that has been disabled on thepglinterlevel but is enabled in our.jsonc.
I am more worried about the user experience here. I think we have two user groups:
- people who already know and use
pglinterand want to unify the interface - people who just want to lint their database and they do not care about the tool
I can imagine that someone will configure the rules via the api that pglinter provides (e.g. a yaml file that is imported). If we allow configuration via our config file, then we might get problems:
- we need to write to the users database, something that we do not do yet. I would prefer that the language sever only needs read access to a users database
- even if we just filter afterwards, how do we handle conflicts such as 'the rule is disabled in the database, but enabled in our config'
I agree that it would be nice for user group 2, but I would rather point them to the pglinter docs instead of trying to integrate something we do not fully control.
The check command was designed to lint a singl sql file, so I think your intuition to introduce another command that "lints the db" is good
mhh its rather designed as a check everything command, but geared towards the file system right now. I am still not sure about this. it would certainly create some level of confusion either way. but dblint as its own command allows more flexibility in the api design.
for the splinter.sql query, we could use a just command that pulls a fresh copy from the public repo and copies it to a file in the lsp before building. The output types of the query seem to share the same properties, name, title, level etc., so we could just use a sqlx type and map from there? It should be easy to use ignore rules by mutating the sqlx type, wdyt?
yes, all the concerns above are not valid (yet) for splinter since it does not have database state. I can imagine supabase will add that though, I will ask the team. but we could allow configuration within our config file for each of their rules, and just pull in the latest query every night via an action.
many of the supabase queries (e.g. auth.users exposed, rls disabled in public) could be easily added to our existing logic by using the SchemaCache, and it would be helpful to show theses as warnings right in the SQL file (if the project is a supabase project)
yes, but many of the lints will not have a source in a sql file. we definitely need both. thinking about it, this is also a good reason to have separate commands. this way, we will not report any violation twice. and autofix will also be weird if its the same command since we cannot autofix diagnostics coming from the database.
yes, but many of the lints will not have a source in a sql file. we definitely need both. thinking about it, this is also a good reason to have separate commands. this way, we will not report any violation twice. and autofix will also be weird if its the same command since we cannot autofix diagnostics coming from the database.
yes, I meant it would be nice to have both 👍🏻
even if we just filter afterwards, how do we handle conflicts such as 'the rule is disabled in the database, but enabled in our config
pglinter has some ways to check whether rules are enabled, a crude one being:
SELECT pglinter.is_rule_enabled('B001');
We could run this once and report to the user if there is a rule disabled on the pglinter level and it doesn't match our config?
we need to write to the users database, something that we do not do yet. I would prefer that the language sever only needs read access to a users database
100% agree, I think the only time write access is okay is when the lsp is connected to a local instance, such as with the "execute statement under cursor" feature!
We could run this once and report to the user if there is a rule disabled on the pglinter level and it doesn't match our config?
thats a good idea!
okay, then lets go with dblint as its own command, and allow the user to configure the rules on our side. if there is a configuration mismatch, we will report it as a warning.