plpgsql_check icon indicating copy to clipboard operation
plpgsql_check copied to clipboard

Feature Request: lint inserts without values for non-nullable columns

Open tmcgann-neotax opened this issue 9 months ago • 5 comments

Context: I have a stored procedure that performs an insert on a table. I added two new non-nullable columns to that table. I ran the plpgsql_check after adding those columns and it didn't yield any error output.

Problem: When I execute the stored procedure it errors out because I'm not inserting values into the new non-nullable columns.

Request: Would it be possible for plpgsql_check to catch and report inserts without values for non-nullable columns?

tmcgann-neotax avatar Apr 01 '25 22:04 tmcgann-neotax

st 2. 4. 2025 v 0:47 odesílatel Taylor McGann @.***> napsal:

Context: I have a stored procedure that performs an insert on a table. I added two new non-nullable columns to that table. I ran the linter after adding those columns and it didn't yield any error output.

Problem: When I execute the stored procedure it errors out because I'm not inserting values into the new non-nullable columns.

Request: Would it be possible for plpgsql_check to catch and report inserts without values for non-nullable columns?

— Reply to this email directly, view it on GitHub https://github.com/okbob/plpgsql_check/issues/185, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAEFO46JQABSRJSVOIPJS4T2XMJQHAVCNFSM6AAAAAB2ICMDTSVHI2DSMVQWIX3LMV43ASLTON2WKOZSHE3DINZSGQYTGMA . You are receiving this because you are subscribed to this thread.Message ID: @.***> [image: tmcgann-neotax]tmcgann-neotax created an issue (okbob/plpgsql_check#185) https://github.com/okbob/plpgsql_check/issues/185

Context: I have a stored procedure that performs an insert on a table. I added two new non-nullable columns to that table. I ran the linter after adding those columns and it didn't yield any error output.

Problem: When I execute the stored procedure it errors out because I'm not inserting values into the new non-nullable columns.

Request: Would it be possible for plpgsql_check to catch and report inserts without values for non-nullable columns?

I think this check can be implemented. I am not sure how much work it needs

  • the format of metadata was changed in last versions.

— Reply to this email directly, view it on GitHub https://github.com/okbob/plpgsql_check/issues/185, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAEFO46JQABSRJSVOIPJS4T2XMJQHAVCNFSM6AAAAAB2ICMDTSVHI2DSMVQWIX3LMV43ASLTON2WKOZSHE3DINZSGQYTGMA . You are receiving this because you are subscribed to this thread.Message ID: @.***>

okbob avatar Apr 02 '25 05:04 okbob

@okbob I've never coded in C but could take a look. Not really sure where to begin. Could you point me in the right direction? How long would it take roughly to add a feature like this: hours, days, weeks?

tmcgann-neotax avatar Apr 02 '25 17:04 tmcgann-neotax

Hi

st 2. 4. 2025 v 19:01 odesílatel Taylor McGann @.***> napsal:

@okbob https://github.com/okbob I've never coded in C but could take a look. Not really sure where to begin. Could you point me in the right direction? How long would it take roughly to add a feature like this: hours, days, weeks?

— Reply to this email directly, view it on GitHub https://github.com/okbob/plpgsql_check/issues/185#issuecomment-2773195975, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAEFO44EEB3ZRQLY3DYIJJL2XQJXVAVCNFSM6AAAAAB2ICMDTSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDONZTGE4TKOJXGU . You are receiving this because you were mentioned.Message ID: @.***> [image: tmcgann-neotax]tmcgann-neotax left a comment (okbob/plpgsql_check#185) https://github.com/okbob/plpgsql_check/issues/185#issuecomment-2773195975

@okbob https://github.com/okbob I've never coded in C but could take a look. Not really sure where to begin. Could you point me in the right direction? How long would it take roughly to add a feature like this: hours, days, weeks?

I think you should iterate over "compiled" INSERT statements and expressions and try to search a NULL constant and compare it with NOT NULL check in table definition.

For me, it is probably a few days, maybe week - for somebody who never wrote an extension to Postgres it can be weeks. C language is easy. But it is harder to learn some PostgreSQL internals. In this case - the code related to query and expression execution. Fortunately this is, probably, the most simple and intuitive part of Postgres (locking, transaction support, storage are harder parts of Postgres). At the end - this patch can have less than five hundred lines

It can be a very good school exercise - and a good introduction to writing PostgreSQL extensions.

But it is not trivial - the best start point is reading everything about postgresql extensions

read https://www.postgresql.org/docs/current/xfunc-c.html

Try to study some simple extensions like https://github.com/orafce/orafce and then plpgsql_check.

The SQL is translated to plans - internally plan is tree of nodes, and some nodes are subtrees related to expressions. PLpgSQL function is tree of symbols - well known like AST.

PLpgSQL_check is specific extension, because it is able to iterate over plpgsql trees, over plan trees and over expression's trees - and there it tries to find some interesting patterns.

Probably there is nothing similar extension to plpgsql_check. But without this specific feature - it is usual PostgreSQL extension - it uses build and deploy system, memory handling, data types, etc

Maybe good exercise can be writing own language to Postgres https://github.com/postgres/postgres/tree/master/src/test/modules/plsample

So

  1. try to write some simple extension to Postgres

  2. try to hack plpgsql language - try to implement repeat - until cycle

  3. look to plpgsql_check code and try to study how the check of format of "format" function is implemented or you can look at how plpgsql_check tries to identify unwanted casts (that can block index usage).

  4. after that - it will be simple to write requeted check

Surely - it is not easy, but there can be interesting benefits - you will be able to write and hack Postgres. Maybe less than a few thousand people are able to do it on the planet. Usually it is related to a much better understanding of Postgres than is usual. And better knowledge of Postgres means a better possibility to find a good job today. Anyway - Postgres has very good code, and writing extensions is funny. When I started to write extensions (twenty years ago) and started to hack Postgres, then I stopped playing computer games, because writing to Postgres was much more funny (and more creative). And it was a pretty good school for me. I was an average commercial developer - and there was no possibility to touch large and good projects in my country. Postgres is a very old school project in old language - but it is relatively modern driven, and code base is pretty well (probably the best code in C language in the planet). So try it, and you will see.

— Reply to this email directly, view it on GitHub https://github.com/okbob/plpgsql_check/issues/185#issuecomment-2773195975, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAEFO44EEB3ZRQLY3DYIJJL2XQJXVAVCNFSM6AAAAAB2ICMDTSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDONZTGE4TKOJXGU . You are receiving this because you were mentioned.Message ID: @.***>

okbob avatar Apr 02 '25 18:04 okbob

It should be calculated with existence of before triggers on the table. The missing value can be filled there. So when the table has a before trigger, then there is a risk of false alarms.

I did some quick research and from query nodes is clean what is target relation:

CREATE TABLE foo(a int, b int NOT NULL, c int);
INSERT INTO foo(a) VALUES(10);
DETAIL:     {QUERY 
   :commandType 3 
   :querySource 0 
   :canSetTag true 
   :utilityStmt  
   :resultRelation 1 
   :resultVariable 0 
   :hasAggs false 
   :hasWindowFuncs false 
   :hasTargetSRFs false 
   :hasSubLinks false 
   :hasDistinctOn false 
   :hasRecursive false 
   :hasModifyingCTE false 
   :hasForUpdate false 
   :hasRowSecurity false 
   :hasGroupRTE false 
   :hasSessionVariables false 
   :isReturn false 
   :cteList  
   :rtable (
      {RANGETBLENTRY 
      :alias  
      :eref 
         {ALIAS 
         :aliasname foo 
         :colnames ("a" "b" "c")
         }
      :rtekind 0 
      :relid 16414 
      :inh false 
      :relkind r 
      :rellockmode 3 
      :perminfoindex 1 
      :tablesample  
      :lateral false 
      :inFromCl false 
      :securityQuals 
      }
   )
   :rteperminfos (
      {RTEPERMISSIONINFO 
      :relid 16414 
      :inh false 
      :requiredPerms 1 
      :checkAsUser 0 
      :selectedCols (b)
      :insertedCols (b 8)
      :updatedCols (b)
      }
   )
   :jointree 
      {FROMEXPR 
      :fromlist  
      :quals 
      }
   :mergeActionList  
   :mergeTargetRelation 0 
   :mergeJoinCondition  
   :targetList (
      {TARGETENTRY 
      :expr 
         {CONST 
         :consttype 23 
         :consttypmod -1 
         :constcollid 0 
         :constlen 4 
         :constbyval true 
         :constisnull false 
         :location 26 
         :constvalue 4 [ 10 0 0 0 0 0 0 0 ]
         }
      :resno 1 
      :resname a 
      :ressortgroupref 0 
      :resorigtbl 0 
      :resorigcol 0 
      :resjunk false
      }
   )
   :override 0 
   :onConflict  
   :returningOldAlias  
   :returningNewAlias  
   :returningList  
   :groupClause  
   :groupDistinct false 
   :groupingSets  
   :havingQual  
   :windowClause  
   :distinctClause  
   :sortClause  
   :limitOffset  
   :limitCount  
   :limitOption 0 
   :rowMarks  
   :setOperations  
   :constraintDeps  
   :withCheckOptions  
   :stmt_location 0 
   :stmt_len 0
   }

and plan nodes contains the default values (that can be checked against NOT NULL constrainst)

DETAIL:     {MODIFYTABLE 
   :plan.disabled_nodes 0 
   :plan.startup_cost 0 
   :plan.total_cost 0.01 
   :plan.plan_rows 0 
   :plan.plan_width 0 
   :plan.parallel_aware false 
   :plan.parallel_safe false 
   :plan.async_capable false 
   :plan.plan_node_id 0 
   :plan.targetlist  
   :plan.qual  
   :plan.lefttree 
      {RESULT 
      :plan.disabled_nodes 0 
      :plan.startup_cost 0 
      :plan.total_cost 0.01 
      :plan.plan_rows 1 
      :plan.plan_width 12 
      :plan.parallel_aware false 
      :plan.parallel_safe false 
      :plan.async_capable false 
      :plan.plan_node_id 1 
      :plan.targetlist (
         {TARGETENTRY 
         :expr 
            {CONST 
            :consttype 23 
            :consttypmod -1 
            :constcollid 0 
            :constlen 4 
            :constbyval true 
            :constisnull false 
            :location 26 
            :constvalue 4 [ 10 0 0 0 0 0 0 0 ]
            }
         :resno 1 
         :resname a 
         :ressortgroupref 0 
         :resorigtbl 0 
         :resorigcol 0 
         :resjunk false
         }
         {TARGETENTRY 
         :expr 
            {CONST 
            :consttype 23 
            :consttypmod -1 
            :constcollid 0 
            :constlen 4 
            :constbyval true 
            :constisnull true 
            :location -1 
            :constvalue 
            }
         :resno 2 
         :resname b 
         :ressortgroupref 0 
         :resorigtbl 0 
         :resorigcol 0 
         :resjunk false
         }
         {TARGETENTRY 
         :expr 
            {CONST 
            :consttype 23 
            :consttypmod -1 
            :constcollid 0 
            :constlen 4 
            :constbyval true 
            :constisnull true 
            :location -1 
            :constvalue 
            }
         :resno 3 
         :resname c 
         :ressortgroupref 0 
         :resorigtbl 0 
         :resorigcol 0 
         :resjunk false
         }
      )
      :plan.qual  
      :plan.lefttree  
      :plan.righttree  
      :plan.initPlan  
      :plan.extParam (b)
      :plan.allParam (b)
      :resconstantqual 
      }
   :plan.righttree  
   :plan.initPlan  
   :plan.extParam (b)
   :plan.allParam (b)
   :operation 3 
   :canSetTag true 
   :nominalRelation 1 
   :rootRelation 0 
   :partColsUpdated false 
   :resultRelations (i 1)
   :updateColnosLists  
   :withCheckOptionLists  
   :returningOldAlias  
   :returningNewAlias  
   :returningLists  
   :fdwPrivLists ()
   :fdwDirectModifyPlans (b)
   :rowMarks  
   :epqParam 0 
   :onConflictAction 0 
   :arbiterIndexes  
   :onConflictSet  
   :onConflictCols  
   :onConflictWhere  
   :exclRelRTI 0 
   :exclRelTlist  
   :mergeActionLists  
   :mergeJoinConditions 
   }

okbob avatar Apr 03 '25 08:04 okbob

Don't afraid to ask any question.

okbob avatar Apr 03 '25 09:04 okbob

Now, I think so this implementation is not simple. There can be more constraint just NOT NULL, and can be little bit messy, so only NOT NULL check will be implemented. More the metadata for NOT NULL constraint was changed in PostgreSQL 18, the check should be implemented 2x.

okbob avatar Jul 18 '25 18:07 okbob