Revisit fix for the dialyzer 'undefined' issue in erlang 19 work
BACKGROUND
Erlang 19 changed how Dialyzer works (EDIT: I'm actually unsure whether it was a Dialyzer change, or an Erlang change), which created Dialyzer errors when upgrading to Erlang 19 and beyond. One of the changes was (from the Erlang reference manual):
Before Erlang/OTP 19, for fields without initial values, the singleton type 'undefined' was added to all declared types. In other words, the following two record declarations had identical effects:
-record(rec, {f1 = 42 :: integer(),
f2 :: float(),
f3 :: 'a' | 'b'}).
-record(rec, {f1 = 42 :: integer(),
f2 :: 'undefined' | float(),
f3 :: 'undefined' | 'a' | 'b'}).
This is no longer the case. If you require 'undefined' in your record field type, you must explicitly add it to the typespec, as in the 2nd example.
In a nutshell, all places in our code where 1) Erlang record declarations did not explicitly specify 'undefined' as being an allowed value in the particular field in question, 2) where a default value was not specified for that field, and 3) a value was not set for the field at record creation (meaning, it then needed a default value, and would use the atom 'undefined'), were now broken as far as Dialyzer was concerned. There were (an off-the-top-of-the-head estimate) maybe well over 100 cases in the code where records like this were used, and where breakages occurred.
The approach I took in correcting the issue was to make the assumption that the writer of the code designed the code so that record fields which were not explicitly set to a default value and which were not explicitly specified as being allowed to be set to 'undefined' were in fact designed to be set to 'undefined' both as a default value, and as a possible value the field could have. And in fact, this has to be the case. Additionally, in cases like this where I'm tasked to fix breakages in a large codebase that I did not author, I generally take a "Hippocratic oath" approach of "do no harm." Here, that basically meant to patch up the code exactly to where the original author had it (or, how it was working), and try not to add or take away anything in the process.
Pursuant to the above, I placed the atom 'undefined' as being allowable for fields in a large number of record declarations in the code. This accomplished several things in my guiding approach. For one thing, 'undefined' was in fact always there anyway, it was simply implicit (invisible magic) vs. explicit. This change simply made it explicit (visible). Secondly, it accomplished my "Hippocratic oath" approach of "do no harm," as all I was doing was placing something there that was, mathematically-speaking, always there anyway, so in essence I was adding nothing nor taking anything away. Thirdly, it made Dialyzer happy, and the breakages were fixed.
THE ISSUE
However, some concerns were raised during code review. Perhaps the "horse's mouth" can communicate the concerns better than I can paraphrase, so here are some excerpts:
mark 11:36 AM
Reading the code for 19; I see that it pretty much mechanically added ‘undefined’ everywhere.
Were there any experiments on what it would take to tighten up those defs and remove unknown?
Lincoln Baker 11:38 AM
good morning
there were no experiments. i assumed the writer of the code wanted to allow for 'undefined' since
that was implicitly there before they changed erlang19
so i assumed 'the fix' was to explicitly state that things could be undefined, vs. implicitly which
would no longer work
i can certainly have a look though. do u have any thoughts on this?
mark 11:43 AM
By and large this kind of thing can be fixed by 1) making record constructors specify all fields and
2) where that fails having a sane default value in the record def.
Lincoln Baker 11:45 AM
that will be a whole lot of record defs and constructors to analyze. i'm sure over 100 places in the
code. i'll give it a scan and more thought after my morning meeting
Lincoln Baker 12:20 PM
looking at the 'undefined' issue now. what are your thoughts on making this a separate PR
('eliminate usage of 'undefined' where practicable and possible') ?
mark 12:36 PM
I’m ok with it being a separate PR, but I really don’t want it getting lost.
Dialyzer finds errors much better when it has tight specs, and the blanket loosening is worrysome.
Lincoln Baker 12:38 PM
no worries, i understand that. i actually don't mind starting work on it right now, but i think the
deadline would hit us if i did
i'll make a PR on it, and try to put it into an upcoming batch of work (same work as figuring out the
state machine)
Lincoln Baker 2:00 PM
do you mind me cutting and pasting some of our conversation here into the issue i'm creating on this?
mark 2:00 PM
Go ahead and copypasta
So let’s do the fixes as a standalone PR. As far as sequencing goes, let prajakta decide on that, but I’m fine with getting the R19 stuff in soonish w/o the fixes, and looping back a little bit later.
Lincoln Baker 2:00 PM
ok, cool. creating the issue now