Rntxt writing (and possibly higher up the workflow)
Picking up from https://github.com/MarkGotham/When-in-Rome/pull/47, I think there are two `rntxt` issues appropriate for reporting here.
- AugmentedNet uses an explicit `/i` designation to indicate mixture into major. I quite like the idea and can certainly see how handling cases like `VI` gets easier with this, but we should try to avoid introducing such non-standard syntax. Perhaps a first step is to remove it in cases where it definitely makes no difference anyway (e.g., `viio7/i` > `viio7`).
- Use of `Cad64` encouraged. AugmentedNet does output `Cad64`, but also `I64` in cases where `Cad64` would seem appropriate. Can you review this corner? If it helps, I'm happy to work through some explicit rules for testing (e.g., `Cad64` to `V7`, but `V64` between `I` and `I6`), etc.
No doubt I'll be back with more when I get a chance to look more closely ...
Thanks for the reports, @MarkGotham!
- Yes. It should be easy to postprocess some annotations (like the `viio7/i` -> `viio7` you suggested). There is one caveat, though. The way that the vocabulary of annotations is defined is not exactly how it would look in a regular `rntxt` file. For example, there is a reason for `viio7/i` to be a "tonicization of `i`" and not `viio7/I`. I considered the chords that are not diatonic to the key to be nonexistent in that key. Taking the example of the diminished seventh, `viiø7/I` would be written as `viiø7`; but `viio7/I` would be written as `viio7/i`, because it "doesn't exist" in `I`. The rationale for this is to force the tonicization finder of the model to think that we are deviating briefly to the parallel key. Currently, the annotations respect the idiosyncrasies of the underlying model. Of course, these can be overwritten for an easier presentation in the `rntxt` file, but I am still debating internally whether it is best to faithfully represent the tonal representation of the model, or the annotations that would be more idiomatic to read. This is the kind of thing that needs a beer and a discussion :).
- It does output `Cad64`, but as you noticed, the number of `Cad64` annotations output by the model is extremely low. In fact, it only outputs ~6% of the `Cad64` in the test set. I attribute this to the fact that several datasets do not encode `Cad64` chords; my Haydn annotations, for example. I guess that confuses the model, and the performance on this special label is very low at the moment. It is possible, as you mention, to write some rules that produce `Cad64` explicitly based on the context. I prefer to limit the amount of tampering I do to the model's predictions to the minimum, but I understand the value of those rules. Do you think it would be possible to implement this post-processing as an external module to AugmentedNet? For example, given the original `rntxt` provided by the model and the score, determine the additional chords that should change from `I64` to `Cad64`. The real solution to this problem in the long run is that we need more examples of `Cad64` in the training set(s); then the neural network will do a better job. Also, I'm happy to report that no other state-of-the-art model outputs or considers `Cad64` chords in its vocabulary. Thus, 6% is still better than 0% :). Hopefully, other models will adopt this and the field will move forward.