#88 Support the addition of new fields to an existing schema.
- Especially ListFields and DictFields
I'm just not sure of the behavior where the data set in the Mapping object is updated into the original to-be-wrapped object. Can you explain to me why that makes sense? Conceptually, it seems like wrapping should be a one-way street: the new object gets data from the mapping that's passed in.
(Also, it looks like your changes made the tests fail. Can you look into that?)
OK so I have been writing loads of stuff to explain to you, and in the process I think I have got to the real crux of the problem.
The architecture handles defaults in two different ways.
- Virtual default not written to the db
- Static default which is written to the db
When a document is initialized, then all the defaults are hard written into the document, so you can see them in document._data, and those defaults will be written to the db when the document is stored.
When a document is loaded from the database, then any missing defaults from the document are not hard written into the document. (My suggested fix changes that)
It also handles defaults in Field.__get__, if the field is not in the doc, then it just returns the default.
It seems to be trying to give you a default without writing it to the document._data. This is a different kind of default behavior, where it just gives the caller the default value, but the default value never get's written to the DB.
This is in principle a nicer way to handle defaults, because you can just change the default in code if you want, and everything which has not got a value set will just work with the new default.
Unfortunately it is hard to make that kind of defaulting work with this architecture.
It means you only ever change instance._data on __set__, which is fine until you have submappings like a DictField. Setting something in a DictField mapping, will only change it's _data, and if the DictField is not in it's parents _data (because it was just defaults before) then you loose the change. So you would need to find a way to ripple the set up so that the DictField mappings _data gets attached to it's parents _data once it is not just defaults anymore.
That goes for Lists as well, you need to find a way to ripple changes made through the ListProxy, to update the parent fields _data.
I had a go at bubbling up the sets which does work, but is a horrible hack, but it does prove that an entirely "virtual" default behavior is possible,
So I recommend removing the default handling in __get__, and just set the defaults as we do on initializing of the Mapping, and as we do now in the wraps.
Incidentally there is a bug in Field.__get__ anyway, it should return self._to_python(default). So that you get the ListProxy or MappingProxy, instead of a real [] or {}.
I am going to go have a look at the tests see what's going on. I didn't realise we have a mix of doc tests and stand alone tests.
To answer about your question about wrap itself. If you want all new schema defaults to be added in initialization in the same as as they are when we make a new document. Then it needs to add something. eg
- I have a hand with 4 fingers
- I wear a five fingered glove
- I want to use the fifth finger
- So the glove attaches the fifth finger to my hand
I know understand the bug in
Field.__get__
because it was overloading get to
- Get the python object for a field
- Also get the json values when it needed the default.
So it looks like the "virtual" default style was never really intended, it just looked like it because it would return you something, even though it was the wrong type. You mentioned that in the initial issue you were getting [] instead of a ListProxy.
This was a side effect of a strange implementation to get the default values on initializing a mapping.
I have fixed the strange implementation to something that doesn't have weird side effects in the branch too.
OK so even though the tests pass, I came across another case where this solution doesn't work.
In particular, if you have a ListField(DictField(Mapping.build(......schema....))), then change the schema. Any instances of the old DictField in the ListField were not being updated with a new schema.
So I went back to the old way of getting the defaults set. It means some changes to the date and time field _to_json methods. If the data comes from user code, we expect the field to be in a date,datetime,time python type. But if it comes from wraps, then it is not in the correct type, because they are just strings. So just make sure they are the right type in _to_json before converting them. The other basic fields are fine, because bool(bool(True)) doesn't fail for example.
Now the tests pass, using the original solution.
Wow, that's incredibly thorough! Thank you for really diving into this. In part, this is generating some more questions for me, so I hope you'll bear with me.
- How is this class method actually being used? To me, it looks like being used in
Document.load()andDocument._wrap_row(), which seems sensible enough, and the_to_pythonmethods. - As for what you call virtual and static (which I guess you could also call lazy and eager, maybe?), so it seems you're saying that the code was designed to be more static, materializing defaults eagerly? But at some point maybe some code was added which tries to be virtual?
- It kind of makes sense to me that we materialize defaults eagerly on document creation, but that we try to be lazy if we're loading something from the database: it feels bad to mutate the retrieved document kind of implicitly. On the other hand, we still have similar problems where the defaults could shift out from under us as they are changed in the code (in a way, this could still happen if the materialized defaults don't get saved, but there's nothing we can do about that, I guess).
Does this make sense to you? Does it match what your changes do?
As for the actual commits, I'd like to have a clean commit series with small commits moving the behavior to what it should be. Also, having extensive commit messages is great, but please keep the first line of the commit message at about ~70 characters, then add an empty line, then continue the story.
I hope you still have time/energy to work on this!
So yes you have understood pretty well what's going on. Except my changes have not implemented lazy defaults if we're loading something from the database. In both cases when loading and when creating anew document defaults are handled eagerly.
It seems bad to handle defaults in an inconsistent way. I think they should either always be lazy or always eager. I don't really understand the difference between loading a document {some document} from the db, and initializing {some document} from code. Why is it OK for one of them to implicitly change the document, and one not? If you don't like implicit changes then best to go with the lazy defaults.
When I was considering the pros and cons between them I think that the eager solution is better. This is because there are cases when you want to change a default only for new records, not for old one's. Which is almost impossible to do with a lazy default system. However with a eager default system, if you change the default, it's not that hard to update old records if you do want to change the defaults which have been written to old documents. So the eager defaults gives more flexibility.
Also: It's considerably easier to implement eager defaults. I did come up with a possible way of implementing lazy defaults, but it is a horrible hack.
Okay, I agree that the eager behavior is better, I was just working through the pros and cons before.
Could you clean up the commits a bit along the lines that I suggested? If not, I can do that.
Yes I can clean it up. Just didn't want to do it before we had agreed the behavior you want. I will do it sometime this week if that's OK.
Sounds great!
Hey, did you get a chance to work on this? If not, I'll just clean it up. :)
Sorry I was really busy last week.
Hey @JonathanWylie, do you think you'll have time in the near future to clean this up a little? If not, that's fine too, just let me know and I'll try to dig into it myself.