drf-writable-nested

Handle get_or_create when models have unique fields

Open oliver-zhou opened this issue 8 years ago • 11 comments

When doing a nested create, there should be an option to "get_or_create" identical objects that already exist.

oliver-zhou avatar Jun 24 '17 01:06 oliver-zhou

Hi @oliver-zhou, thank you for contributing! An object is now updated if its PK is specified. The PK can be any custom field (see #10). That seems to be enough. If you have a more specific case, can you create a pull request with a test that describes your desired behaviour?

ir4y avatar Jun 26 '17 03:06 ir4y

Hmm, on a first pass through PR #10 it isn't very clear to me: how do we define which field is the custom pk, even if it's not truly primary-key related? I.e., if I'm doing a POST and I want to "get_or_create" based on a specific unique=True field, how do I designate that field?

Example :

I'm performing a POST to /jobs/, which has a nested Company (ForeignKey relation). I want to get_or_create if a Company with that name already exists. Company.name has "unique=True" set. However, this POST will only ever try to "create()". What do I set to designate "name" as the unique field that should be used for the lookup? (I can get it to work by manually writing JobSerializer.create() and JobSerializer.update(), but the proper way to do this via drf-writable-nested is unclear.)

curl -X POST \
  http://localhost:8000/rest/jobs/ \
  -H 'Content-Type: application/json' \
  -d '{
    "company": {
        "name": "Company Name ABC"
    },
    "name": "Job Name",
    "description": "This job involves..."
}'

oliver-zhou avatar Jun 28 '17 19:06 oliver-zhou

@oliver-zhou I think you are approaching your problem the wrong way. drf-writable-nested propagates saving (i.e. create/update) to the sub-serializer. This is the logical thing to do; the package is supposed to nest serializers, after all. So the simple, correct, and only way to handle this (if you don't know the company pk) is to write a custom CompanySerializer.create method. If you analyze the situation, your problem is about getting or creating Company objects, so naturally your CompanySerializer should handle it.
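A minimal, framework-free sketch of that idea follows. It does not use Django or DRF: `CompanyStore`, `create_company`, and `create_job` are hypothetical stand-ins for the Company table, CompanySerializer.create, and JobSerializer.create. In a real project the lookup would presumably go through Django's `Company.objects.get_or_create(...)`. Note also that DRF's ModelSerializer typically attaches a uniqueness validator to unique=True fields, so validation may reject an existing name before create is ever called; that would need to be relaxed separately.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Company:
    pk: int
    name: str  # stands in for a unique=True field


@dataclass
class Job:
    pk: int
    name: str
    description: str
    company: Company


class CompanyStore:
    """In-memory stand-in for a table with a unique constraint on name."""

    def __init__(self):
        self._by_name = {}
        self._ids = count(1)

    def get_or_create(self, name):
        # Returns (instance, created), the same shape Django's get_or_create uses.
        if name in self._by_name:
            return self._by_name[name], False
        company = Company(pk=next(self._ids), name=name)
        self._by_name[name] = company
        return company, True

    def __len__(self):
        return len(self._by_name)


companies = CompanyStore()
job_ids = count(1)


def create_company(validated_data):
    # The nested "serializer" owns the get-or-create decision.
    company, _created = companies.get_or_create(validated_data["name"])
    return company


def create_job(validated_data):
    # The parent only propagates the save to the nested create.
    data = dict(validated_data)
    company = create_company(data.pop("company"))
    return Job(pk=next(job_ids), company=company, **data)


payload = {
    "company": {"name": "Company Name ABC"},
    "name": "Job Name",
    "description": "This job involves...",
}
first = create_job(payload)
second = create_job(payload)
```

Posting the same payload twice yields two jobs that share a single company, because the decision to reuse or create lives entirely in the nested create.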

tjwalch avatar Jul 05 '17 07:07 tjwalch

This package propagates the save; it's not necessarily "the logical thing to do". Our DIY version of writable nested serializers cascades in the to_internal_value call. On paper, this approach makes more sense because to_internal_value converts top-level data (e.g. JSON) into the internal representation for the field (i.e. a model instance for a nested serializer). But that debate is not directly relevant to the question posed by @oliver-zhou (and one we deal with internally).

There are a bunch of situations where matching on a pk isn't necessary (or ideal). This package hardcodes the assumption that the data structure will include a pk (or represents a new instance). Here are some examples of when it isn't necessary (and may not be ideal):

  • When representing composite primary keys, the simplest solution in Django is a surrogate key (Django's PK). However, the surrogate key isn't meaningful to the rest of the universe. A get_or_create really needs to match on the composite key fields.
  • DRF's SlugRelatedField and HyperlinkedRelatedField already identify related objects by non-PK fields. It makes sense to be able to do the reverse in a writable serializer.
  • A concrete example of the previous bullet is HAL-JSON, which identifies all objects by URI, so there are no PKs carried around. If the object is local to your Django instance, there is a trivial mapping between the two. In a microservice architecture, this isn't necessarily a safe assumption. Big, long URIs are murder on caching and indexes, so a surrogate key is FAR better for internal processing.
  • The OneToOne key relationship is a special case (albeit with a workaround in the code). The PK of the second object can be inferred from the first without an explicit ID.
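
To make the first bullet concrete, here is a small framework-free sketch of matching on a composite natural key while keeping a surrogate pk internally. `NaturalKeyStore` and the (region, code) warehouse key are hypothetical names chosen for illustration; nothing here is part of drf-writable-nested.

```python
from itertools import count


class NaturalKeyStore:
    """In-memory stand-in for a table with a surrogate pk and a unique natural key."""

    def __init__(self, key_fields):
        self.key_fields = tuple(key_fields)
        self._rows = {}  # natural-key tuple -> row dict
        self._ids = count(1)

    def get_or_create(self, data):
        # Match on the natural key fields only; the surrogate pk never
        # has to appear in the incoming data.
        key = tuple(data[f] for f in self.key_fields)
        row = self._rows.get(key)
        if row is not None:
            return row, False
        row = {"pk": next(self._ids), **data}
        self._rows[key] = row
        return row, True


# Hypothetical composite key: a warehouse is identified by (region, code).
warehouses = NaturalKeyStore(key_fields=("region", "code"))

a, a_created = warehouses.get_or_create({"region": "eu", "code": "W1", "label": "Main"})
b, b_created = warehouses.get_or_create({"region": "eu", "code": "W1", "label": "Renamed"})
c, c_created = warehouses.get_or_create({"region": "us", "code": "W1", "label": "Other"})
```

As with a plain get_or_create, a match returns the existing row unchanged; updating non-key fields on a match would be a separate (upsert-style) decision.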

claytondaley avatar Nov 15 '18 21:11 claytondaley

I need this functionality as well. Any progress or thoughts? @claytondaley, I saw #57.

I asked about it on Stack Overflow.

One example of desired behavior is that if the example user data on the front page were saved twice, there should be no new objects created, if Site were uniquely identified by url, User by username, AccessKey by key, and Avatar by image.

I'm not saying this is the only case, but I think it is common that a model object can be identified by some combination of fields that have external meaning (unlike the pk), and some external entity wants to get-or-create those things.

dfrankow avatar Apr 18 '19 22:04 dfrankow

Per my last comment in #57, the version found in my fork is working, but not fully integrated into this package. Is this a new project where you could try my new classes without the integration?

claytondaley avatar Apr 19 '19 01:04 claytondaley

Should I try your fork? I'd like a solution, and it's a new project, but we need to (eventually) use it in production. I notice your fork is "37 commits ahead and 17 commits behind," i.e. a true fork. Do you think you'll maintain it? I don't mean to be unfriendly, just trying to figure out the right way forward for us.

dfrankow avatar Apr 19 '19 16:04 dfrankow

Per #57, the plan (approved in principle by these maintainers) is to merge it into this project. My fork is only "behind" because the code is completely independent right now (so there's no lost functionality). If the code works for you and you need me to expedite the merge (e.g. so you have pip access) I can spend some extra time getting it across the finish line.

claytondaley avatar Apr 19 '19 16:04 claytondaley

I cannot and would not force you to do work for free. However, it would increase my confidence that all issues have been resolved if it were in pip. We're certainly using pip for our project.

I'll try your fork first (maybe next week), to see if I understand it.

By the way, I looked on your personal site for an email to chat outside this issue, and it says "[email protected]". :)

dfrankow avatar Apr 19 '19 17:04 dfrankow

> I need this functionality as well. Any progress or thoughts? @claytondaley, I saw #57.
>
> I asked about it on Stack Overflow.
>
> One example of desired behavior is that if the example user data on the front page were saved twice, there should be no new objects created, if Site were uniquely identified by url, User by username, AccessKey by key, and Avatar by image.
>
> I'm not saying this is the only case, but I think it is common that a model object can be identified by some combination of fields that have external meaning (unlike the pk), and some external entity wants to get-or-create those things.

I think you can implement upsert behavior using custom logic in your viewset. But in general, if you post the same object without a pk twice, it won't work with this package. Could you please try #57 and give feedback? It would be very helpful.

ruscoder avatar Apr 19 '19 17:04 ruscoder

> @oliver-zhou I think you are approaching your problem the wrong way. drf-writable-nested propagates saving (i.e. create/update) to the sub-serializer. This is the logical thing to do; the package is supposed to nest serializers, after all. So the simple, correct, and only way to handle this (if you don't know the company pk) is to write a custom CompanySerializer.create method. If you analyze the situation, your problem is about getting or creating Company objects, so naturally your CompanySerializer should handle it.

How do you get this to work with sub-sub-serializers? If I patch the create method of my sub-serializer with get_or_create, this only works if the sub-serializer is not itself a WritableNestedModelSerializer; otherwise it overrides the create method provided by WritableNestedModelSerializer...
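
One possible direction, sketched without DRF: have the override look up an existing object first and fall through to the inherited create via super() when nothing matches, so the nested-saving logic is kept instead of replaced. `NestedBase` below is a hypothetical stand-in for a base class whose create() saves nested children; whether WritableNestedModelSerializer.create cooperates in the same way is an assumption that would need checking against the package.

```python
class NestedBase:
    """Hypothetical stand-in for a base class whose create() saves nested children."""

    nested = {}  # field name -> child serializer class

    def create(self, validated_data):
        data = dict(validated_data)
        # Propagate the save to each nested child before storing the parent.
        for field_name, child_cls in self.nested.items():
            if field_name in data:
                data[field_name] = child_cls().create(data[field_name])
        data["pk"] = len(self.storage) + 1
        self.storage.append(data)
        return data


class GetOrCreateMixin:
    """Look up by lookup_field first; otherwise defer to the inherited create()."""

    lookup_field = None

    def create(self, validated_data):
        wanted = validated_data[self.lookup_field]
        for row in self.storage:
            if row[self.lookup_field] == wanted:
                return row
        # No match: keep the nested-saving behaviour of the base class.
        return super().create(validated_data)


class CountrySerializer(GetOrCreateMixin, NestedBase):
    storage = []
    lookup_field = "code"


class CompanySerializer(GetOrCreateMixin, NestedBase):
    storage = []
    lookup_field = "name"
    nested = {"country": CountrySerializer}


class JobSerializer(NestedBase):
    storage = []
    nested = {"company": CompanySerializer}


payload = {
    "name": "Job Name",
    "company": {"name": "Company Name ABC", "country": {"code": "SE"}},
}
job1 = JobSerializer().create(payload)
job2 = JobSerializer().create(payload)
```

Here the company and its nested country are each created once and reused on the second call, while the two jobs remain distinct.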

joshsny avatar May 17 '21 09:05 joshsny