repc icon indicating copy to clipboard operation
repc copied to clipboard

eliminate diffserver and move to diffsync

Open phritz opened this issue 4 years ago • 13 comments

Proposal: https://www.notion.so/Differential-Client-View-12be3a636c9f404b88d49ecbd100a694

TODOs in repc:

  • [x] remove checksum from pull and checksum itself. @phritz
    • it is a question if we do this in a breaking vs nonbreaking way (eg, remove checksum from commit vs not). think it makes sense to do this in a breaking way bc there are commit changes beyond checksum (storing cookie for example). which precipitates the question of what to do with existing customers and how to manage this kind of change in the future.
  • [x] split BeginSync into separate push and pull calls. Call them TryPush and TryPull. Move pull from mod.rs into its own source file. it's a good question as to whether to do this work before the next item, or as part of it, whoever is doing the work can see which seems like less work. @arv
    • [x] Update tests to only test pull and push respectively @arv
  • [x] eliminate syncinfo if it exists
  • [x] replace batch_push_info and client_view_info with http_request_info and have it be returned by tryPush and beginTryPull @arv
  • [x] Update TryPush to have: push_url, push_auth (this is just a rename) @arv
  • [x] Update TryPull to have: pull_url, pull_auth (just a rename). @arv
    • [x] Also rename TryBeginPull to BeginTryPull.
  • [x] ensure js tracks the renames and changes above and has a sync integration test
  • [x] replace the existing pullrequest/response and clientviewrequest/response with a new pullrequest. This pullrequest should have at minimum: clientid, last_mutation_id, cookie, pullversion (formerly version). Pullresponse should have cookie, lastmutationid, patch. Cookie goes into the commit. Remove the concept of stateid everywhere (or replace it with cookie and ensure cookie does not have semantics). @phritz
    • [x] data layer should not send clients back in time
    • [x] ensure this change is tracked in js
  • [x] Change patch value to be JSON, not a string because we no longer care about checksumming. @phritz
  • [x] add a pushversion to pushrequest so we can change it when we need to (we have already used the pullversion extensively) @arv
  • [x] add an appversion to trypush and trypull and include it in the pull and push requests so they can version their client view
    • [x] ensure this change is tracked in js
  • [x] rename sync id to request id and use a new one every time we do a pull or a push and include it in the push and pull requests (so they can tie their client side logs to server logs. we have used this extensively on our end) @arv
  • [x] rename batchpushrequest/respone => pushrequest/response. @arv
  • [x] remove from begintrypullrequest diffserverurl/auth @phritz
    • [x] ensure this change is tracked in js
  • [x] Remove PullResponse http_request_info. BeginTryPullResponse still needs it though @arv
  • [x] ensure log lines are consistent with diffsync
    • [x] log when we complete a pull (old+new lmid, cookie, and maybe value hash)
  • [x] cut diffs out of release and ensure it works. eg see https://github.com/rocicorp/repc/blob/7266de2338ea831e05c79700bdeaeb564be1fb3a/tool/bump/bump.go#L77
  • [x] Take a pass on naming. We are sticking with camelCase for now.

Other TODOs:

  • [x] update sample applications to reflect new interfaces, introducing a cookie timestamp to relevant tables
  • [ ] figure out how to migrate existing customers and maybe file an issue for figuring out commit versioning strategy going forward
  • [x] clean up diffserver issues. Leave bugs, but close issues that are speculative (eg, "we should make the diffserver optional") and remove diffserver issues from milestones.
  • [x] add deprecation warning to diffs repo
  • [x] take a pass over docs eliminating reference to diffserver and ensuring names of things are consistent with above
  • [ ] take a pass over designdoc ensuring that it reflects naming and concepts from above
  • [x] suggest data layer clientview lastmutationid and maybe cookie timestamp checking logic somewhere, maybe in checklist
  • [ ] strongly consider working on an integration test once these changes are done

phritz avatar Feb 12 '21 21:02 phritz

Some questions the code reviews triggered (sorry these are late):

This pullrequest should have at minimum: clientid, last_mutation_id, cookie, pullversion (formerly version) add a pushversion to pushrequest so we can change it when we need to (we have already used the pullversion extensively) @arv

I'm wondering if maybe we should just send Replicache's current semver (from the js lib) instead.

The reason that I like this is:

  • (a) semver is supposed to encapsulate breaking changes in the major version. Changing the server protocol would definitely count, so any (breaking) protocol change should definitely result in a change to the library's major version, no matter what
  • (b) It's possible that the Replicache version is more intuitively meaningful to developers (version 2 of Replicache adds diff sync, version 1 didn't have it)
  • (c) It can also represent non-breaking changes the same way.
  • (d) fewer versions floating around seems 'easier' to me somehow.

On the other hand there's something to be said for being specific I suppose.

add an appversion to trypush and trypull and include it in the pull and push requests so they can version their client view

We are going to want developers to be able to parameterize the pull requests at least in a general way. For example to:

  • change the "window" of data they want to sync
  • specify the priority of data they are pulling

So given that I'm not sure that we need a specific API for sending the app version yet. Can we hold off until we sketch out these other features?

aboodman avatar Feb 27 '21 20:02 aboodman

I'm wondering if maybe we should just send Replicache's current semver (from the js lib) instead.

I retract this idea. It's too weird for the rust library to not be self-contained. When I see a PR go by that breaks push or pull, I want to see that we did something specific about it, and not rely on the fact that the JS library version will be properly bumped later.

This brings me to my next question though:

What are the semantics of this number? Is it a breaking change? Any kind of change?

I feel like it's conventional in web service API design to not have a version number for non-breaking changes, so I assume this is only for breaking changes.

Edit Deleted idea about communicating version in URL. Too cute.

aboodman avatar Feb 27 '21 20:02 aboodman

+1 to keeping the JS API version and the protocol version independent. The JS API version has changed many times in the last year (5 times).

Yes, the protocol version is not backwards compatible except for us keeping it compatible enough that the server can do version detection and branch on that.

arv avatar Mar 01 '21 16:03 arv

I think we should use an HTTP header for the version instead of putting it in JSON. If we put it in JSON it gets harder to change the format of the data (from JSON to MessagePack for example). It is still doable but then we need to also start looking at the content type header.

arv avatar Mar 01 '21 18:03 arv

What are the semantics of this number? Is it a breaking change? Any kind of change?

The convention is and has been that we increment it for a backwards-incompatible change and new feature activation. Here's a list of such changes (and their new versions) up to this point for this particular message: https://github.com/rocicorp/diff-server/blob/cd1fca9b2e9f37a1631e5bef75c3d906f4298fd0/serve/types/types.go#L10

I think we should use an HTTP header for the version

Generally agree that auth and message encoding are two things that typically are best sent out of band of messages (eg, HTTP header, or during connection setup, etc). Practically speaking having things in the headers AND in the message has been a PITA up to this point and so my preference has been to stick everything but auth in the message -- there's just one place to look. My sense is to just stick stuff in the message for now to keep things simple and if/when we want to change encodings we talk about how that should work. We'll have more info then anyway, eg about whether we are even going to go over HTTP for messages. I don't actually care too much tho, except to not spend time on this right now so if you feel strongly, go for it.

We are going to want developers to be able to parameterize the pull requests at least in a general way. For example to: change the "window" of data they want to sync specify the priority of data they are pulling So given that I'm not sure that we need a specific API for sending the app version yet. Can we hold off until we sketch out these other features?

The probability is 1 that someone launching with replicache needs to specify what schema of clientview their app understands. Otherwise they cannot change it. So they need some mechanism, giving them a string to fill seems like a fine way to start doing that. If there is a fancier way we want to do that then we increment the pull request version when we add that fancier method, which might obviate the app version. But seems like the app must have the ability to signal to the pull endpoint what kind of client view data the app expects to receive. Those who don't want to use it don't have to. I don't care if we call it something else, maybe client_view_schema_version is more specific.

phritz avatar Mar 01 '21 19:03 phritz

+1 to keeping the JS API version and the protocol version independent. The JS API version has changed many times in the last year (5 times).

I'm not sure if you're agreeing with me. I think it is critical that if we change the server protocol backward incompatibly, we also bump the major version of the js library.

That is the js lib version should encapsulate the protocol version.

aboodman avatar Mar 01 '21 19:03 aboodman

@phritz - httpRequestInfo will go away on PullResponse right?

aboodman avatar Mar 01 '21 20:03 aboodman

No. You need a signal for 4xx.

On Mon, Mar 1, 2021, 10:05 AM Aaron Boodman [email protected] wrote:

@phritz https://github.com/phritz - httpRequestInfo will go away on PullResponse right?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rocicorp/repc/issues/290#issuecomment-788233799, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABGLYOEDCKVD7ABMHE3J7DTBPXRVANCNFSM4XRL4J3Q .

phritz avatar Mar 01 '21 20:03 phritz

Oh wait, yes, from pullresponse!

On Mon, Mar 1, 2021, 10:21 AM Fritz Schneider [email protected] wrote:

No. You need a signal for 4xx.

On Mon, Mar 1, 2021, 10:05 AM Aaron Boodman [email protected] wrote:

@phritz https://github.com/phritz - httpRequestInfo will go away on PullResponse right?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/rocicorp/repc/issues/290#issuecomment-788233799, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABGLYOEDCKVD7ABMHE3J7DTBPXRVANCNFSM4XRL4J3Q .

phritz avatar Mar 01 '21 20:03 phritz

The probability is 1 that someone launching with replicache needs to specify what schema of clientview their app understands. Otherwise they cannot change it.

OK I guess this is not strictly true. The data layer could return a client view schema version as part of the client view itself, and onPull the app could check that the clientview is a version it understands, otherwise do $something if not (and potentially do some undefined behavior if the user or app is doing stuff after pull but before onpull does $something).

But maybe more importantly the app, say a dev version, can't ask for a new schema explicitly without a field like this. And not just even a new schema, potentially new features can't be activated for newer versions without something like this.

phritz avatar Mar 01 '21 22:03 phritz

OK wrt docs in particular, I'm pushing a bigger doc refresh to a different bug, so the changes for this release will be more targeted. Though this does integrate the main suggestion from: https://www.notion.so/Small-docs-reorg-PAUSE-REVIEW-INSTEAD-SEE-PROBLEM-STATEMENT-under-this-workspace-22f89863ff8949149a09f4da4dbde7d6.

Looking at https://js.replicache.dev/ and https://github.com/rocicorp/replicache, these are the changes I'm planning:

  • [x] Remove references to samples that no longer exist and replace with replidraw
    • Note: I do see @arv point that it would be nice to have a smaller/simpler todo style demo here. Argh, maybe we do need three samples ideally: a trivial todo sample for people to clone and get going with, replidraw, and repliear. Anyway this can be a separate bug.
  • [x] Clean up Replidraw to be easy to clone (meaning add instructions for setting up aws, etc)
  • [x] Combine the existing 'server setup' guide and 'getting started' docs and rename to 'integration guide'. Update to not need diff server. Put this in the replicache repo.
  • [x] Update the top-level README in the Replicache repo to be a directory to other doc resources
  • [x] Fix the top-level README in js.replicache.dev to not be a directory :)

aboodman avatar Mar 19 '21 16:03 aboodman

Only further suggestion I have to your list is to take another look at the first two sections of the designdoc and see if you find they should be updated to be more consistent with new product positioning. As they are, they feel very offline-firsty.

phritz avatar Mar 24 '21 23:03 phritz

offline-thirsty you mean.

I'll show myself out.

aboodman avatar Mar 25 '21 20:03 aboodman