Diego syncing unnecessarily twice
Issue
When running against mysql, timestamps do not have sub second resolution. This means that every single time a process is updated, it synchronizes with diego twice. The first time when the call is made, the second in the syncronization cycle. This is because the diego lrp and the data from capi's database on the timestamp will almost always differ. This doesn't seem to be the case for postgres based on capi unit tests, but havn't verified. This is not caught by the diego sync tests because we set the annotation for the tests by....using the value we've already stored in the database for the data syncing test. Given we're highly unlikely to update processes multiple times a second, I suggest we reduce the precision for this test to the second precision.
Context
Steps to Reproduce
Run CF. Push an application. See that capi updates diego twice.
Expected result
It should set it once
Current result
It updates twice.
process.updated_at: 1673287766.0
diego_lrp.annotation: 1673287766.4441805
$ while true; do cfdot desired-lrps | jq -r '.annotation + "\t" + (.routes["internal-router"] | last | .hostname)'; sleep 0.25; done
$ cf map-route dora apps.internal --hostname dora6
1673288114.5011706 dora6.apps.internal
1673288114.0 dora6.apps.internal
Possible Fixes
- Update capi to have the correct precision in the database equal to diego's
- Update capi to only send the value after it has gone through it's own database constraints
- always record at a second interval as soon as it comes in the request.
- fix our code here https://github.com/cloudfoundry/cloud_controller_ng/blob/e92ef7bef7df84bd72c8ab3411573c978695e8f1/lib/cloud_controller/diego/processes_sync.rb#L36 to have a better comparison
- fix our code there to have 1 second "error bars"
As there was already another recent change to support microsecond timestamps (see #3484) and all the MySQL versions we are supporting already do support this, I would go for option 1.
As the processes table is much larger than the asg_timestamps table (which only has a single record), I'm unsure about the migration.