Remove Websocket code from Alfred
Description
Removing the socket.io server from Alfred. Going forward, only Nexus will handle the deltaStream. That is, it will be the only websocket microservice for the client connections.
In Nexus lambda, we do not need connectionCountLogger anymore. Its not needed as we have the new connectionCount metric using socket. Please remove its reference from nexus and also remove the connectionCountLogger class/interface as well. Its related config should be removed as well.
Need to update localOrdererConnection.ts files as well.
take a look at server/routerlicious/packages/routerlicious-base/src/test/alfred/io.spec.ts as well
@znewton , could you please review this PR from client-code point of view. If there are any logs/comments that we should update on client side to reflect this shift to nexus, we should do that.
⯅ @fluid-example/bundle-size-tests: +187 Bytes
| Metric Name | Baseline Size | Compare Size | Size Diff |
|---|---|---|---|
| aqueduct.js | 511.14 KB | 511.18 KB | ⯅ +44 Bytes |
| connectionState.js | 680 Bytes | 680 Bytes | ■ No change |
| containerRuntime.js | 245.01 KB | 245.03 KB | ⯅ +22 Bytes |
| loader.js | 170.65 KB | 170.67 KB | ⯅ +22 Bytes |
| map.js | 46.53 KB | 46.54 KB | ⯅ +11 Bytes |
| matrix.js | 148.68 KB | 148.69 KB | ⯅ +11 Bytes |
| odspDriver.js | 97.34 KB | 97.37 KB | ⯅ +33 Bytes |
| odspPrefetchSnapshot.js | 42.28 KB | 42.3 KB | ⯅ +22 Bytes |
| sharedString.js | 167.42 KB | 167.43 KB | ⯅ +11 Bytes |
| sharedTree.js | 334 KB | 334 KB | ■ No change |
| Total Size | 1.87 MB | 1.87 MB | ⯅ +187 Bytes |
Baseline commit: c8e6011bdbfc1253d220f1d16cd142efeb13ca98
Generated by :no_entry_sign: dangerJS against dc52817a53a6e34729921d8bdbe6d8767801901f
One general concern from the FF side here: if alfred stops being the one handling deltaStream connections, doesn't the Helm chart need to be updated so that it now spins up Nexus as well to take over? @alexvy86
This is a good catch. Do you know if there is a pipeline that deploys this or what would be a good way to validate this kind of deployment? Is there anyone using these charts directly?
This is a good catch. Do you know if there is a pipeline that deploys this or what would be a good way to validate this kind of deployment? Is there anyone using these charts directly?
Yes, the FF team uses them to deploy an internal instance of routerlicious used by our end-to-end tests. The combination of chart - routerlicious and deploy - routerlicious package and deploy the helm charts to an internal AKS cluster that our end-to-end tests pipeline targets. However, there's two environments in that AKS cluster; we automatically deploy to one that is not used by anything (named "ppe"; it's there mostly just to validate that the deployment doesn't fail and that the containers on their own aren't crashing), but deployment to the one that our e2e tests pipeline actually uses requires a manual approval (which I can do).
To validate that the environment created by the charts works as it should, what we can do is create a branch where the e2e tests pipeline targets the ppe environment, so we can see if they still pass after the changes to the charts. I can prepare that branch and hand it over to you so you can run it as you need to make sure tests are passing at the same rate they do when they target the "prod" (non-ppe) environment in AKS (usually 99+% pass rate). Let's sync offline to coordinate.
One thing I just ran into, is that this split might require more significant changes on our end to support, e.g. https://github.com/microsoft/FluidFramework/blob/main/packages/test/test-drivers/src/routerliciousTestDriver.ts#L57-L62
One thing I just ran into, is that this split might require more significant changes on our end to support, e.g. https://github.com/microsoft/FluidFramework/blob/main/packages/test/test-drivers/src/routerliciousTestDriver.ts#L57-L62
For this the discoveryResponse should override the parameters declared here. The comment a few lines above seems to say so for the other properties. Nexus relies on deltaStreamUrl being set
Not sure if I missed this in the above discussion, but this change will mean that NGINX re-routing of Alfred socket endpoints to Nexus will be a requirement to ensure that older versions of routerlicious-driver can still communicate with R11s. I'm not sure if FF's deployment of R11s has NGINX included; if it does, it needs to be updated with routing rules, and if not, we need to solve for that otherwise compat tests will probably fail