FluidFramework icon indicating copy to clipboard operation
FluidFramework copied to clipboard

Remove Websocket code from Alfred

Open alteut opened this issue 2 years ago • 10 comments

Description

Removing the socket.io server from Alfred. Going forward, only Nexus will handle the deltaStream. That is, it will be the only websocket microservice for the client connections.

alteut avatar Jan 12 '24 00:01 alteut

In Nexus lambda, we do not need connectionCountLogger anymore. Its not needed as we have the new connectionCount metric using socket. Please remove its reference from nexus and also remove the connectionCountLogger class/interface as well. Its related config should be removed as well.

sharmakhushboo avatar Jan 12 '24 19:01 sharmakhushboo

Need to update localOrdererConnection.ts files as well.

sharmakhushboo avatar Jan 18 '24 00:01 sharmakhushboo

take a look at server/routerlicious/packages/routerlicious-base/src/test/alfred/io.spec.ts as well

sharmakhushboo avatar Jan 18 '24 00:01 sharmakhushboo

@znewton , could you please review this PR from client-code point of view. If there are any logs/comments that we should update on client side to reflect this shift to nexus, we should do that.

sharmakhushboo avatar Jan 18 '24 00:01 sharmakhushboo

@fluid-example/bundle-size-tests: +187 Bytes
Metric NameBaseline SizeCompare SizeSize Diff
aqueduct.js 511.14 KB 511.18 KB +44 Bytes
connectionState.js 680 Bytes 680 Bytes No change
containerRuntime.js 245.01 KB 245.03 KB +22 Bytes
loader.js 170.65 KB 170.67 KB +22 Bytes
map.js 46.53 KB 46.54 KB +11 Bytes
matrix.js 148.68 KB 148.69 KB +11 Bytes
odspDriver.js 97.34 KB 97.37 KB +33 Bytes
odspPrefetchSnapshot.js 42.28 KB 42.3 KB +22 Bytes
sharedString.js 167.42 KB 167.43 KB +11 Bytes
sharedTree.js 334 KB 334 KB No change
Total Size 1.87 MB 1.87 MB +187 Bytes

Baseline commit: c8e6011bdbfc1253d220f1d16cd142efeb13ca98

Generated by :no_entry_sign: dangerJS against dc52817a53a6e34729921d8bdbe6d8767801901f

msfluid-bot avatar Jan 19 '24 20:01 msfluid-bot

One general concern from the FF side here: if alfred stops being the one handling deltaStream connections, doesn't the Helm chart need to be updated so that it now spins up Nexus as well to take over? @alexvy86

This is a good catch. Do you know if there is a pipeline that deploys this or what would be a good way to validate this kind of deployment? Is there anyone using these charts directly?

alteut avatar Jan 22 '24 22:01 alteut

This is a good catch. Do you know if there is a pipeline that deploys this or what would be a good way to validate this kind of deployment? Is there anyone using these charts directly?

Yes, the FF team uses them to deploy an internal instance of routerlicious used by our end-to-end tests. The combination of chart - routerlicious and deploy - routerlicious package and deploy the helm charts to an internal AKS cluster that our end-to-end tests pipeline targets. However, there's two environments in that AKS cluster; we automatically deploy to one that is not used by anything (named "ppe"; it's there mostly just to validate that the deployment doesn't fail and that the containers on their own aren't crashing), but deployment to the one that our e2e tests pipeline actually uses requires a manual approval (which I can do).

To validate that the environment created by the charts works as it should, what we can do is create a branch where the e2e tests pipeline targets the ppe environment, so we can see if they still pass after the changes to the charts. I can prepare that branch and hand it over to you so you can run it as you need to make sure tests are passing at the same rate they do when they target the "prod" (non-ppe) environment in AKS (usually 99+% pass rate). Let's sync offline to coordinate.

alexvy86 avatar Jan 23 '24 16:01 alexvy86

One thing I just ran into, is that this split might require more significant changes on our end to support, e.g. https://github.com/microsoft/FluidFramework/blob/main/packages/test/test-drivers/src/routerliciousTestDriver.ts#L57-L62

alexvy86 avatar Jan 23 '24 16:01 alexvy86

One thing I just ran into, is that this split might require more significant changes on our end to support, e.g. https://github.com/microsoft/FluidFramework/blob/main/packages/test/test-drivers/src/routerliciousTestDriver.ts#L57-L62

For this the discoveryResponse should override the parameters declared here. The comment a few lines above seems to say so for the other properties. Nexus relies on deltaStreamUrl being set

alteut avatar Jan 26 '24 00:01 alteut

Not sure if I missed this in the above discussion, but this change will mean that NGINX re-routing of Alfred socket endpoints to Nexus will be a requirement to ensure that older versions of routerlicious-driver can still communicate with R11s. I'm not sure if FF's deployment of R11s has NGINX included; if it does, it needs to be updated with routing rules, and if not, we need to solve for that otherwise compat tests will probably fail

znewton avatar Jan 29 '24 19:01 znewton