Adds mounts to fly backend (yep, the third one)

Open terrcin opened this issue 6 months ago • 2 comments

I've had a sudden need to add mounts to Fly FLAME servers so we can shift infrequent, long-running, CPU-intensive data exports off our main app machines and into the background where they won't bother anybody. So I'm picking up the torch for this work.

To achieve this, I initially forked from https://github.com/phoenixframework/flame/pull/22 to maintain @benbot's commit credit, as they did the initial heavy lifting for this work. I then updated to the latest and resolved all merge conflicts, but the result didn't compile because the way HTTP requests are made has changed.

My contribution to this PR is resolving the compile errors and then making a bunch of improvements to add resilience:

  • refactored FlyBackend.http_post! into FlyBackend.http_request! to add GET support
  • update FlyBackend.Mount to be parsed the same way that FlyBackend is
  • additionally filter Fly volumes by the region to ensure they match the machine spec if that's set
  • also check the volume's "host_status" is "ok"
  • re-fetch the list of volumes each time the machine create request fails; to support this, the body opt is now a function
  • shuffle the volume list to avoid always picking the same one, as a create machine failure might be due to lack of capacity on the volume's host
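
The volume selection and retry behaviour described in the list above can be sketched roughly as follows. This is an illustrative Python sketch, not the Elixir code in this PR: `eligible_volumes`, `create_machine_with_volume`, `MachineCreateError`, and the injected `list_volumes`, `create_machine`, and `build_body` callables are all hypothetical names, and the volume dicts assume only the `region` and `host_status` fields mentioned above plus an `id`.

```python
import random


class MachineCreateError(Exception):
    """Hypothetical error raised when a machine create request fails."""


def eligible_volumes(volumes, region=None):
    """Keep volumes whose host is healthy and, if a region is set, that match it."""
    return [
        v for v in volumes
        if v.get("host_status") == "ok"
        and (region is None or v.get("region") == region)
    ]


def create_machine_with_volume(list_volumes, create_machine, build_body,
                               region=None, max_attempts=3, rng=random):
    """Try to create a machine, re-fetching and re-shuffling volumes per attempt.

    list_volumes:   callable returning the current list of volume dicts
    create_machine: callable taking a request body; raises MachineCreateError on failure
    build_body:     callable taking the chosen volume and returning the request body
                    (the body is a function so it can be rebuilt on every attempt)
    """
    last_error = None
    for _ in range(max_attempts):
        # Re-fetch on every attempt so stale or newly attached volumes drop out.
        candidates = eligible_volumes(list_volumes(), region)
        if not candidates:
            raise MachineCreateError("no eligible volumes available")
        # Shuffle so a volume on a host without capacity is not always picked first.
        rng.shuffle(candidates)
        try:
            return create_machine(build_body(candidates[0]))
        except MachineCreateError as err:
            last_error = err
    raise last_error
```

Passing the body as a function rather than a fixed value is what lets each retry pick a different volume; with a static body, every retry would target the same volume that just failed.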

Note: we currently have this running on a QA server without issue, and I plan future work to allow creating volumes if none are available, to limit the number of big volumes hanging around unused.

terrcin, Jul 04 '25 06:07

Update: we've had this in our production and multiple feature environments for the past month and it's been working great so far.

terrcin, Aug 28 '25 05:08

Hi, would love to get this reviewed by someone so it's official!

We also have a sudden need for this. The ephemeral filesystem on fly.io is extremely slow, which would defeat the purpose of FLAME because we have to read/write artifacts to disk.

For example, a 200 MB document takes ~25 seconds to write to "/tmp" and 4 seconds to write to an actual volume.

Will report back on how this works for us in production.

venkatd, Sep 11 '25 13:09