lemmy icon indicating copy to clipboard operation
lemmy copied to clipboard

Use ReadySet for increased database read performance

Open StyMaar opened this issue 2 years ago • 5 comments

Requirements

  • [X] Is this a feature request? For questions or discussions use https://lemmy.ml/c/lemmy_support
  • [X] Did you check to see if this issue already exists?
  • [X] Is this only a feature request? Do not put multiple feature requests in one issue.
  • [X] Is this a backend issue? Use the lemmy-ui repo for UI / frontend issues.

Is your proposal related to a problem?

Instances are easily overloaded when a specific post is shared over social medias and many people come to see it.

Describe the solution you'd like.

ReadySet is a caching layer that goes on top of your database and make the reads much faster. The cache is automatically updated when a write comes.

Describe alternatives you've considered.

Hand-rolling a cache using a key-value store system, but this is much more work, and brittle at the same time.

Additional context

No response

StyMaar avatar Jul 01 '23 04:07 StyMaar

In general this sounds like an interesting tool. Historically "magic" database caching never worked well for me personally, though seeing the concepts and the source of this solution would make me try again.

One thing to keep in mind: The data-update mechanism seems to be based on logical postgres replication. This means that using the cache would be subject to the same issue as using a readonly follower database: it can take seconds for a change you write to be readable on the follower, or in this case, on the cache (I didn't test this specific to this caching layer, this judgement is purely based on the described architecture).

That being said: having readonly follower databases, or this cached layer, as way of scaling is a very valid approach. Typically this could be implemented as optional secondary database(s), so each task or handler can decide if they need 100% up-to-date data, or can read from the faster but slightly outdated follower (or cache). Every instance admin can set this up as they wish.

syphar avatar Jul 01 '23 06:07 syphar

It uses PosgreSQL replication features to follow the logs and determine when cached content needs updated. This approach requires the least amount of surgery to Lemmy's Rust code, like you say in the "alternates". It also starts out altering nothing and passes data to the live database, you determine how caching gets enabled.

The lemmy_server code right now is repeating a lot of cacheable queries for things like the local instance's own settings, language codes, who an administrator is, the "trending" communities, etc. It's beyond crisis.

EDIT: revisiting this 24 hours later

Reading here, https://docs.readyset.io/reference/sql-support/ ReadySet is going to double your server storage requirements, and it also runs a second instance of PostgreSQL from what I understand, so your RAM usage is going to go up too. I wonder how you would allocate memory to which PostgreSQL for when queries that aren't cached miss.

RocketDerp avatar Jul 01 '23 12:07 RocketDerp

@RocketDerp

and it also runs a second instance of PostgreSQL from what I understand, I'm not sure where you got that from.

@syphar

This means that using the cache would be subject to the same issue as using a readonly follower database: it can take seconds for a change you write to be readable on the follower

This is indeed the case

StyMaar avatar Jul 03 '23 13:07 StyMaar

discussion with developers of ReadySet here: https://github.com/readysettech/readyset/discussions/112

looks like I was wrong/confused, ReadySet will interface with the primary postgress database and the differing credentials are explained.

RocketDerp avatar Jul 03 '23 13:07 RocketDerp

Reading a little more about ReadySet, it wouldn't automatically start caching but only do this for explicitly defined queries.

So we would be back to needed code changes where lemmy needs to tell ReadySet which queries to cache, and which not. Or every instance admin does this on their own.

syphar avatar Jul 04 '23 05:07 syphar

This sounds like something which can be deployed by instance admins without any changes to Lemmy code. So no need for an issue here. Also see https://github.com/LemmyNet/lemmy/issues/3005.

Nutomic avatar Oct 20 '23 11:10 Nutomic