framework icon indicating copy to clipboard operation
framework copied to clipboard

Persistent Spike in RDS Proxy Database Connections Since L10 -> L11

Open Nutickets opened this issue 9 months ago • 13 comments

Laravel Version

11.44.2

PHP Version

8.2.18

Database Driver & Version

MySQL 8.0.40 on RDS via RDS Proxy

Description

When we deployed our Laravel 10 -> 11 upgrade in April, there was a spike in DatabaseConnections through our RDS proxy which has since persisted (screenshot included below). We did not deploy any other changes to the AWS infrastructure - there were a high number of 3rd party Laravel package upgrades alongside it as expected with any major framework upgrade, however there's nothing within those upgrades that stand out as something that could introduce such a universally impactful change.

We have suspicions that it could be related to this change: https://github.com/laravel/framework/pull/50044/files but haven't yet A/B tested this as it's not feasible for us to rollback to L10 at this stage. The connections seem to be idle and it may be possible for us to workaround the issue by playing with the RDS proxy idle connection timeout settings, but preferably would like to understand the root cause.

Image

Steps To Reproduce

I'm not sure exactly how to describe reproduction steps at this time, just raising awareness in case there are other people who have noticed a similar outcome or can perhaps educate me on how to navigate this situation as it feels possibly like a time bomb since our application experiences extremely high traffic surges during large event launches.

Nutickets avatar May 07 '25 12:05 Nutickets

Could this be related to the introduction of cache table? We noticed that in laravel 12 vs laravel <=10 there are more queries executed.

Example L12

    "0.41 ms, sql: select * from `cache` where `key` in ('laravel_cache_operationsIndexesForFiltering')",
    "1.01 ms, sql: select count(*) as aggregate from `operations`",
    "0.3 ms, sql: select * from `operations` limit 10 offset 0"

vs L10

    "60.24 ms, sql: select count(*) as aggregate from `operations`",
    "0.72 ms, sql: select * from `operations` limit 10 offset 0"

The cache implies also an insert that might not be caught by the DebugQueryListener.

in env these 2 default to database not to file in L11 and L12: CACHE_STORE=database 'default' => env('CACHE_STORE', 'database'), SESSION_DRIVER=database 'driver' => env('SESSION_DRIVER', 'database'),

macropay-solutions avatar May 07 '25 16:05 macropay-solutions

Could this be related to the introduction of cache table? We noticed that in laravel 12 vs laravel <=10 there are more queries executed.

We do not use database as a cache store

Nutickets avatar May 07 '25 16:05 Nutickets

@Nutickets Can you confirm you have in your env CACHE_STORE=other than database? (Also SESSION_DRIVER=other than database?)

In L10 the env name is CACHE_DRIVER while in 11 is CACHE_STORE.

This change defaults the cache to database if you did not declare CACHE_STORE after upgrade in env.

PS. Thanks for this issue. We found out what was not ok with our demo page for L 12 thanks to you.

macropay-solutions avatar May 07 '25 16:05 macropay-solutions

Are you sure that this isn't caused by changes made to the database.php config file during the upgrade? In particular, the default framework configuration file doesn't enable PDO::ATTR_PERSISTENT.

GrahamCampbell avatar May 08 '25 23:05 GrahamCampbell

Are you sure that this isn't caused by changes made to the database.php config file during the upgrade? In particular, the default framework configuration file doesn't enable PDO::ATTR_PERSISTENT.

The only changes to our database.php file during the upgrade was the following:

Image

We have not explicitly set a value for PDO::ATTR_PERSISTENT in our project.

Since opening this issue, we've manually reverted the L11 changes to MySqlConnector.php via the service container

// AppServiceProvider.php
public function register()
{
    // HOTFIX: *temporarily* revert the mysql connector logic to what it was pre-laravel-11. More info inside.
    $this->app->bind('db.connector.mysql', MySqlL10Connector::class);
}

The contents of MySqlL10Connector.php were taken from here.

As soon as we deployed that change, the idle database connections started dropping and continued to do so for the following 12 hours until it reached the baseline level that we are used to seeing pre L11 upgrade:

Image

Nutickets avatar May 09 '25 09:05 Nutickets

I believe you should try enabling PDO::ATTR_PERSISTENT in the options config. Can you share you full configuration, without the hostname and password please?

GrahamCampbell avatar May 09 '25 10:05 GrahamCampbell

It kinda looks a lot like there's an RDS proxy bug here.

GrahamCampbell avatar May 09 '25 10:05 GrahamCampbell

Even on Laravel 10 you were opening and closing connections excessively but not using persistent mode. On Laravel 11, it looks like there's some kind of RDS proxy bug that's leaving around dangling connections, possibly due to the more efficient connection bootstrap process uncovering a bug in RDS proxy.

GrahamCampbell avatar May 09 '25 10:05 GrahamCampbell

I believe you should try enabling PDO::ATTR_PERSISTENT in the options config. Can you share you full configuration, without the hostname and password please?

'mysql' => [
    'driver' => 'mysql',
    'charset' => 'utf8mb4',
    'collation' => 'utf8mb4_unicode_ci',
    'prefix' => '',
    'prefix_indexes' => true,
    'strict' => false,
    'engine' => null,
    'timezone'  => '+00:00',
    'options' => extension_loaded('pdo_mysql') ? [
        // NB: for SSL RDS proxy connections, we don't need a cert because the proxy manages them using ACM.
        PDO::MYSQL_ATTR_SSL_CA => env('MYSQL_ATTR_SSL_CA'),
        PDO::MYSQL_ATTR_SSL_VERIFY_SERVER_CERT => env('MYSQL_ATTR_SSL_VERIFY_SERVER_CERT', false),
    ] : [],
],

Even on Laravel 10 you were opening and closing connections excessively but not using persistent mode.

True but we're delegating that responsibility of connection pooling to RDS proxy - at least thats our interpretation of why we may not need to use persistent mode.

On Laravel 11, it looks like there's some kind of RDS proxy bug that's leaving around dangling connections, possibly due to the more efficient connection bootstrap process uncovering a bug in RDS proxy.

I was totally ignorant to how RDS proxy pooled connections until this sudden spike tbh, but now I understand connections are 'pinned' (i.e. no longer re-usable) when certain SET SESSION foo=bar variables are set in MySQL. I assume all of the variables set by the MySqlConnector fall within the tracked variable list meaning they support RDS proxy connection pooling, but perhaps RDS proxy doesn't like the new approach of setting multiple variables in a single round trip?

Although I'd be surprised if it were that simple, because then 100% of RDS proxy users on L11 would have this same spiking problem that we experienced.

Nutickets avatar May 09 '25 12:05 Nutickets

I think you may be the only RDS proxy user who uses Laravel 11...

GrahamCampbell avatar May 09 '25 15:05 GrahamCampbell

True but we're delegating that responsibility of connection pooling to RDS proxy - at least thats our interpretation of why we may not need to use persistent mode.

That's not quite how that works. You still need a persistent connection between the PHP app and the proxy server.

GrahamCampbell avatar May 09 '25 15:05 GrahamCampbell

@GrahamCampbell we also use RDS proxy so we should not update from 10 then?

macropay-solutions avatar May 09 '25 15:05 macropay-solutions

@macropay-solutions I don't think that's necessarily true. With the right config, the upgrade should be fine.

GrahamCampbell avatar May 10 '25 05:05 GrahamCampbell