fix: token LocalCache to DistributedCache
Fix #46165
In a load-balanced platform with multiple application servers, local token caching does not work effectively and leads to login loops. This change stores the token in the distributed cache instead.
- Resolves: In a load-balanced platform without sticky sessions, local token caching does not work and causes a login loop.
Summary
Currently, the token is stored in the local cache. On a load-balanced Nextcloud platform without sticky sessions, the token needs to be stored in the distributed cache when one is configured.
Checklist
- Code is properly formatted
- Sign-off message is added to all commits
The token is stored in oc_authtoken, and the cache is there to reduce database queries.
Changing local to distributed seems like a hack to fix something that's broken somewhere else.
Distributed cache is slow, I would prefer to avoid it.
Can you outline how the local cache can lead to login loops?
Sorry, we tried to explain the problem in the issue.
What we notice in the flow is that when requests come back to Nextcloud after passing through user_saml, the node that generated the token can handle the requests, whereas the second node will return a 401 and redirect to user_saml, where the user is already authenticated. The request then returns to Nextcloud, which generates a new token on the node handling it, but as soon as a request arrives on the second node, that node returns a 401 again and the loop starts.
When the token is stored in Redis, all nodes can retrieve it. Alternatively, a node that cannot find the token in its local cache should look it up and then store it in its local cache.
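To make the difference concrete, here is a minimal sketch of the two setups. It is not Nextcloud code: the `Node` class and the plain dicts are hypothetical stand-ins, with one shared dict playing the role of Redis.

```python
class Node:
    """One application server with a token cache (hypothetical model)."""

    def __init__(self, name, cache):
        self.name = name
        self.cache = cache  # a dict standing in for the cache backend

    def store_token(self, token_hash, user):
        self.cache[token_hash] = user

    def lookup_cached(self, token_hash):
        # Returns the cached user, or None on a cache miss.
        return self.cache.get(token_hash)


# Local caching: every node owns a separate dict.
local_a, local_b = Node("A", {}), Node("B", {})
local_a.store_token("hash1", "alice")
local_hit = local_b.lookup_cached("hash1")  # node B has never seen the token

# Distributed caching: both nodes share one dict (stand-in for Redis).
shared = {}
dist_a, dist_b = Node("A", shared), Node("B", shared)
dist_a.store_token("hash1", "alice")
dist_hit = dist_b.lookup_cached("hash1")  # node B sees what node A stored
```

With separate caches, node B only learns about the token if it falls back to the database; with the shared cache, there is a single cached record for the whole cluster.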
> whereas the second node will return a 401
I don't yet see why this would happen. The second node doesn't have the app token cached, true, but it also doesn't have a negative cache entry. So it will go to the database and should find the row, right?
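The expectation described here can be sketched as a read-through lookup. This is a simplified model, not the actual `PublicKeyTokenProvider` code; the function name, the dict-based cache, and the dict-based database are assumptions for illustration only.

```python
class TokenNotFound(Exception):
    """Raised when the token is in neither the cache nor the database."""


def get_token(token_hash, local_cache, database):
    # 1. Try the node's local cache first.
    cached = local_cache.get(token_hash)
    if cached is not None:
        return cached
    # 2. Cache miss: fall back to the shared database.
    row = database.get(token_hash)
    if row is None:
        raise TokenNotFound(token_hash)
    # 3. Remember the row locally so the next request skips the database.
    local_cache[token_hash] = row
    return row


# Node A created the token, so the row exists in the shared database.
database = {"hash1": "alice"}
node_b_cache = {}  # node B has nothing cached, positive or negative

result = get_token("hash1", node_b_cache, database)
```

Under this model node B answers correctly on its first request, which is why a 401 from the second node is surprising as long as the row is present.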
Yes, it seems that the record in the oc_authtoken table disappears immediately when the issue arises. However, logically, the second node should make a request to the database to update its local cache. In the case of distributed caching, there is ultimately only one cached record for the entire cluster. This is just for discussion because we do not know all the impacts behind it.
We have conducted upgrade tests and detected that the issue appeared between version 27.1.9 and 27.1.10, if that helps to understand the reason!
We detected that right after the SSO login (redirect from SSO to Nextcloud), we get this error and then the redirection loop starts, but the error appears only once:
{
"reqId": "GMZg7KyZVNMpJ9dUVL58",
"level": 3,
"time": "2024-06-27T17:00:15+02:00",
"remoteAddr": "192.168.1.1",
"user": "--",
"app": "core",
"method": "GET",
"url": "/index.php/apps/theming/theme/light.css?plain=0&v=81b4a035",
"message": "Renewing session token failed: Token does not exist: a46eaecb5ab23aa00bce568fdaffbe0de1e1a49c900142d1e23c2c720800c132382cbd3e7c9e74a206b94388c5e43cf7912a5cae01b72ddfbf2edde4843cadc3",
"userAgent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:127.0) Gecko/20100101 Firefox/127.0",
"version": "29.0.3.4",
"exception": {
"Exception": "OC\\Authentication\\Exceptions\\InvalidTokenException",
"Message": "Token does not exist: a46eaecb5ab23aa00bce568fdaffbe0de1e1a49c900142d1e23c2c720800c132382cbd3e7c9e74a206b94388c5e43cf7912a5cae01b72ddfbf2edde4843cadc3",
"Code": 0,
"Trace": [
{
"file": "/var/www/***URL-PLATEFORM***/htdocs/lib/private/Authentication/Token/PublicKeyTokenProvider.php",
"line": 168,
"function": "getTokenFromCache",
"class": "OC\\Authentication\\Token\\PublicKeyTokenProvider",
"type": "->",
"args": [
"*** sensitive parameters replaced ***"
]
},
{
"file": "/var/www/***URL-PLATEFORM***/htdocs/lib/private/Authentication/Token/PublicKeyTokenProvider.php",
"line": 249,
"function": "getToken",
"class": "OC\\Authentication\\Token\\PublicKeyTokenProvider",
"type": "->",
"args": [
"*** sensitive parameters replaced ***"
]
},
{
"file": "/var/www/***URL-PLATEFORM***/htdocs/lib/public/AppFramework/Db/TTransactional.php",
"line": 63,
"function": "OC\\Authentication\\Token\\{closure}",
"class": "OC\\Authentication\\Token\\PublicKeyTokenProvider",
"type": "->",
"args": [
"*** sensitive parameters replaced ***"
]
},
{
"file": "/var/www/***URL-PLATEFORM***/htdocs/lib/private/Authentication/Token/PublicKeyTokenProvider.php",
"line": 248,
"function": "atomic",
"class": "OC\\Authentication\\Token\\PublicKeyTokenProvider",
"type": "->"
},
{
"file": "/var/www/***URL-PLATEFORM***/htdocs/lib/private/Authentication/Token/Manager.php",
"line": 172,
"function": "renewSessionToken",
"class": "OC\\Authentication\\Token\\PublicKeyTokenProvider",
"type": "->"
},
{
"file": "/var/www/***URL-PLATEFORM***/htdocs/lib/private/User/Session.php",
"line": 941,
"function": "renewSessionToken",
"class": "OC\\Authentication\\Token\\Manager",
"type": "->"
},
{
"file": "/var/www/***URL-PLATEFORM***/htdocs/lib/base.php",
"line": 1132,
"function": "loginWithCookie",
"class": "OC\\User\\Session",
"type": "->",
"args": [
"*** sensitive parameters replaced ***"
]
},
{
"file": "/var/www/***URL-PLATEFORM***/htdocs/lib/base.php",
"line": 1039,
"function": "handleLogin",
"class": "OC",
"type": "::"
},
{
"file": "/var/www/***URL-PLATEFORM***/htdocs/index.php",
"line": 49,
"function": "handleRequest",
"class": "OC",
"type": "::"
}
],
"File": "/var/www/***URL-PLATEFORM***/htdocs/lib/private/Authentication/Token/PublicKeyTokenProvider.php",
"Line": 197,
"message": "Renewing session token failed: Token does not exist: a46eaecb5ab23aa00bce568fdaffbe0de1e1a49c900142d1e23c2c720800c132382cbd3e7c9e74a206b94388c5e43cf7912a5cae01b72ddfbf2edde4843cadc3",
"user": "***LOGIN***",
"exception": {},
"CustomMessage": "Renewing session token failed: Token does not exist: a46eaecb5ab23aa00bce568fdaffbe0de1e1a49c900142d1e23c2c720800c132382cbd3e7c9e74a206b94388c5e43cf7912a5cae01b72ddfbf2edde4843cadc3"
}
}
Hello,
It seems that there is a problem in the getToken() function of the PublicKeyTokenProvider.php file. When connecting, this function is called, but the record is not found in the database or in the cache (because of load balancing and the local cache). The token hash is then invalidated ($this->cacheInvalidHash($tokenHash);), which explains the loop.
If we comment out this line, $this->cacheInvalidHash($tokenHash);, authentication works with the local cache.
Hope this information helps you find the cause.
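A small simulation of the behaviour we describe, under stated assumptions: the first lookup on a node misses both the cache and the database, and the row becomes available afterwards. It is not the real provider code; `get_token`, `simulate`, the `INVALID` sentinel, and the `cache_invalid` flag (which mimics commenting out the invalidation line) are hypothetical.

```python
class TokenNotFound(Exception):
    """Raised when a token lookup fails."""


INVALID = object()  # sentinel marking a negative cache entry


def get_token(token_hash, local_cache, database, cache_invalid=True):
    cached = local_cache.get(token_hash)
    if cached is INVALID:
        # Negative entry: reject without asking the database again.
        raise TokenNotFound(token_hash)
    if cached is not None:
        return cached
    row = database.get(token_hash)
    if row is None:
        if cache_invalid:
            local_cache[token_hash] = INVALID  # remember the failure
        raise TokenNotFound(token_hash)
    local_cache[token_hash] = row
    return row


def simulate(cache_invalid):
    database, node_cache = {}, {}
    outcomes = []
    for step in range(2):
        if step == 1:
            # Assumption: the row only becomes visible after the first lookup.
            database["hash1"] = "alice"
        try:
            outcomes.append(get_token("hash1", node_cache, database, cache_invalid))
        except TokenNotFound:
            outcomes.append("401")
    return outcomes


with_negative = simulate(cache_invalid=True)
without_negative = simulate(cache_invalid=False)
```

In this model the node keeps answering 401 once the negative entry is stored, even though the row exists by then; without the negative entry, the second lookup reaches the database and succeeds.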
A different approach: https://github.com/nextcloud/server/pull/46398
Closing because this is fixed by https://github.com/nextcloud/server/pull/46398