pysaml2 icon indicating copy to clipboard operation
pysaml2 copied to clipboard

Normalize Cache Interface

Open ambsw-technology opened this issue 6 years ago • 11 comments

This package only offers in-mem and filesystem caches; neither work correctly in highly scalable systems. I (ignorantly) tried to give the client a Django cache and ran into issues. Unfortunately, the internal implementation has a proprietary interface e.g. set(self, name_id, entity_id, info, not_on_or_after=0). The presence of an entity_id (vs. a simple key-value) is not going to be supported by most (any?) standard caching interfaces.

I'd like to adjust the Cache class to support a more standard cache backend under-the-covers. The interface can be the same, but would be translated into plain k-v operations.

ambsw-technology avatar Sep 26 '19 15:09 ambsw-technology

If anyone needs to work around this issue, I have some helper classes for Django's cache system here, but they could be repurposed to non-Django caching as well. Note that pysaml2 expects mutated objects (i.e. dict) to automatically save so the sync behavior is critical to proper operation.

ambsw-technology avatar Sep 27 '19 00:09 ambsw-technology

Interesting, Where would you put your code? Which files Will be involved?

Is It ready for a pull request?

peppelinux avatar Oct 27 '19 13:10 peppelinux

This is how my DjangoCache shim is used (in Django). Per the Django documentation, reference Cache implementations can be found in django.core.cache.backends. I did some poking around and it looks like most of them (DB excluded) don't depend on actual Django features so they could be used in non-Django projects.

I can PR the DjangoCache shim (possibly under another name) if you want to include it in the package. In a perfect world, this project's base cache implementation (and all the places that use it) would be rewritten to use a more standard interface. I haven't done enough digging to figure out if that's possible in a backwards-compatible way.

ambsw-technology avatar Oct 27 '19 15:10 ambsw-technology

Hi @ambsw-technology I coded in oidc-op something that could be reused in pysaml2. I faced the native implementation of sso sessions in a way that they probably would be reused in other idpy projects.

With this workaround I can use the Django model, or any other storage if properly configured, as it would be a dictionary: https://github.com/peppelinux/django-oidc-op/blob/master/oidc_op/db_interfaces.py

in oidcendpoint we have a storage system based on an in memory strategy. It would works fine with a nosql engine like redis or mongodb, but it is a disaster with a rdbms.

If I'll have some more time I'll do the same for pysaml2 and satosa, take a look and tell us what you think.

peppelinux avatar Dec 16 '19 00:12 peppelinux

How does this code handle the following situation?

  • Request key A
  • Key A returns a Dict
  • Mutate Dict
  • (don't actively update Key A)

Does key A get updated in the Cache? I ask because this pattern is used in pysaml. My DjangoCache implementation gets around it by tying into the pysaml2-proprietary sync call.

My approach also ties directly into Django's caching infrastructure so you should be able to choose any Cache implementation. DB is one option provided by Django, but isn't hard-coded into my approach.

ambsw-technology avatar Dec 16 '19 20:12 ambsw-technology

Probably this could sounds better for you: https://github.com/knaperek/djangosaml2/blob/master/djangosaml2/cache.py

Bytheway I'm wondering about how could be simple to implement an abstraction layer to manage many kind of storage type. That's all

peppelinux avatar Dec 16 '19 20:12 peppelinux

@c00kiemon5ter is there something that you would consider for a future refactor or should we consider this as a "wont fix" kind?

@ambsw-technology did you find some tip in djangosaml2 or do you have some proposal to build a PR on this?

peppelinux avatar Sep 05 '20 22:09 peppelinux

@ambsw-technology probably that's the answer, but @ivan could help a lot in this if possible. https://github.com/IdentityPython/pysaml2/issues/522#issuecomment-687674080

peppelinux avatar Sep 05 '20 23:09 peppelinux

My DjangoCache wrapper is distributed under the same license so you can incorporate it in this package if you want. As I mentioned, the better solution is to change the internal cache implementation to use a more typical get(k) and set(k, v) interface. However, this package would also need to change the way it uses the cache since it gets and mutates a dict without explicitly using set again (which would be required in most cache implementations).

ambsw-technology avatar Sep 08 '20 15:09 ambsw-technology

Completely agree, this Is the PoC that introduced this behaviour in jwtconnect.io stack (oidc)

https://github.com/peppelinux/pyAbstractStorage

As you can see that's what you say

peppelinux avatar Sep 08 '20 15:09 peppelinux

A dict like interface would be nice.

spaceone avatar Nov 10 '20 19:11 spaceone