skipper mask users in tracing tags

mask users in tracing tags

Sep 24 '20 16:09 zeitlinger

Not sure whether masking of users in logs is necessary. Afaik, the privacy setup for logs should usually be sufficient to allow logging of user names.

Anyhow, why didn't you add the masking directly in the filter?

Sep 24 '20 19:09 tkrop

Not sure whether masking of users in logs is necessary. Afaik, the privacy setup for logs should usually be sufficient to allow logging of user names.

this PR is not about logs

Anyhow, why didn't you add the masking directly in the filter?

I did (as much as possible): https://github.com/zalando/skipper/pull/1524/files#diff-50793c74bbe4ccdd0d38b7b8296a0860R62

Sep 25 '20 07:09 zeitlinger

Please handle nil case and use this as default

yes, that's the idea

What about oidc* filters?

What about tokenintrospection filters?

I think the other filters should be done in the same PR, such that we have feature parity and not friction (not sure if these apply but IIRC yes).

yes, I'm still wondering how to do it - I didn't find a realm in https://tools.ietf.org/html/rfc7662#section-2.2. Can you help?

Sep 25 '20 07:09 zeitlinger

@zeitlinger realm is not in the spec because it is a Zalando extension, since we have users, services, as well as customers and may be (an)other realm(s) for partners. Checkout: ztoken | sed -e "s/$.*$\.$.*$\.$.*$/\2/" | base64 -d | jq '.' The actual key is: https://identity.zalando.com/realm.

Sep 25 '20 11:09 tkrop

@zeitlinger realm is not in the spec because it is a Zalando extension

makes sense

for tokeninfo it's curl -s -H "Authorization: Bearer $(ztoken)" -X POST https://info.services.auth.zalando.com/oauth2/tokeninfo and there the key is realm.

Sep 25 '20 12:09 zeitlinger

Thought about how to

move more logic into the filter
not hard code realm - as it's a Zalando extension
make masking work for the oidc, token inspection and webhook

This is my idea

store the entire auth map (with uid, realm, etc) in a new cache key ("auth-map")
rename mask-oauth-realm to mask-oauth-user with content e.g. employees:realm=employees which would look for realm = employees and replace it with employees (this could be a comma-separated list)

WDYT?

Sep 28 '20 11:09 zeitlinger

I think for oidc we already have stored everything in ctx.StateBag()[oidcClaimsCacheKey], tokenintrospection uses state bag key tokenintrospectionCacheKey and tokeninfo tokeninfoCacheKey. I think you won't need to do anything for webhook, because basically it works only on response code of the http request of the webhook call, so there is no data to log.

In case of implementation we could also add a filter parameter that decides on which key to use: filter("<tokeninfo|oidc|tokenintrospection>"). Do you think it's not possible like this or does it have a drawback?

Sep 28 '20 16:09 szuecs

I think for oidc we already have stored everything in ctx.StateBag()[oidcClaimsCacheKey], tokenintrospection uses state bag key tokenintrospectionCacheKey and tokeninfo tokeninfoCacheKey. I think you won't need to do anything for webhook, because basically it works only on response code of the http request of the webhook call, so there is no data to log.

yes, we can also let the filter read directly from those keys instead of adding a new one.

In case of implementation we could also add a filter parameter that decides on which key to use: filter("<tokeninfo|oidc|tokenintrospection>"). Do you think it's not possible like this or does it have a drawback?

That makes it harder to understand - hence easier to get wrong. So my preference would be to let the filter look for these 3 keys.

Sep 29 '20 11:09 zeitlinger

I think for oidc we already have stored everything in ctx.StateBag()[oidcClaimsCacheKey], tokenintrospection uses state bag key tokenintrospectionCacheKey and tokeninfo tokeninfoCacheKey. I think you won't need to do anything for webhook, because basically it works only on response code of the http request of the webhook call, so there is no data to log.

yes, we can also let the filter read directly from those keys instead of adding a new one.

In case of implementation we could also add a filter parameter that decides on which key to use: filter("<tokeninfo|oidc|tokenintrospection>"). Do you think it's not possible like this or does it have a drawback?

That makes it harder to understand - hence easier to get wrong. So my preference would be to let the filter look for these 3 keys.

sgtm

Sep 29 '20 17:09 szuecs

I think for oidc we already have stored everything in ctx.StateBag()[oidcClaimsCacheKey], tokenintrospection uses state bag key tokenintrospectionCacheKey and tokeninfo tokeninfoCacheKey. I think you won't need to do anything for webhook, because basically it works only on response code of the http request of the webhook call, so there is no data to log.

yes, we can also let the filter read directly from those keys instead of adding a new one.

In case of implementation we could also add a filter parameter that decides on which key to use: filter("<tokeninfo|oidc|tokenintrospection>"). Do you think it's not possible like this or does it have a drawback?

That makes it harder to understand - hence easier to get wrong. So my preference would be to let the filter look for these 3 keys.

sgtm

@szuecs changed accordingly - please review

Oct 01 '20 06:10 zeitlinger

I think this feature should not be a global flag, but rather a filter. Is there a reason or a limitation why it is a global setting?

Nov 16 '20 13:11 aryszka

I think this feature should not be a global flag, but rather a filter. Is there a reason or a limitation why it is a global setting?

yes: the intention is to make it impossible to leak the user name. If we would make it dependent on the filter configuration, then we would need to validate that all filters are configured in a way to mask users (which is harder to do)

Nov 16 '20 15:11 zeitlinger

I think this feature should not be a global flag, but rather a filter. Is there a reason or a limitation why it is a global setting?

yes: the intention is to make it impossible to leak the user name. If we would make it dependent on the filter configuration, then we would need to validate that all filters are configured in a way to mask users (which is harder to do)

We should not mistake our role as vendors for the one as users. As vendors, we should respect everybody's use case even if it's harder. The primary way of configuring Skipper's routing is using, well, the route configuration. To apply global routing constraints, we can rely on the default filters. See: https://opensource.zalando.com/skipper/operation/operation/#default-filters . Please, convert this functionality into a filter.

Nov 16 '20 21:11 aryszka

Please, convert this functionality into a filter.

IIUC, you're saying we could append a global default filter that has the same effect.

Would this filter modify what stateBagToTag is doing? Not sure I understand correctly.

Nov 17 '20 11:11 zeitlinger

Would this filter modify what stateBagToTag is doing?

the current proposal works more or less like this:

defines a set of masks globally, on startup
the stateBagToTag applies this mask when the statebagitem is the auth user

the first step could be replaced by a filter that defines the same mask, and if the user wants, they can even set this filter only for specific routes or with the default filters option for all routes.

And then comes the question that was already raised in an earlier comment. Why limit this functionality to the auth user? I see no reason for that, masking could be applied to any statebag key at the will of the user.

Nov 17 '20 12:11 aryszka

the first step could be replaced by a filter that defines the same mask, and if the user wants, they can even set this filter only for specific routes or with the default filters option for all routes.

that would also work - would we actually be fine to have such a filter in all routes (given that it consumes some memory)?

And then comes the question that was already raised in an earlier comment. Why limit this functionality to the auth user? I see no reason for that, masking could be applied to any statebag key at the will of the user.

The logic for extracting the realm (or any other property) is specific to the token caches - I don't know how to generalize it further.

Nov 18 '20 07:11 zeitlinger

that would also work - would we actually be fine to have such a filter in all routes (given that it consumes some memory)?

No that's not a problem. This filter would not be much different from the other filters.

The logic for extracting the realm (or any other property) is specific to the token caches - I don't know how to generalize it further.

Consider what it could do. It could apply a mask for values that are used by the stateBagToTag() filter. If we have maskStateBag(stateBagKey, mask) -> stateBagToTag(stateBagKey, tag), then maskStateBag could store a masked value with the key prefixed by 'masked:...", and then stateBagToTag could check whether a masked value exists for the same key, or just use the masked key, as well. Then we don't need to modify (and couple functionality) in the auth filters for (with) masking.

Dec 01 '20 17:12 aryszka

This PR is linked from the our Zalando-internal production readiness review template:

(can also be done using skipper (deployed - but don’t use it yet until this issue is resolved))

Is there any ETA on when this might be done?

Sep 20 '21 15:09 ePaul

@ePaul I think nobody is working on it.

Sep 20 '21 18:09 szuecs