Azure CLI task fails: AADSTS700024: Client assertion is not within its valid time range
Acquiring access token with expired OIDC token fails with:
ERROR: AADSTS700024: Client assertion is not within its valid time range. Current time: 2024-04-05T23:01:54.2089203Z, assertion valid from 2024-04-05T22:40:41.0000000Z, expiry time of assertion 2024-04-05T22:50:41.0000000Z. Review the documentation at https://docs.microsoft.com/azure/active-directory/develop/active-directory-certificate-credentials
As the error indicates, the OIDC token is only valid for 10 minutes. After it is passed to az login via --federated-token, Azure CLI cannot get a new OIDC token after the OIDC token expires.
This is the designed v1 behavior of OIDC token support (#19853).
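To see how late a failing request was, the three timestamps in the AADSTS700024 message can be compared. The following is a minimal Python sketch, assuming the message keeps the format quoted above; parse_assertion_window is a hypothetical helper written for illustration, not part of Azure CLI or MSAL.

```python
import re
from datetime import datetime, timezone

ERROR = ("AADSTS700024: Client assertion is not within its valid time range. "
         "Current time: 2024-04-05T23:01:54.2089203Z, "
         "assertion valid from 2024-04-05T22:40:41.0000000Z, "
         "expiry time of assertion 2024-04-05T22:50:41.0000000Z.")

# Matches timestamps such as 2024-04-05T23:01:54.2089203Z (7 fractional digits).
_TS = r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.(\d+)Z"


def _to_datetime(seconds_part, fraction):
    # datetime only stores microseconds, so keep the first 6 fractional digits.
    micro = int(fraction[:6].ljust(6, "0"))
    base = datetime.strptime(seconds_part, "%Y-%m-%dT%H:%M:%S")
    return base.replace(microsecond=micro, tzinfo=timezone.utc)


def parse_assertion_window(message):
    """Hypothetical helper: return (current, valid_from, expiry) from the error text."""
    pattern = (r"Current time: " + _TS +
               r", assertion valid from " + _TS +
               r", expiry time of assertion " + _TS)
    match = re.search(pattern, message)
    if match is None:
        raise ValueError("message does not look like AADSTS700024")
    g = match.groups()
    return (_to_datetime(g[0], g[1]),
            _to_datetime(g[2], g[3]),
            _to_datetime(g[4], g[5]))


current, valid_from, expiry = parse_assertion_window(ERROR)
lifetime = (expiry - valid_from).total_seconds()
overdue = (current - expiry).total_seconds()
print(f"assertion lifetime: {lifetime:.0f} s, used {overdue:.0f} s after expiry")
```

For the message above, this reports a 10-minute assertion lifetime and a request made a little over 11 minutes after the assertion expired.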
However, as the Azure DevOps task AzureCLI@2 (https://github.com/microsoft/azure-pipelines-tasks/pull/17633) and the GitHub Action azure/login@v2 (https://github.com/Azure/login/pull/147) now support OIDC token authentication, and workload identity federation is the recommended approach, this limitation is becoming more prevalent.
Possible solutions
- OIDC token providers such as Azure DevOps or GitHub should provide an option to control the expiry time of the OIDC token, so that it lasts at least as long as the task.
- Design and implement a v2 solution that uses a managed-identity-like interface, which allows MSAL/Azure CLI to refresh the OIDC token.
References
- https://github.com/Azure/login/issues/180
- https://github.com/Azure/login/issues/372
- IcM 490937309
- IcM 491234676
- Email: Workload identity federation in Azure Pipelines fails with AADSTS700024
refresh OIDC token is a feature
Callback interface proposals
Different external identity providers (IdP) have different ways of retrieving the ID token:
- GitHub Actions exposes the environment variables ACTIONS_ID_TOKEN_REQUEST_URL and ACTIONS_ID_TOKEN_REQUEST_TOKEN and requires a GET HTTP request: https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/configuring-openid-connect-in-cloud-providers#requesting-the-jwt-using-environment-variables
- Azure DevOps exposes the Oidctoken - Create API and requires a POST HTTP request: https://learn.microsoft.com/en-us/rest/api/azure/devops/distributedtask/oidctoken/create
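For the GitHub Actions case, the GET request might be built roughly as follows. This is a minimal Python sketch that only constructs the request and does not send it. The audience value api://AzureADTokenExchange and the "value" response field are taken from the curl-based script posted further down this thread; build_github_id_token_request is a hypothetical helper name, and the fallback URL and token are placeholders.

```python
import os
import urllib.parse
import urllib.request


def build_github_id_token_request(request_url, request_token,
                                  audience="api://AzureADTokenExchange"):
    """Hypothetical helper: build (but do not send) the ID token GET request."""
    # The script in this thread appends the audience with "&", which assumes
    # the runner-provided URL already carries a query string.
    url = f"{request_url}&audience={urllib.parse.quote(audience, safe='')}"
    headers = {"Authorization": f"bearer {request_token}"}
    return urllib.request.Request(url, headers=headers, method="GET")


# Inside a workflow both values come from the environment; the fallbacks
# below are placeholders so the sketch also runs elsewhere.
req = build_github_id_token_request(
    os.environ.get("ACTIONS_ID_TOKEN_REQUEST_URL",
                   "https://example.invalid/token?api-version=2.0"),
    os.environ.get("ACTIONS_ID_TOKEN_REQUEST_TOKEN", "placeholder-token"),
)
print(req.get_method(), req.full_url)
# Sending it with urllib.request.urlopen(req) should return JSON whose
# "value" field holds the ID token (the thread's script reads it with jq).
```

The Azure DevOps variant is left out here because its POST request shape is described only by the linked REST reference, not in this thread.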
I had a discussion with MSAL team today and proposed 2 possible callback interfaces:
- Let each external IdP expose a callback command such as getidtoken that returns an ID token in stdout. Then, instead of providing --federated-token <ID token> to az login, they should provide --federated-token-callback getidtoken to az login, so that CLI and MSAL can actively retrieve an ID token with getidtoken when the ID token expires. This is very similar to how Azure Identity's AzureCliCredential retrieves access tokens from Azure CLI by subprocessing az account get-access-token.
- Like the GitHub Actions solution, define a managed-identity-like URL that can be used to get an ID token, such as ID_TOKEN_REQUEST_URL.
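The first proposal could look roughly like this on the consuming side. This is a minimal Python sketch under stated assumptions: getidtoken and --federated-token-callback are only proposals from this comment and do not exist today, get_id_token_via_callback is a hypothetical helper, and a child Python process that prints a dummy string stands in for the IdP-provided command.

```python
import subprocess
import sys


def get_id_token_via_callback(command, timeout=30):
    """Hypothetical helper: run a callback command and return the token on stdout."""
    result = subprocess.run(command, capture_output=True, text=True,
                            timeout=timeout, check=True)
    token = result.stdout.strip()
    if not token:
        raise RuntimeError("callback command printed no ID token")
    return token


# Stand-in for an IdP-provided getidtoken command. A real callback would
# print a freshly issued ID token instead of this dummy value.
fake_getidtoken = [sys.executable, "-c", "print('dummy-id-token')"]
print(get_id_token_via_callback(fake_getidtoken))
```

Because the command is run again on every call, the caller always gets whatever token the IdP issues at that moment, which is the point of replacing a fixed --federated-token value with a callback.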
Mitigation: Extend task duration to 60 minutes
[!WARNING] This mitigation doesn't work with Azure CLI 2.59.0. See https://github.com/Azure/azure-cli/issues/28708#issuecomment-2049400226.
ID token:       |----| 10 min
Access token 1: |------------------------| 60 min
Access token 2:         | 20 min: ERROR: ID token expired
An ID token lasts for 5 minutes on GitHub Actions and 10 minutes on Azure DevOps, but an access token lasts for 60 minutes.
When you run az login, Azure CLI only acquires access tokens for ARM, using https://management.core.windows.net//.default as the scope.
After the ID token expires, suppose a command needs an access token for another scope, such as
az account get-access-token --scope https://kusto.kusto.windows.net//.default
Because there is currently no access token for that scope in the token cache, Azure CLI/MSAL will try to get one with the ID token. However, as the ID token has expired, the command fails with AADSTS700024.
So, the mitigation is pretty straightforward: Acquire all access tokens before the ID token expires.
You have to know which scopes are used in your pipeline task and call az account get-access-token --scope ... immediately after az login. This makes Azure CLI/MSAL acquire access tokens for the specified scopes while the ID token is still valid and save them in the token cache.
For example:
- Storage:
  az account get-access-token --scope https://storage.azure.com/.default --output none
- Key Vault:
  az account get-access-token --scope https://vault.azure.net/.default --output none
- Microsoft Graph:
  az account get-access-token --scope https://graph.microsoft.com//.default --output none
- Kusto:
  az account get-access-token --scope https://kusto.kusto.windows.net//.default --output none
[!WARNING] Even though GitHub Actions can mask the access token as *** in az account get-access-token's output:
+ az account get-access-token
***
  "accessToken": "***",
  "expiresOn": "2024-04-10 14:11:25.000000",
  "expires_on": 1712758285,
  "subscription": "...",
  "tenant": "...",
  "tokenType": "Bearer"
***
You MUST specify --output none to make sure no access token is printed to any of your logs.
Then subsequent commands using these scopes will use the access tokens saved in the token cache, so they won't fail after the ID token expires. They will still fail after the access token expires (60 minutes).
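The timeline above can be modelled in a few lines. This is a simplified Python sketch, not Azure CLI or MSAL code: TokenCacheModel and ExpiredIdTokenError are hypothetical names, times are seconds since az login, and the model assumes the token cache is actually consulted (which, per the warning above, is not the case in Azure CLI 2.59.0).

```python
ID_TOKEN_LIFETIME = 10 * 60       # seconds (Azure DevOps value from the text)
ACCESS_TOKEN_LIFETIME = 60 * 60   # seconds


class ExpiredIdTokenError(Exception):
    """Stands in for AADSTS700024 in this model."""


class TokenCacheModel:
    """Toy model: scope -> expiry time of the cached access token."""

    def __init__(self, login_time=0):
        self.login_time = login_time
        self.expires_at = {}

    def get_access_token(self, scope, now):
        expiry = self.expires_at.get(scope)
        if expiry is not None and now < expiry:
            return "from cache"
        # Cache miss: a new access token needs a still-valid ID token.
        if now - self.login_time >= ID_TOKEN_LIFETIME:
            raise ExpiredIdTokenError(scope)
        self.expires_at[scope] = now + ACCESS_TOKEN_LIFETIME
        return "newly acquired"


ARM = "https://management.core.windows.net//.default"
KUSTO = "https://kusto.kusto.windows.net//.default"


def run(pre_acquire):
    cache = TokenCacheModel(login_time=0)
    cache.get_access_token(ARM, now=0)          # what az login does
    if pre_acquire:
        cache.get_access_token(KUSTO, now=5)    # the mitigation step
    try:
        return cache.get_access_token(KUSTO, now=20 * 60)
    except ExpiredIdTokenError:
        return "ERROR: ID token expired"


print("without mitigation:", run(pre_acquire=False))
print("with mitigation:   ", run(pre_acquire=True))
```

In this model the 20-minute Kusto request fails without the extra az account get-access-token call and is served from the cache with it; a request made after the 60-minute mark fails in both cases.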
I tried fixing the issue with the provided mitigation, but it still persists. Maybe I'm doing something wrong? My workflow contains actions that run NodeJS tests in which I verify connections to ServiceBus. As OIDC is used, I log in to Azure with the azure/login@v2 action:
- name: Azure login
  uses: azure/login@v2
  with:
    client-id: ${{ env.AZURE_CLIENT_ID }}
    tenant-id: ${{ env.AZURE_TENANT_ID }}
    subscription-id: ${{ env.AZURE_SUBSCRIPTION_ID }}
    enable-AzPSSession: false
After that I added a step to mitigate the issue:
- name: Azure get token
  uses: azure/cli@v2
  with:
    inlineScript: |
      az account get-access-token --scope https://storage.azure.com/.default --output none
      az account get-access-token --scope https://servicebus.azure.net/.default
But after ~10 minutes I'm still getting:
AggregateAuthenticationError: ChainedTokenCredential authentication failed.
CredentialUnavailableError: Please run 'az login' from a command prompt to authenticate before using this credential.
CredentialUnavailableError: WorkloadIdentityCredential: is unavailable. tenantId, clientId, and federatedTokenFilePath are required parameters.
In DefaultAzureCredential and ManagedIdentityCredential, these can be provided as environment variables -
"AZURE_TENANT_ID",
"AZURE_CLIENT_ID",
"AZURE_FEDERATED_TOKEN_FILE". See the troubleshooting guide for more information: https://aka.ms/azsdk/js/identity/workloadidentitycredential/troubleshoot
Did I miss something? I use https://www.npmjs.com/package/@azure/service-bus
Thanks for the mitigation @jiasli.
However, I don't think I'm hitting the issue where the Azure CLI tries to acquire an access token for a different audience after the ID token has expired.
I'm fairly confident that the az commands I use only use the access token for ARM:
- az account set
- az deployment sub create
- az deployment sub show
- az webapp deployment slot swap
- az webapp deployment source config-zip
- az webapp start
- az webapp stop
The general flow is:
- Deploy an ARM template
- Deploy binaries to an App Service staging slot
- Swap slots
- Stop App Service staging slot
The time it takes to swap slots varies greatly; however, more than 5 minutes have always elapsed by the time it's done.
Now, what is strange is that stopping the slot sometimes works and sometimes doesn't, depending on how much time has passed since we ran azure/login.
To me, it sounds like the access token expires "quicker" than before. Could that be?
Edit: I checked across many workflow runs, and to me it looks like the access token expires after 10 minutes.
@Kapsztajn, I can successfully get an access token for https://servicebus.azure.net/.default locally which lasts for 4600s.
> az account get-access-token --scope https://servicebus.azure.net/.default
{
  "accessToken": "...",
  "expiresOn": "2024-04-11 13:57:35.000000",
  "expires_on": 1712815055,
  "subscription": "0b1f6471-1bf0-4dda-aec3-cb9272f09590",
  "tenant": "54826b22-38d6-4fb2-bad9-b7b93a3e9c5a",
  "tokenType": "Bearer"
}
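The 4600 s lifetime can be checked by decoding the payload of the access token and subtracting iat from exp. This is a minimal Python sketch that does not verify the signature; decode_jwt_claims is a hypothetical helper, and the sample token is a dummy, unsigned JWT built from the iat, nbf and exp values listed under "Decoded claims" in this comment.

```python
import base64
import json


def decode_jwt_claims(token):
    """Hypothetical helper: decode a JWT payload without verifying the signature."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)   # restore base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(obj):
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Dummy unsigned token carrying the claim values quoted in this comment.
sample = ".".join([
    _b64url({"alg": "none"}),
    _b64url({"iat": 1712810455, "nbf": 1712810455, "exp": 1712815055}),
    "",
])

claims = decode_jwt_claims(sample)
print("lifetime in seconds:", claims["exp"] - claims["iat"])
```

Avoid printing a real access token while doing this in a pipeline; decode it locally or log only the numeric claims.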
Decoded claims:
"iat": 1712810455,
"nbf": 1712810455,
"exp": 1712815055,
I am not entirely sure why this line is printed:
CredentialUnavailableError: Please run 'az login' from a command prompt to authenticate before using this credential.
The Azure Service Bus client library for JavaScript SDK also didn't fail with AADSTS700024. I am not an expert of that SDK. Is it possible to collect more details on which scope the SDK requests, and why it fails with that error?
@mderriey, this seems odd as all these operations are indeed ARM operations. Could you check the actual expiration time of the access token issued for ARM?
> az account get-access-token --scope https://management.core.windows.net//.default --query expiresOn --output tsv
2024-04-11 13:47:47.000000
Hi @Kapsztajn, the suggested mitigation did not work for me either. It was able to fetch a token with a reasonable expiry, but I saw the same error once the OID token expired after 5 mins.
I propose a workaround: fetch the OID token every 4 mins to avoid the expiry. I was able to get this working, and here is what I did: I inserted the following step in my workflow just before the step where this token expiry issue was popping up:
- name: Fetch OID token every 4 mins
  run: |
    while true; do
      token_request=$ACTIONS_ID_TOKEN_REQUEST_TOKEN
      token_uri=$ACTIONS_ID_TOKEN_REQUEST_URL
      token=$(curl -H "Authorization: bearer $token_request" "${token_uri}&audience=api://AzureADTokenExchange" | jq .value -r)
      az login --service-principal -u ${{ secrets.CLIENT_ID }} -t ${{ secrets.TENANT_ID }} --federated-token $token --output none
      # Sleep for 4 minutes
      sleep 240
    done &
Could you try this out and see if this works for you as well?
Good suggestion @jiasli , thanks.
Here's what I ran:
steps:
  - name: Login to Azure
    uses: azure/login@v2
    with:
      client-id: ${{ env.oidcAppRegistrationClientId }}
      tenant-id: ${{ env.azureTenantId }}
      allow-no-subscriptions: true
      enable-AzPSSession: true
  - name: Check token expiry
    shell: bash
    run: |
      echo "Current date: $(date '+%Y-%m-%dT%H:%M:%S')"
      echo "Token expiration: $(az account get-access-token --resource-type arm --query expiresOn --output tsv --debug)"
      echo "Token #2 expiration: $(az account get-access-token --resource-type arm --query expiresOn --output tsv --debug)"
And the output (debug output omitted):
Current date: 2024-04-11T06:57:14
Token expiration: 2024-04-11 07:57:14.000000
Token #2 expiration: 2024-04-11 07:57:14.000000
So the token is valid for 1 hour.
And both calls to az account get-access-token show this in the debug output, which I think confirms that the ARM token is cached and was originally acquired during az login:
DEBUG: msal.token_cache: event={
    "client_id": "***",
    "data": {
        "claims": "{\"access_token\": {\"xms_cc\": {\"values\": [\"CP1\"]}}}",
        "scope": [
            "https://management.core.windows.net//.default"
        ]
    },
    "environment": "login.microsoftonline.com",
    "grant_type": "client_credentials",
    "params": null,
    "response": {
        "access_token": "********",
        "expires_in": 3599,
        "ext_expires_in": 3599,
        "token_type": "Bearer"
    },
    "scope": [
        "https://management.core.windows.net//.default"
    ],
    "token_endpoint": "https://login.microsoftonline.com/<redacted>/oauth2/v2.0/token"
}
I'm not sure what happens, then...
I'll try removing the extra azure/login steps when I get some more time to see if the issue disappears.
Thanks again, let me know if I can perform some more testing if anything comes to mind. If you'd be interested in the debug output, I could send that privately.
Apologies for the confusion caused.
As I tested today, the mitigation I provided in https://github.com/Azure/azure-cli/issues/28708#issuecomment-2047256166 stopped working for Azure CLI 2.59.0, because of an MSAL regression introduced in 1.27.0 (https://github.com/AzureAD/microsoft-authentication-extensions-for-python/issues/127, https://github.com/AzureAD/microsoft-authentication-library-for-python/pull/644) which is adopted by Azure CLI 2.59.0 (https://github.com/Azure/azure-cli/pull/28556).
This regression makes MSAL's ConfidentialClientApplication bypass msal_extensions.token_cache.PersistedTokenCache, so access tokens are no longer retrieved from the token cache. Instead, every command now retrieves a new access token from the AAD Security Token Service (STS). As a result, not only does the mitigation stop working, but even ARM commands fail with AADSTS700024 after the ID token expires.
I will work with MSAL on this issue with high priority.
Workaround
For now, please keep using service principal secret for authentication to get unblocked: https://github.com/marketplace/actions/azure-login#login-with-a-service-principal-secret
My question is why this has popped up as an issue recently. We've had pipelines run for well over 20 minutes before and never seen this. But within the last week, it seems any workflow using Azure CLI with OIDC federated auth is experiencing this issue.
@iamrk04 It looks like your solution is working, and I managed to run the tests normally (the pipeline ran for over 16 minutes). I added the code you provided between the Azure login and the component tests:
- name: Azure login
  uses: azure/login@v2
  with:
    client-id: ${{ env.AZURE_CLIENT_ID }}
    tenant-id: ${{ env.AZURE_TENANT_ID }}
    subscription-id: ${{ env.AZURE_SUBSCRIPTION_ID }}
    enable-AzPSSession: false
- name: Fetch OID token every 4 mins
  shell: bash
  run: |
    while true; do
      token_request=$ACTIONS_ID_TOKEN_REQUEST_TOKEN
      token_uri=$ACTIONS_ID_TOKEN_REQUEST_URL
      token=$(curl -H "Authorization: bearer $token_request" "${token_uri}&audience=api://AzureADTokenExchange" | jq .value -r)
      az login --service-principal -u ${{ env.AZURE_CLIENT_ID }} -t ${{ env.AZURE_TENANT_ID }} --federated-token $token --output none
      # Sleep for 4 minutes
      sleep 240
    done &
- name: 'Run tests'
  shell: bash
  ...
I had to add shell: bash because without it I got errors with missing shell.
@smokedlinq, In my case, it's due to a new version of the GitHub hosted runner image for ubuntu-latest that was released which has Azure CLI 2.59.0 instead of 2.58.0 for the previous image.
The image went from 20240324.2.0 to 20240407.1.0.
You can see which image your run uses in the "Set up job" step at the very top.
@mderriey I assumed something like that, I was more referring to how that broke inside of az.
Hey @iamrk04, you're a hero! I inserted this snippet into my workflow, and this made it all work. Great idea to just have that run in the background in a shell loop.
For reference: https://github.com/microsoft/hi-ml/pull/925/
Thanks @iamrk04 , this worked for me as well.
Suggestion from @iamrk04 also worked for me. Wrapped it in a github action that potentially can replace azure/login. I think the solution will even remove the 1 hour limit we had before but have not tested this yet.
name: Azure Federated Login
inputs:
  client-id:
    description: Azure client id
    type: string
  tenant-id:
    description: Azure tenant id
    type: string
  subscription-id:
    description: Azure subscription id
    type: string
    default: none
  refresh-interval-seconds:
    description: Refresh interval in seconds
    type: number
    default: 240
runs:
  using: "composite"
  steps:
    - name: Fetch OID token every ${{ inputs.refresh-interval-seconds }} seconds
      shell: bash
      run: |
        first_time=true
        while true; do
          token=$(curl -s -H "Authorization: bearer ${ACTIONS_ID_TOKEN_REQUEST_TOKEN}" "${ACTIONS_ID_TOKEN_REQUEST_URL}&audience=api://AzureADTokenExchange" | jq .value -r)
          az login --service-principal -u ${{ inputs.client-id }} -t ${{ inputs.tenant-id }} --federated-token $token --output none
          if [ "$first_time" = true ] && [ "${{ inputs.subscription-id }}" != "none" ]; then
            az account set -s ${{ inputs.subscription-id }}
            first_time=false
          fi
          sleep ${{ inputs.refresh-interval-seconds }}
        done &
I'm running into the same issue in Azure Devops for a pipeline that runs a long python script (2h40m) in an AzureCLI@2 task. Was working fine on Friday (April 5th) but started failing after that with error:
AzureCliCredential: ERROR: AADSTS700024: Client assertion is not within its valid time range. ...
Any ideas on whether an equivalent workaround is possible for Azure Devops to refresh the token every 9 minutes?
We started having problems with the v2.59.0 az cli and rolled back as a workaround. I'm not sure what about the cli release makes this more/less likely to hit this.
@smokedlinq, please refer to my comment https://github.com/Azure/azure-cli/issues/28708#issuecomment-2049400226.
I propose a workaround by fetching the OID token every 4 mins to avoid the expiry.
This workaround https://github.com/Azure/azure-cli/issues/28708#issuecomment-2049014471 proposed by @iamrk04 of periodically calling az login is not recommended, as Azure CLI doesn't support concurrent execution and you will very likely run into race conditions (https://github.com/Azure/azure-cli/issues/9427, https://github.com/Azure/azure-cli/issues/20273).
We started having problems with the v2.59.0 az cli and rolled back as a workaround.
This workaround https://github.com/Azure/azure-cli/issues/28708#issuecomment-2050804548 proposed by @dghubble of using an old version is a correct one.
As I suggested in https://github.com/Azure/azure-cli/issues/28708#issuecomment-2049400226, using a service principal secret for authentication is another acceptable workaround.
@jiasli Service principals are unacceptable for some of us as our security certification would require we rotate them on a regular basis. OIDC does not add that additional burden given that they are clearly short lived.
@andre-qumulo, we plan to fix the 5-minute expiration issue in the next version of Azure CLI which will be 2.60.0 and released on 2024-04-30. Using a service principal is only a temporary workaround. Secret rotation usually happens on a monthly basis which is far beyond the time we need to fix it.
I have created a separate issue to track it:
- https://github.com/Azure/azure-cli/issues/28737
Thanks @jiasli! The mitigation steps for Azure DevOps provided here of using a service principal secret were effective.
(I ran into some trouble finding the organization id while following the instructions but was able to find the organization id with these steps: https://medium.com/@shivapatel1102001/get-list-of-organization-from-azure-devops-microsoft-account-861ea29dae93)
@TomWildenhain, based on my understanding, the steps provided by https://learn.microsoft.com/en-us/azure/devops/pipelines/library/connect-to-azure?view=azure-devops don't require organization ID when creating a service connection using service principal secret. Could you let me know which article you are following?
@jiasli Org id is a 1P policy.
@jiasli Thanks for your help. I was following the instructions in a banner at the top of ADO after creating the manual service connection. The banner states:
Manually created service connections use an App Registration that was created by the user. Please add a federated credential to the App Registration with the following details: Issuer: https://vstoken.dev.azure.com/<org id>, Subject identifier: sc://<org>/<project>/<sc name>. Learn more
With a link to: https://learn.microsoft.com/en-us/azure/devops/pipelines/release/configure-workload-identity?view=azure-devops
I used the instructions to call the API here to get the org id: https://medium.com/@shivapatel1102001/get-list-of-organization-from-azure-devops-microsoft-account-861ea29dae93
@TomWildenhain, thanks for the information. If you used service principal secret to create the service connection, I don't think the federated identity credential added to the app is actually used.
@jiasli Is it possible to give any realistic timeline for a fix? I am wondering if it makes sense to ask for a rollback of the cli version contained in actions/runner-images that is used by both Github Actions and Azure DevOps.
We are seeing the same issue related to moving away from service principal secrets.
We are looking into adding logic for all Az CLI calls using the ARM token to ensure it gets refreshed (but not as a background process) to get the OIDC token from idToken and reuse it to log in via az account clear && az login ...
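A non-background refresh of this kind could be structured as a guard that is called right before each az command and logs in again only when the previous login is too old. This is a minimal Python sketch under stated assumptions: ReloginGuard is a hypothetical name, the 240-second default mirrors the 4-minute interval used in the background-loop workaround earlier in this thread, and the login callable is where fetching a fresh ID token and running az account clear && az login ... would go. Whether this avoids the error on a given Azure CLI version is not verified here.

```python
import time


class ReloginGuard:
    """Hypothetical helper: re-run a login callable when the last login is too old."""

    def __init__(self, login, max_age_seconds=240, clock=time.monotonic):
        self._login = login
        self._max_age = max_age_seconds
        self._clock = clock
        self._last_login = None

    def ensure_fresh(self):
        """Log in again if needed; return True when a login was performed."""
        now = self._clock()
        if self._last_login is None or now - self._last_login >= self._max_age:
            self._login()
            self._last_login = now
            return True
        return False


# Demonstration with a fake clock and a login stub that records call times.
fake_now = [0]
logins = []
guard = ReloginGuard(login=lambda: logins.append(fake_now[0]),
                     clock=lambda: fake_now[0])

for t in (0, 100, 250, 300, 500):
    fake_now[0] = t
    guard.ensure_fresh()   # would be called right before each az command

print("logins happened at:", logins)
```

Because the login runs in the same sequence as the commands, it does not execute concurrently with them, unlike the background loop.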