
Intermittent Azure AD discovery failures during cold start with lua-resty-openidc

Open • TickettEnterprises opened this issue 1 year ago • 1 comment

I have the following production environment setup:

  • openresty:alpine-fat docker image deployed to AWS App Runner
  • Nginx files using lua-resty-openidc.

When I hit the server URL, there is a cold start if the Docker image hasn't been running for a while. I sometimes get the error `accessing discovery url (https://login.microsoftonline.com/organizations/v2.0/.well-known/openid-configuration) failed: network unreachable`. After a few seconds, I can refresh the page and the error is gone. I also see the same error when running the Docker image in Docker Desktop.

To get around this error, I've created a retry wrapper that takes a maximum number of retries and a delay (in seconds) between attempts.

function _M.authenticate_with_retry(opts, max_retries, retry_delay)
  local res, err
  local attempts = 0
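  -- Fall back to defaults if the arguments are missing or not numeric;
  -- otherwise the comparison below fails on nil. The default values are arbitrary.
  max_retries = tonumber(max_retries) or 3
  retry_delay = tonumber(retry_delay) or 1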

  while attempts < max_retries do
    res, err = require("resty.openidc").authenticate(opts)
    if res then
      return res
    end
    attempts = attempts + 1
    if attempts < max_retries then
      ngx.log(ngx.NOTICE, "Authentication failed, attempt ", attempts, " of ", max_retries, ". Retrying in ", retry_delay, " seconds.")
      ngx.sleep(retry_delay)
    end
  end
  ngx.log(ngx.ERR, "Authentication failed after ", max_retries, " attempts: ", err)
  return nil, err
end
`res, err = _M.authenticate_with_retry(opts, max_retries, retry_delay)`
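
The wrapper above retries on every failure, including ones that a retry is unlikely to fix. A minimal sketch of one way to narrow that, assuming the error string contains the "network unreachable" text quoted earlier; `is_transient_network_error` is a hypothetical helper, not part of lua-resty-openidc:

```lua
-- Hypothetical helper (not part of lua-resty-openidc): decide whether an
-- error returned by authenticate() looks like a transient network failure.
-- Assumption: the error text contains "network unreachable", as in the
-- message quoted above.
local function is_transient_network_error(err)
  if type(err) ~= "string" then
    return false
  end
  -- Lowercase first, then do a plain-text search (4th argument `true`)
  -- so no characters are interpreted as Lua pattern syntax.
  return err:lower():find("network unreachable", 1, true) ~= nil
end
```

Inside the retry loop, the wrapper could return `nil, err` immediately when `is_transient_network_error(err)` is false, and only sleep and retry otherwise.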

Is there any built-in functionality that already handles this? Or is this something we should raise a pull request to implement?

Thanks

TickettEnterprises • Apr 10 '25 11:04

I faced the same intermittent network unreachable problem in a K8s setup.

For me, the cause was that IPv6 is not supported, yet DNS sometimes returns IPv6 addresses, which leads to the failure described above. (See https://github.com/zmartzone/lua-resty-openidc/issues/149#issuecomment-817268261)

Note the IPv6 address in the log:

2025/09/11 10:00:57 [error] 7#7: *15 connect() to [2603:1027:1:28::c]:443 failed (101: Network unreachable)
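
To check whether failures in your own logs have the same cause, the target address can be pulled out of such a line. This is only a sketch, assuming the log format shown above; `ipv6_connect_target` is a hypothetical helper name:

```lua
-- Hypothetical helper: extract the address and port from an nginx
-- "connect() to [addr]:port failed" log line. Only bracketed targets match,
-- which is how IPv6 addresses appear in the line above.
local function ipv6_connect_target(line)
  local addr, port = line:match("connect%(%) to %[([%x:]+)%]:(%d+) failed")
  if not addr then
    return nil
  end
  return addr, tonumber(port)
end
```

A non-nil result means the failed connection attempt targeted an IPv6 address.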

Once the resolver happens to return an IPv4 address, the lookup succeeds and the result is cached, which is why the problem is most easily observed on a cold start.

I added ipv6=off to my resolver directive:

resolver ipv6=off local=on;

klaegera • Sep 11 '25 10:09