
Do we need different credentials for creating the runner and accessing GCS?

Open mehadi92 opened this issue 3 years ago • 3 comments

I have the following GitHub Actions workflow file:

jobs:
  deploy-runner:
    runs-on: ubuntu-20.04
    steps:
      - uses: iterative/setup-cml@v1
      - uses: actions/checkout@v2
      - name: Deploy runner on GCP
        env:
          REPO_TOKEN: ${{ secrets.REPO_ACCESS_TOKEN }}
          GOOGLE_APPLICATION_CREDENTIALS_DATA: ${{ secrets.GOOGLE_APPLICATION_CREDENTIALS_DATA }}
        run: |
          cml runner \
              --cloud=gcp \
              --cloud-region=us-central1-a \
              --cloud-type=n1-highmem-2 \
              --labels=gpu_runner \
              --single=true \
              --cloud-gpu=nvidia-tesla-t4 \
              --cloud-spot=false \
              --cloud-hdd-size=128

  training:
    name: Training and Reporting
    needs:
      - deploy-runner
    runs-on: [self-hosted, gpu_runner]
    steps:
      - uses: actions/checkout@v2
      - uses: iterative/setup-cml@v1
      - uses: iterative/setup-dvc@v1
      - uses: actions/setup-python@v2
        with:
          python-version: '3.10'
      - name: Training
        env:
          repo_token: ${{ secrets.REPO_ACCESS_TOKEN }}
          GOOGLE_APPLICATION_CREDENTIALS_DATA: ${{ secrets.GOOGLE_APPLICATION_CREDENTIALS_DATA }}
        run: |
          # Pull dataset with DVC
          dvc pull data

          # Reproduce pipeline if any changes detected in dependencies
          dvc repro

The deploy-runner job works without any issue, but the training job fails at dvc pull data with the following error:

_request out of retries on exception: ('Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Status: 404 Response:\nb\'<!DOCTYPE html>\\n<html lang=en>\\n  <meta charset=utf-8>\\n  <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">\\n  <title>Error 404 (Not Found)!!1</title>\\n  <style>\\n    ****margin:0;padding:0***html,code***font:15px/22px arial,sans-serif***html***background:#fff;color:#222;padding:15px***body***margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px**** > body***background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px***p***margin:11px 0 22px;overflow:hidden***ins***color:#777;text-decoration:none***a img***border:0***@media screen and (max-width:772px)***body***background:none;margin-top:0;max-width:none;padding-right:0***#logo***background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px***@media only screen and (min-resolution:192dpi)***#logo***background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0***@media only screen and (-webkit-min-device-pixel-ratio:2)***#logo***background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%***#logo***display:inline-block;height:54px;width:150px***\\n  </style>\\n  <a href=//www.google.com/><span id=logo aria-label=Google></span></a>\\n  <p><b>404.</b> <ins>That\\xe2\\x80\\x99s an error.</ins>\\n  <p>The requested URL <code>/computeMetadata/v1/instance/service-accounts/default/?recursive=true</code> was not found on this server.  
<ins>That\\xe2\\x80\\x99s all we know.</ins>\\n\'', <google.auth.transport.requests._Response object at 0x7f154db67c70>)
Traceback (most recent call last):
  File "google/auth/compute_engine/credentials.py", line 111, in refresh
  File "google/auth/compute_engine/credentials.py", line 87, in _retrieve_info
  File "google/auth/compute_engine/_metadata.py", line 234, in get_service_account_info
  File "google/auth/compute_engine/_metadata.py", line 182, in get
google.auth.exceptions.TransportError: ('Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Status: 404 Response:\nb\'<!DOCTYPE html>\\n<html lang=en>\\n  <meta charset=utf-8>\\n  <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">\\n  <title>Error 404 (Not Found)!!1</title>\\n  <style>\\n    ****margin:0;padding:0***html,code***font:15px/22px arial,sans-serif***html***background:#fff;color:#222;padding:15px***body***margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px**** > body***background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px***p***margin:11px 0 22px;overflow:hidden***ins***color:#777;text-decoration:none***a img***border:0***@media screen and (max-width:772px)***body***background:none;margin-top:0;max-width:none;padding-right:0***#logo***background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px***@media only screen and (min-resolution:192dpi)***#logo***background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0***@media only screen and (-webkit-min-device-pixel-ratio:2)***#logo***background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%***#logo***display:inline-block;height:54px;width:150px***\\n  </style>\\n  <a href=//www.google.com/><span id=logo aria-label=Google></span></a>\\n  <p><b>404.</b> <ins>That\\xe2\\x80\\x99s an error.</ins>\\n  <p>The requested URL <code>/computeMetadata/v1/instance/service-accounts/default/?recursive=true</code> was not found on this server.  
<ins>That\\xe2\\x80\\x99s all we know.</ins>\\n\'', <google.auth.transport.requests._Response object at 0x7f154db67c70>)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "gcsfs/retry.py", line 115, in retry_request
  File "gcsfs/core.py", line 374, in _request
  File "gcsfs/core.py", line 353, in _get_headers
  File "gcsfs/credentials.py", line 182, in apply
  File "gcsfs/credentials.py", line 177, in maybe_refresh
  File "google/auth/compute_engine/credentials.py", line 117, in refresh
  File "<string>", line 3, in raise_from
google.auth.exceptions.RefreshError: ('Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Status: 404 Response:\nb\'<!DOCTYPE html>\\n<html lang=en>\\n  <meta charset=utf-8>\\n  <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">\\n  <title>Error 404 (Not Found)!!1</title>\\n  <style>\\n    ****margin:0;padding:0***html,code***font:15px/22px arial,sans-serif***html***background:#fff;color:#222;padding:15px***body***margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px**** > body***background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px***p***margin:11px 0 22px;overflow:hidden***ins***color:#777;text-decoration:none***a img***border:0***@media screen and (max-width:772px)***body***background:none;margin-top:0;max-width:none;padding-right:0***#logo***background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px***@media only screen and (min-resolution:192dpi)***#logo***background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0***@media only screen and (-webkit-min-device-pixel-ratio:2)***#logo***background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%***#logo***display:inline-block;height:54px;width:150px***\\n  </style>\\n  <a href=//www.google.com/><span id=logo aria-label=Google></span></a>\\n  <p><b>404.</b> <ins>That\\xe2\\x80\\x99s an error.</ins>\\n  <p>The requested URL <code>/computeMetadata/v1/instance/service-accounts/default/?recursive=true</code> was not found on this server.  
<ins>That\\xe2\\x80\\x99s all we know.</ins>\\n\'', <google.auth.transport.requests._Response object at 0x7f154db67c70>)
ERROR: unexpected error - ('Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Status: 404 Response:\nb\'<!DOCTYPE html>\\n<html lang=en>\\n  <meta charset=utf-8>\\n  <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">\\n  <title>Error 404 (Not Found)!!1</title>\\n  <style>\\n    ****margin:0;padding:0***html,code***font:15px/22px arial,sans-serif***html***background:#fff;color:#222;padding:15px***body***margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px**** > body***background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px***p***margin:11px 0 22px;overflow:hidden***ins***color:#777;text-decoration:none***a img***border:0***@media screen and (max-width:772px)***body***background:none;margin-top:0;max-width:none;padding-right:0***#logo***background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px***@media only screen and (min-resolution:192dpi)***#logo***background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0***@media only screen and (-webkit-min-device-pixel-ratio:2)***#logo***background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%***#logo***display:inline-block;height:54px;width:150px***\\n  </style>\\n  <a href=//www.google.com/><span id=logo aria-label=Google></span></a>\\n  <p><b>404.</b> <ins>That\\xe2\\x80\\x99s an error.</ins>\\n  <p>The requested URL <code>/computeMetadata/v1/instance/service-accounts/default/?recursive=true</code> was not found on this server.  
<ins>That\\xe2\\x80\\x99s all we know.</ins>\\n\'', <google.auth.transport.requests._Response object at 0x7f154db67c70>): ('Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Status: 404 Response:\nb\'<!DOCTYPE html>\\n<html lang=en>\\n  <meta charset=utf-8>\\n  <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">\\n  <title>Error 404 (Not Found)!!1</title>\\n  <style>\\n    ****margin:0;padding:0***html,code***font:15px/22px arial,sans-serif***html***background:#fff;color:#222;padding:15px***body***margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px**** > body***background:url(//www.google.com/images/errors/robot.png) 100% 5px no-repeat;padding-right:205px***p***margin:11px 0 22px;overflow:hidden***ins***color:#777;text-decoration:none***a img***border:0***@media screen and (max-width:772px)***body***background:none;margin-top:0;max-width:none;padding-right:0***#logo***background:url(//www.google.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px***@media only screen and (min-resolution:192dpi)***#logo***background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0% 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0***@media only screen and (-webkit-min-device-pixel-ratio:2)***#logo***background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%***#logo***display:inline-block;height:54px;width:150px***\\n  </style>\\n  <a href=//www.google.com/><span id=logo aria-label=Google></span></a>\\n  <p><b>404.</b> <ins>That\\xe2\\x80\\x99s an error.</ins>\\n  <p>The requested URL <code>/computeMetadata/v1/instance/service-accounts/default/?recursive=true</code> was not found on this 
server.  <ins>That\\xe2\\x80\\x99s all we know.</ins>\\n\'', <google.auth.transport.requests._Response object at 0x7f154db67c70>)

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!

According to the documentation (cloud-storage-provider-credentials), the GCS environment variable is GOOGLE_APPLICATION_CREDENTIALS, but I'm using GOOGLE_APPLICATION_CREDENTIALS_DATA. Is this the issue, or am I missing something?

Thanks

mehadi92, Sep 25 '22 11:09

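You do need to do something different: GOOGLE_APPLICATION_CREDENTIALS_DATA is not a "real" environment variable that is commonly supported. cml runner reads it, but DVC and the Google client libraries do not, so the training job falls back to the instance metadata server, which returns the 404 in your log because no service account is attached to the instance.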

The fastest method would be to add the Google auth action, like so:

  training:
    name: Training and Reporting
    needs:
      - deploy-runner
    runs-on: [self-hosted, gpu_runner]
    steps:
      - uses: actions/checkout@v2
      - uses: iterative/setup-cml@v1
      - uses: iterative/setup-dvc@v1
      - uses: actions/setup-python@v2
        with:
          python-version: '3.10'
      - name: 'Authenticate to Google Cloud'
        uses: 'google-github-actions/auth@v0'
        with:
          credentials_json: ${{ secrets.GOOGLE_APPLICATION_CREDENTIALS_DATA }}
      - name: Training
        env:
          repo_token: ${{ secrets.REPO_ACCESS_TOKEN }}
        run: |
          # Pull dataset with DVC
          dvc pull data

          # Reproduce pipeline if any changes detected in dependencies
          dvc repro

google-github-actions/auth@v0 populates several environment variables, including GOOGLE_APPLICATION_CREDENTIALS, so any CLI that uses GCP should work.
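If you would rather not add another action, a rough manual equivalent inside a single step is to write the JSON held in GOOGLE_APPLICATION_CREDENTIALS_DATA to a file and point GOOGLE_APPLICATION_CREDENTIALS at that file. The sketch below is only an illustration under that assumption, not a description of what the auth action does internally. The function name write_gcp_credentials and the file name template are hypothetical; RUNNER_TEMP is the temporary directory GitHub Actions provides on the runner.

```shell
# Write the service-account JSON from GOOGLE_APPLICATION_CREDENTIALS_DATA
# to a temporary file and print that file's path on stdout.
# (write_gcp_credentials is a hypothetical helper name.)
write_gcp_credentials() {
  if [ -z "${GOOGLE_APPLICATION_CREDENTIALS_DATA:-}" ]; then
    echo 'GOOGLE_APPLICATION_CREDENTIALS_DATA is not set' >&2
    return 1
  fi

  local file
  # Prefer the runner's temp directory; fall back to /tmp elsewhere.
  file="$(mktemp "${RUNNER_TEMP:-/tmp}/gcp-credentials.XXXXXX")" || return 1
  chmod 600 "$file"

  # printf avoids echo's handling of backslashes in the JSON.
  printf '%s' "$GOOGLE_APPLICATION_CREDENTIALS_DATA" > "$file"
  printf '%s\n' "$file"
}
```

In the Training step you would keep GOOGLE_APPLICATION_CREDENTIALS_DATA in env and run export GOOGLE_APPLICATION_CREDENTIALS="$(write_gcp_credentials)" before dvc pull data. The export only lasts for that step.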

Alternatively, you can assign a service account to the instance that cml runner creates, like so:

jobs:
  deploy-runner:
    runs-on: ubuntu-20.04
    steps:
      - uses: iterative/setup-cml@v1
      - uses: actions/checkout@v2
      - name: 'Authenticate to Google Cloud'
        uses: 'google-github-actions/auth@v0'
        with:
          credentials_json: ${{ secrets.GOOGLE_APPLICATION_CREDENTIALS_DATA }}
      - name: Deploy runner on GCP
        env:
          REPO_TOKEN: ${{ secrets.REPO_ACCESS_TOKEN }}
        run: |
          cml runner \
              --cloud=gcp \
              --cloud-region=us-central1-a \
              --cloud-type=n1-highmem-2 \
              --labels=gpu_runner \
              --single=true \
              --cloud-gpu=nvidia-tesla-t4 \
              --cloud-spot=false \
              --cloud-hdd-size=128 \
              --cloud-permission-set=SERVICE_ACCOUNT@PROJECT_ID.iam.gserviceaccount.com,scopes=storage-rw

  training:
    name: Training and Reporting
    needs:
      - deploy-runner
    runs-on: [self-hosted, gpu_runner]
    steps:
      - uses: actions/checkout@v2
      - uses: iterative/setup-cml@v1
      - uses: iterative/setup-dvc@v1
      - uses: actions/setup-python@v2
        with:
          python-version: '3.10'
      - name: Training
        env:
          repo_token: ${{ secrets.REPO_ACCESS_TOKEN }}
        run: |
          # Pull dataset with DVC
          dvc pull data

          # Reproduce pipeline if any changes detected in dependencies
          dvc repro
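To confirm that the instance really has a service account attached, you could query the same metadata path that appears in the error above. This is a minimal sketch, assuming curl is available on the runner; the function name attached_service_account is hypothetical.

```shell
# Print the email of the service account attached to this Compute Engine
# instance. Fails (non-zero exit) when the metadata server answers with an
# HTTP error such as the 404 seen when no service account is attached.
# (attached_service_account is a hypothetical helper name.)
attached_service_account() {
  curl --silent --fail --max-time 5 \
    --header 'Metadata-Flavor: Google' \
    'http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/email'
}
```

Running attached_service_account as the first command of the Training step should print the service account email when the permission set was applied, and fail otherwise, before dvc pull data is attempted.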

dacbd, Sep 25 '22 15:09

@dacbd Thanks, that solved the issue. I think the documentation should be updated accordingly.

mehadi92, Sep 27 '22 03:09

@mehadi92 I will leave this open until we have a related docs issue or update.

dacbd, Sep 27 '22 14:09