
Do not fail if user-specified Docker image is non-root

Open jvstme opened this issue 1 year ago • 1 comment

Steps to reproduce

Run a configuration whose Docker image uses a non-root user, for example:

> cat prometheus.dstack.yml 
type: task

image: bitnami/prometheus
ports:
  - 9090

resources:
  memory: 0.5GB..
  cpu: 1..
> dstack run . -f prometheus.dstack.yml

Actual behaviour

The run fails. CLI:

 Configuration          prometheus.dstack.yml 
 Project                main                  
 User                   admin                 
 Pool name              default-pool          
 Min resources          1..xCPU, 0.5GB..      
 Max price              -                     
 Max duration           72h                   
 Spot policy            auto                  
 Retry policy           no                    
 Creation policy        reuse-or-create       
 Termination policy     destroy-after-idle    
 Termination idle time  300s                  

 #  BACKEND  REGION          INSTANCE  RESOURCES                 SPOT  PRICE     
 1  aws      us-west-2       t2.small  1xCPU, 2GB, 100GB (disk)  yes   $0.004    
 2  aws      ap-southeast-1  t2.small  1xCPU, 2GB, 100GB (disk)  yes   $0.0062   
 3  aws      eu-central-1    t2.small  1xCPU, 2GB, 100GB (disk)  yes   $0.0068   
    ...                                                                          
 Shown 3 of 761 offers, $49.159 max

Continue? [y/n]: y
spotty-monkey-1 provisioning completed (failed)
Run failed with error code JobTerminationReason.INTERRUPTED_BY_NO_CAPACITY. Check CLI and server logs for more details.

Server logs:

ERROR 2024-04-04T11:41:36.084 dstack._internal.server.background.tasks.process_running_jobs The docker container of the job 'spotty-monkey-1-0-0' is not working: exit code: 127, error 
DEBUG 2024-04-04T11:41:36.085 dstack._internal.server.background.tasks.process_running_jobs runner healthcheck: {'state': 'pending', 'container_name': 'spotty-monkey-1-0-0', 'status': 'exited', 'running': False, 'oom_killed': False, 'dead': False, 'exit_code': 127, 'error': ''}

shim.log on the cloud instance:

Reading package lists...
E: List directory /var/lib/apt/lists/partial is missing. - Acquire (2: No such file or directory)
/bin/sh: 1: yum: not found
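
For anyone reading the logs above: exit code 127 is the conventional shell status for "command not found", which is consistent with the yum: not found line in shim.log. Below is a minimal sketch of that convention; describe_exit_code is a hypothetical helper for illustration and is not part of dstack.

```python
def describe_exit_code(code: int) -> str:
    """Map a container exit code to its conventional POSIX shell meaning."""
    if code == 0:
        return "success"
    if code == 126:
        return "command found but not executable"
    if code == 127:
        return "command not found"
    if code > 128:
        # Shells report death by signal N as 128 + N.
        return f"terminated by signal {code - 128}"
    return "generic failure"


# The healthcheck above reports 'exit_code': 127.
print(describe_exit_code(127))
```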

Expected behaviour

The configuration runs successfully.

dstack version

0.17.0

Server logs

No response

Additional information

The main error is E: List directory /var/lib/apt/lists/partial is missing. - Acquire (2: No such file or directory). It occurs because the bitnami/prometheus image runs as a non-root user, so apt-get inside the container cannot write to /var/lib/apt/lists. See https://stackoverflow.com/a/57930100 and https://docs.bitnami.com/tutorials/work-with-non-root-containers/
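
One way to tell in advance whether an image is non-root is to read the User field from its config. The sketch below is only an illustration, not how dstack handles it: is_root_user and image_user are hypothetical helper names, and image_user assumes the docker CLI is installed and the image has already been pulled locally.

```python
import json
import subprocess


def is_root_user(user: str) -> bool:
    """Return True if a Docker image `User` value refers to root.

    An empty value means the image sets no USER, so Docker defaults to root.
    The value may also have the form "user:group"; only the user part is checked.
    """
    name = user.split(":", 1)[0].strip()
    return name in ("", "0", "root")


def image_user(image: str) -> str:
    """Read the configured user of a locally available image via `docker inspect`."""
    result = subprocess.run(
        ["docker", "inspect", "--format", "{{json .Config.User}}", image],
        check=True,
        capture_output=True,
        text=True,
    )
    # The template prints the field as a JSON string, e.g. "1001" or "".
    return json.loads(result.stdout)


# Example (requires docker and a pulled image):
#     user = image_user("bitnami/prometheus")
#     print("root" if is_root_user(user) else f"non-root ({user})")
```

With a check like this, the in-container setup step could be skipped or reported with a clear error for non-root images, rather than surfacing as INTERRUPTED_BY_NO_CAPACITY.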

jvstme avatar Apr 04 '24 10:04 jvstme

This issue is stale because it has been open for 30 days with no activity.

peterschmidt85 avatar May 05 '24 01:05 peterschmidt85

This issue was closed because it has been inactive for 14 days since being marked as stale. Please reopen the issue if it is still relevant.

peterschmidt85 avatar May 19 '24 01:05 peterschmidt85

Still relevant

jvstme avatar May 20 '24 11:05 jvstme

This issue is stale because it has been open for 30 days with no activity.

peterschmidt85 avatar Jun 20 '24 01:06 peterschmidt85

This issue was closed because it has been inactive for 14 days since being marked as stale. Please reopen the issue if it is still relevant.

peterschmidt85 avatar Jul 04 '24 01:07 peterschmidt85