
Remote Ansible Installs Fail

Open BHSDuncan opened this issue 1 year ago • 2 comments

The k8s-install.yaml playbook in playbooks/ makes use of ansible_user_dir. On a remote install, the playbook fails with a permissions error around line 276 unless the username used for the remote host also exists as a user on the local system from which the playbook is kicked off.

I verified this by creating the user on the local machine that kicks off the playbook (in fact, just a user dir in /home/ with the appropriate name, with r/w permissions set temporarily). After uninstalling and re-running, the playbook works.
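
A minimal sketch of how this kind of failure can arise, assuming the failing task runs on the control node (for example through `delegate_to: localhost`). That assumption is not confirmed against line 276 of k8s-install.yaml. The play, task names, and `example.txt` below are hypothetical and only illustrate that `ansible_user_dir` is a fact gathered from the remote host, so the path it expands to may not exist locally:

```yaml
# Hypothetical reproduction; not copied from k8s-install.yaml.
- hosts: all                  # remote target, connected to as e.g. "remoteuser"
  gather_facts: true
  tasks:
    - name: Show the home directory reported by the remote host
      ansible.builtin.debug:
        var: ansible_user_dir        # e.g. /home/remoteuser

    - name: Write a file under that path on the control node
      ansible.builtin.copy:
        content: "example\n"
        dest: "{{ ansible_user_dir }}/example.txt"   # still the remote user's home path
      delegate_to: localhost         # errors if that directory is missing or not writable locally
```

If the real task behaves like the second one, this would explain why creating a matching directory under /home/ on the local machine makes the run succeed.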

It looks like the change was made sometime last year: https://github.com/NVIDIA/cloud-native-stack/commit/55bdd53be90398b23dff1da95681d6afc88bab44

BHSDuncan avatar Apr 01 '24 17:04 BHSDuncan

@BHSDuncan we understand your concern. We have always used the ubuntu/nvidia user, which is why we didn't hit this issue. I will verify with a different user, push the fix, and let you know.

The reason for changing from /tmp to ansible_user_dir is that we saw some issues, such as permissions problems, when we tried to use /tmp.

angudadev avatar Apr 01 '24 17:04 angudadev

@BHSDuncan I have changed it back to /tmp and tested with a different user on remote machines. It works for me; please pull the latest and let us know if you still hit any issues.

Thanks

angudadev avatar Apr 02 '24 15:04 angudadev

@BHSDuncan closing this issue as it's fixed

angudadev avatar May 15 '24 18:05 angudadev