rez-suite broken in 2.104.7: cannot add second context
The following fails for me:
Generate two contexts from two different packages:
rez env sgq_schema -o sgqschema.rxt
rez env am_query -o amquery.rxt
Create the suite:
rez suite --create ./misc_suite
Add the first context:
rez suite --add ./sgqschema.rxt --context sqgschema
Attempt to add the second context, per the documentation:
rez suite --add ./amquery.rxt --context amquery
Traceback (most recent call last):
File "/laika/depts/prod_tech/rez/lib/python/linux/rez/bin/rez/rez-suite", line 8, in <module>
sys.exit(run_rez_suite())
File "/net/ent-prod.nfs.laika.com/ifs/laika/depts/prod_tech/rez/lib/python/linux/rez-2.104.7/lib/python2.7/site-packages/rez/cli/_entry_points.py", line 239, in run_rez_suite
return run("suite")
File "/net/ent-prod.nfs.laika.com/ifs/laika/depts/prod_tech/rez/lib/python/linux/rez-2.104.7/lib/python2.7/site-packages/rez/cli/_main.py", line 191, in run
returncode = run_cmd()
File "/net/ent-prod.nfs.laika.com/ifs/laika/depts/prod_tech/rez/lib/python/linux/rez-2.104.7/lib/python2.7/site-packages/rez/cli/_main.py", line 183, in run_cmd
return func(opts, opts.parser, extra_arg_groups)
File "/net/ent-prod.nfs.laika.com/ifs/laika/depts/prod_tech/rez/lib/python/linux/rez-2.104.7/lib/python2.7/site-packages/rez/cli/suite.py", line 200, in command
suite.save(opts.DIR)
File "/net/ent-prod.nfs.laika.com/ifs/laika/depts/prod_tech/rez/lib/python/linux/rez-2.104.7/lib/python2.7/site-packages/rez/suite.py", line 442, in save
shutil.rmtree(path)
File "/net/ent-prod.nfs.laika.com/ifs/laika/home/j/jgerber/packages/python/2.7.18/platform-linux/arch-x86_64/os-CentOS-7.7.1908/lib/python2.7/shutil.py", line 270, in rmtree
rmtree(fullname, ignore_errors, onerror)
File "/net/ent-prod.nfs.laika.com/ifs/laika/home/j/jgerber/packages/python/2.7.18/platform-linux/arch-x86_64/os-CentOS-7.7.1908/lib/python2.7/shutil.py", line 279, in rmtree
onerror(os.rmdir, path, sys.exc_info())
File "/net/ent-prod.nfs.laika.com/ifs/laika/home/j/jgerber/packages/python/2.7.18/platform-linux/arch-x86_64/os-CentOS-7.7.1908/lib/python2.7/shutil.py", line 277, in rmtree
os.rmdir(path)
OSError: [Errno 39] Directory not empty: '/net/ent-prod.nfs.laika.com/ifs/laika/dist/rel/packages/rez-suites/linux/cli_utils/contexts'
At the point of failure, the bin directory and suite.yaml have both been removed. (why?)
I know that this used to work with previous versions.
That's odd, I can't repro this. What platform/os are you on? Are you writing to local disk? A
This is on CentOS 7, and the location in question is mounted over NFS.
Interestingly, when I inspect the directory in question, it appears empty (ls -a yields nothing).
I will try this on local disk to rule out issues with shared storage.
This works locally, so I suppose it is related to NFS. Interestingly, this only just started happening for us. It appears to be a known outcome of using shutil.rmtree on NFS-mounted directories. I wonder if it would be reasonable to add ignore_errors=True to the shutil.rmtree call in save.
One would also have to guard the subsequent call to os.makedirs, to handle the case where the contexts directory was not deleted.
I patched our code to see if this approach fixed our issue and it did.
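For reference, a minimal sketch of the kind of patch described above. It is not the actual rez code: the helper name reset_directory is hypothetical, and it assumes that save only needs the directory to exist afterwards, not that it was truly removed. It avoids exist_ok so that it also runs on Python 2.7.

```python
import errno
import os
import shutil


def reset_directory(path):
    """Best-effort removal of `path`, then ensure it exists as a directory.

    Hypothetical helper, not part of rez.
    """
    # ignore_errors=True swallows the "Directory not empty" error raised
    # when something (e.g. a transient .nfs* file) blocks the final rmdir.
    shutil.rmtree(path, ignore_errors=True)
    try:
        os.makedirs(path)
    except OSError as e:
        # The directory may have survived the rmtree above; that is fine
        # as long as it really is a directory.
        if e.errno != errno.EEXIST or not os.path.isdir(path):
            raise
```

One trade-off: ignore_errors=True hides every failure, including permission errors, so stale files could be left behind in the directory without any warning.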
Ah righto, could you add some more info so I can follow this up and potentially fix it in rez as well? So you're saying there's a known issue with shutil.rmtree over NFS? Do you know how that specifically manifests in the second 'context add' failing?
Cheers A
Sure thing. I based the comment on a cursory Google search for "shutil.rmtree nfs". I will instrument the call to see if I can determine what is triggering the failure. By the time the exception is thrown, the directory is in fact empty.
As I suspected, a file beginning with '.nfs' exists while shutil.rmtree is running. NFS uses these files for bookkeeping purposes; they are managed by the NFS client. So this appears to be a race condition of sorts.
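If the '.nfs' file really is transient, one alternative to ignoring errors outright would be to retry the removal a few times. The following is only a sketch under that assumption: rmtree_with_retry, its defaults, and the _rmtree parameter (there purely to make it testable) are all hypothetical, and it will not help if the process calling rmtree is itself the one holding the file open.

```python
import errno
import shutil
import time


def rmtree_with_retry(path, attempts=5, delay=0.2, _rmtree=shutil.rmtree):
    """Remove `path`, retrying when the directory is reported as not empty.

    Hypothetical helper, not part of rez.
    """
    for attempt in range(attempts):
        try:
            _rmtree(path)
            return
        except OSError as e:
            if e.errno == errno.ENOENT:
                return  # already gone, nothing left to do
            # Only "Directory not empty" is treated as possibly transient;
            # anything else, or the final failed attempt, is re-raised.
            if e.errno != errno.ENOTEMPTY or attempt == attempts - 1:
                raise
            time.sleep(delay)
```

Unlike ignore_errors=True, this still surfaces permission errors and persistent failures, at the cost of a short delay in the failing case.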