bagit-python
Avoid nested bags by default
Issue #186 describes the unwanted creation of nested bags: at present nothing indicates that a nested bag is being created. This PR closes that issue.
Because nested bags may be wanted in some cases, I added a flag that still allows creating them, but by default a RuntimeError is raised.
Changes
- add a function `is_bag(bag_dir)`, which uses the `Bag` constructor to test whether a directory is already a bag.
- add flag `allow_nested_bag=False` to function `make_bag`
- add logic to function `make_bag` that raises a `RuntimeError` if the given `bag_dir` is already a bag using the new function `is_bag`
- add test cases for the functions `is_bag` and `make_bag`
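
To illustrate the behaviour listed above, here is a minimal, self-contained sketch. It is not the code from this PR and does not import bagit. The `open_bag` helper and `NotABagError` are hypothetical stand-ins for the `Bag` constructor and its error, and the presence of `bagit.txt` is an assumed, simplified test for "this directory is a bag". The bagging step is also reduced to moving the payload into `data` and writing `bagit.txt`.

```python
import os
import tempfile


class NotABagError(Exception):
    """Hypothetical stand-in for the error raised when a directory is not a bag."""


def open_bag(bag_dir):
    """Stand-in for the Bag constructor (assumption: bagit.txt marks a bag)."""
    if not os.path.isfile(os.path.join(bag_dir, "bagit.txt")):
        raise NotABagError(f"{bag_dir} is not a bag")
    return bag_dir


def is_bag(bag_dir):
    """Return True if bag_dir can be opened as a bag, False otherwise."""
    try:
        open_bag(bag_dir)
    except NotABagError:
        return False
    return True


def make_bag(bag_dir, allow_nested_bag=False):
    """Simplified bagging that refuses to re-bag a bag unless the caller opts in."""
    if is_bag(bag_dir) and not allow_nested_bag:
        raise RuntimeError(f"{bag_dir} is already a bag")

    # Move every existing entry into a staging directory, then rename it to "data".
    payload = os.listdir(bag_dir)
    staging = tempfile.mkdtemp(dir=bag_dir)
    for name in payload:
        os.rename(os.path.join(bag_dir, name), os.path.join(staging, name))
    os.rename(staging, os.path.join(bag_dir, "data"))

    # Write the bag declaration that open_bag() looks for.
    with open(os.path.join(bag_dir, "bagit.txt"), "w", encoding="utf-8") as f:
        f.write("BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n")
```

In this sketch, a second call on the same directory raises `RuntimeError`, while passing `allow_nested_bag=True` moves the existing bag (including its `bagit.txt`) into the new `data` directory, which is the nested layout the default is meant to prevent.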
Tests
All tests pass with my changes. See the log for more information.
Details of output of test.py
❯ python test.py
/home/thea/git/bagit-python/bagit.py:1451: DeprecationWarning: 'count' is passed as positional argument
s = re.sub(r"%0D", "\r", s, re.IGNORECASE)
/home/thea/git/bagit-python/bagit.py:1452: DeprecationWarning: 'count' is passed as positional argument
s = re.sub(r"%0A", "\n", s, re.IGNORECASE)
.........../home/thea/git/bagit-python/bagit.py:165: DeprecationWarning: The `checksum` argument for `make_bag` should be replaced with `checksums`
warnings.warn(
...Disabling requested hash algorithm not-really-a-name: hashlib does not support it
An error occurred creating a bag in /tmp/tmp8450qsbp
Traceback (most recent call last):
File "/home/thea/git/bagit-python/bagit.py", line 260, in make_bag
total_bytes, total_files = make_manifests(
~~~~~~~~~~~~~~^
"data", processes, algorithms=checksums, encoding=encoding
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/home/thea/git/bagit-python/bagit.py", line 1275, in make_manifests
checksums = [manifest_line_generator(i) for i in _walk(data_dir)]
~~~~~~~~~~~~~~~~~~~~~~~^^^
File "/home/thea/git/bagit-python/bagit.py", line 1418, in generate_manifest_lines
hashers = get_hashers(algorithms)
File "/home/thea/git/bagit-python/bagit.py", line 1136, in get_hashers
raise ValueError(
...<3 lines>...
)
ValueError: Unable to continue: hashlib does not support any of the requested algorithms!
.Bag directory /home/thea/git/bagit-python/this-directory-does-not-exist does not exist
.....The following files do not have read permissions:
('/tmp/tmpnompusg1/loc/2478433644_2839c5e8b8_o_d.jpg',)
An error occurred creating a bag in /tmp/tmpnompusg1
Traceback (most recent call last):
File "/home/thea/git/bagit-python/bagit.py", line 229, in make_bag
raise BagError(
_("Read permissions are required to calculate file fixities")
)
bagit.BagError: Read permissions are required to calculate file fixities
.Unable to write to the following directories and files:
['/tmp/tmpgl1u4_go']
An error occurred creating a bag in /tmp/tmpgl1u4_go
Traceback (most recent call last):
File "/home/thea/git/bagit-python/bagit.py", line 213, in make_bag
raise BagError(
_("Missing permissions to move all files and directories"))
bagit.BagError: Missing permissions to move all files and directories
.The following directories do not have read permissions:
('/tmp/tmp4qzlyr7a/loc',)
An error occurred creating a bag in /tmp/tmp4qzlyr7a
Traceback (most recent call last):
File "/home/thea/git/bagit-python/bagit.py", line 229, in make_bag
raise BagError(
_("Read permissions are required to calculate file fixities")
)
bagit.BagError: Read permissions are required to calculate file fixities
.Unable to write to the following directories and files:
['/tmp/tmp6t3bs2_m', '/tmp/tmp6t3bs2_m/loc']
An error occurred creating a bag in /tmp/tmp6t3bs2_m
Traceback (most recent call last):
File "/home/thea/git/bagit-python/bagit.py", line 213, in make_bag
raise BagError(
_("Missing permissions to move all files and directories"))
bagit.BagError: Missing permissions to move all files and directories
..........The following files do not have read permissions:
('/tmp/tmpcutz17p7/bag-info.txt',)
..........Creating bag for directory /tmp/tmp65leir02
Creating data directory
Moving si to /tmp/tmp65leir02/tmpjdlclbxc/si
Moving loc to /tmp/tmp65leir02/tmpjdlclbxc/loc
Moving README to /tmp/tmp65leir02/tmpjdlclbxc/README
Moving /tmp/tmp65leir02/tmpjdlclbxc to data
Using 1 processes to generate manifests: sha256, sha512
Generating manifest lines for file data/README
Generating manifest lines for file data/loc/2478433644_2839c5e8b8_o_d.jpg
Generating manifest lines for file data/loc/3314493806_6f1db86d66_o_d.jpg
Generating manifest lines for file data/si/2584174182_ffd5c24905_b_d.jpg
Generating manifest lines for file data/si/4011399822_65987a4806_b_d.jpg
Creating bagit.txt
Creating bag-info.txt
Creating /tmp/tmp65leir02/tagmanifest-sha256.txt
Creating /tmp/tmp65leir02/tagmanifest-sha512.txt
..............................bag-info.txt defines multiple Payload-Oxum values!
...data/README exists in manifest but was not found on filesystem
data/extra_file exists on filesystem but is not in the manifest
...data/README sha256 validation failed: expected="9006a02daf291a3ce8eebbb094ed3d17fcb0177b8e8d3421fbb8a080a2be48bf" found="d54d79889e20997c4b265488131fb593580f1885b3a5d75df49fe7f6604b66d0"
data/README sha512 validation failed: expected="06f3dedbd5c7796b75a7d5021aaf54559e0679c27b37d355f65ea64e31fd29a70b6e06e5c0b73fad809c579fb0f6fb7076ceec055c17a173e49007955c9f5820" found="c758e703c015e05a7e0631cb4f15ed5397c318e8ad56e1227ad2ce974d00c33642ec413172414545102708cb326176935e30e41c1f72733c894c2fb031477145"
..tmpk6fiecpp/tagfile md5 validation failed: expected="8e2af7a0143c7b8f4de0b3fc90f27354" found="098f6bcd4621d373cade4e832627b4f6"
tmpk6fiecpp/tagfile exists in manifest but was not found on filesystem
.tmp79jtp40e/tagfolder/tagfile md5 validation failed: expected="8e2af7a0143c7b8f4de0b3fc90f27354" found="098f6bcd4621d373cade4e832627b4f6"
tmp79jtp40e/tagfolder/tagfile exists in manifest but was not found on filesystem
.Unable to calculate file hashes for /tmp/tmprxq331w5
Traceback (most recent call last):
File "/home/thea/git/bagit-python/bagit.py", line 916, in _validate_entries
pool = multiprocessing.Pool(
processes if processes else None, initializer=worker_init
)
File "/usr/lib/python3.13/unittest/mock.py", line 1169, in __call__
return self._mock_call(*args, **kwargs)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.13/unittest/mock.py", line 1173, in _mock_call
return self._execute_mock_call(*args, **kwargs)
~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.13/unittest/mock.py", line 1228, in _execute_mock_call
raise effect
RuntimeError
.bag-info.txt exists in manifest but was not found on filesystem
data/extra_file exists on filesystem but is not in the manifest
.data/loc/2478433644_2839c5e8b8_o_d.jpg md5 validation failed: expected="9a2b89e9940fea6ac3a0cc71b0a933a0" found="Could not read /tmp/tmprxwtxkyt/data/loc/2478433644_2839c5e8b8_o_d.jpg: [Errno 13] Permission denied: '/tmp/tmprxwtxkyt/data/loc/2478433644_2839c5e8b8_o_d.jpg'"
.bag-info.txt exists in manifest but was not found on filesystem
data/README exists in manifest but was not found on filesystem
data/extra exists on filesystem but is not in the manifest
.data/README md5 validation failed: expected="8e2af7a0143c7b8f4de0b3fc90f27354" found="fd41543285d17e7c29cd953f5cf5b955"
................bag-info.txt defines multiple Payload-Oxum values!
...data/README exists in manifest but was not found on filesystem
data/extra_file exists on filesystem but is not in the manifest
...data/README sha256 validation failed: expected="9006a02daf291a3ce8eebbb094ed3d17fcb0177b8e8d3421fbb8a080a2be48bf" found="d54d79889e20997c4b265488131fb593580f1885b3a5d75df49fe7f6604b66d0"
data/README sha512 validation failed: expected="06f3dedbd5c7796b75a7d5021aaf54559e0679c27b37d355f65ea64e31fd29a70b6e06e5c0b73fad809c579fb0f6fb7076ceec055c17a173e49007955c9f5820" found="c758e703c015e05a7e0631cb4f15ed5397c318e8ad56e1227ad2ce974d00c33642ec413172414545102708cb326176935e30e41c1f72733c894c2fb031477145"
..tmp9s2ei8kh/tagfile md5 validation failed: expected="8e2af7a0143c7b8f4de0b3fc90f27354" found="098f6bcd4621d373cade4e832627b4f6"
tmp9s2ei8kh/tagfile exists in manifest but was not found on filesystem
.tmp5na6jn06/tagfolder/tagfile md5 validation failed: expected="8e2af7a0143c7b8f4de0b3fc90f27354" found="098f6bcd4621d373cade4e832627b4f6"
tmp5na6jn06/tagfolder/tagfile exists in manifest but was not found on filesystem
.bag-info.txt exists in manifest but was not found on filesystem
data/extra_file exists on filesystem but is not in the manifest
.data/loc/2478433644_2839c5e8b8_o_d.jpg md5 validation failed: expected="9a2b89e9940fea6ac3a0cc71b0a933a0" found="Could not read /tmp/tmpcmz8z7bq/data/loc/2478433644_2839c5e8b8_o_d.jpg: [Errno 13] Permission denied: '/tmp/tmpcmz8z7bq/data/loc/2478433644_2839c5e8b8_o_d.jpg'"
.bag-info.txt exists in manifest but was not found on filesystem
data/README exists in manifest but was not found on filesystem
data/extra exists on filesystem but is not in the manifest
.data/README md5 validation failed: expected="8e2af7a0143c7b8f4de0b3fc90f27354" found="fd41543285d17e7c29cd953f5cf5b955"
.
----------------------------------------------------------------------
Ran 117 tests in 1.151s
OK