bagit-python icon indicating copy to clipboard operation
bagit-python copied to clipboard

Missing Warning on Creating Bag of a Bag

Open theatischbein opened this issue 7 months ago • 3 comments

When creating a bag of bag I would expect a warning or confirmation if I really want to create a bag of a bag.

In general it should be possible to create a bag of a bag, but I would not consider it a default case.

Reproduction

bag = bagit.make_bag('/tmp/test_bag', {'Contact-Name': 'Thea'})

This creates a bag with all info, manifest files and the data directory

/tmp/test_bag
| - bag-info.txt
| - bagit.txt
| - data
| - manifest-sha256.txt
| - manifest-sha512.txt
| - tagmanifest-sha256.txt
| - tagmanifest-sha512.txt

If I run make_bag on the same folder again

bag_again = bagit.make_bag('/tmp/test_bag', {'Contact-Name': 'Thea})

this creates a bag with all info, manifest files and the data directory of the first level bag

/tmp/test_bag
| - bag-info.txt
| - bagit.txt
| - data
    | - bag-info.txt
    | - bagit.txt
    | - data
    | - manifest-sha256.txt
    | - manifest-sha512.txt
    | - tagmanifest-sha256.txt
    | - tagmanifest-sha512.txt
| - manifest-sha256.txt
| - manifest-sha512.txt
| - tagmanifest-sha256.txt
| - tagmanifest-sha512.txt

Expection

I would expect a warning or even a confirmation, e.g.

>>> bag = bagit.make_bag('/tmp/test_bag', {'Contact-Name': 'Thea'})

>>> bag_again = bagit.make_bag('/tmp/test_bag', {'Contact-Name': 'Thea'})
WARNING: You are creating a bag of a bag. Do you want to continue? [N/y] 

Realization

To achieve this behaviour we would need to detect, if a folder is already a bag. One way could be to use the class constructor of Bag and catch the case for valid bag, but continue on Manifest not found.

Example for the make_bag function:

https://github.com/LibraryOfCongress/bagit-python/blob/master/bagit.py#L141C1-L143C3

def make_bag(
    bag_dir, bag_info=None, processes=1, checksums=None, checksum=None, encoding="utf-8"
):

    # ....
    try:
        bagit.Bag(bag_dir)
        while True:
            ans = input("WARNING: You are creating a bag of a bag. Do you want to continue? [N/y]")
            if ans.lower() == 'y':
                break
            elif not ans or ans.lower() == 'n':
                raise UserWarning('Aborted')
    
    except Exception as e:
        pass

    # ...

theatischbein avatar May 31 '25 08:05 theatischbein

I like this as I’ve accidentally made bags within bags when forgetting to add a validate flag! I can’t think of many intentional scenarios where bagging a bag makes sense. Perhaps if this gets approved then there could be also an override flag (like a -y) that will say yes to all baggings of bags in case folks still need that functionality?

Best,

Kieran National Library of Ireland

On Sat 31 May 2025 at 09:40, Thea Tischbein @.***> wrote:

theatischbein created an issue (LibraryOfCongress/bagit-python#186) https://github.com/LibraryOfCongress/bagit-python/issues/186

When creating a bag of bag I would expect a warning or confirmation if I really want to create a bag of a bag.

In general it should be possible to create a bag of a bag, but I would not consider it a default case. Reproduction

bag = bagit.make_bag('/tmp/test_bag', {'Contact-Name': 'Thea'})

This creates a bag with all info, manifest files and the data directory

/tmp/test_bag | - bag-info.txt | - bagit.txt | - data | - manifest-sha256.txt | - manifest-sha512.txt | - tagmanifest-sha256.txt | - tagmanifest-sha512.txt

If I run make_bag on the same folder again

bag_again = bagit.make_bag('/tmp/test_bag', {'Contact-Name': 'Thea})

this creates a bag with all info, manifest files and the data directory of the first level bag

/tmp/test_bag | - bag-info.txt | - bagit.txt | - data | - bag-info.txt | - bagit.txt | - data | - manifest-sha256.txt | - manifest-sha512.txt | - tagmanifest-sha256.txt | - tagmanifest-sha512.txt | - manifest-sha256.txt | - manifest-sha512.txt | - tagmanifest-sha256.txt | - tagmanifest-sha512.txt

Expection

I would expect a warning or even a confirmation, e.g.

bag = bagit.make_bag('/tmp/test_bag', {'Contact-Name': 'Thea'})

bag_again = bagit.make_bag('/tmp/test_bag', {'Contact-Name': 'Thea'}) WARNING: You are creating a bag of a bag. Do you want to continue? [N/y]

Realization

To achieve this behaviour we would need to detect, if a folder is already a bag. One way could be to use the validate function and catch the case for valid or invalid bag, but continue on Manifest not found.

Example for the make_bag function:

https://github.com/LibraryOfCongress/bagit-python/blob/master/bagit.py#L141C1-L143C3

def make_bag( bag_dir, bag_info=None, processes=1, checksums=None, checksum=None, encoding="utf-8" ):

# ....
try:
    bagit.Bag(bag_dir)
    while True:
        ans = input("WARNING: You are creating a bag of a bag. Do you want to continue? [N/y]")
        if ans.lower() == 'y':
            break
        elif not ans or ans.lower() == 'n':
            raise UserWarning('Aborted')

except Exception as e:
    pass

# ...

— Reply to this email directly, view it on GitHub https://github.com/LibraryOfCongress/bagit-python/issues/186, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAITFPVHGTAKDTBCM2Q5WVL3BFTH7AVCNFSM6AAAAAB6JUUNPCVHI2DSMVQWIX3LMV43ASLTON2WKOZTGEYDKMJWGAZTENQ . You are receiving this because you are subscribed to this thread.Message ID: @.***>

kieranjol avatar May 31 '25 12:05 kieranjol

I really like the override flag!

If wanted I could also created a PR, but wanted to discuss it here first.

theatischbein avatar May 31 '25 18:05 theatischbein

A second thought: Maybe it is bad to rely on user input. Especially for integration into automated or parallel scripts.

What do you think of

  • raising a bagit Error "Directory is already a bag"
  • providing a flag for allowing the creating of a bag of bag

theatischbein avatar May 31 '25 20:05 theatischbein