Fix phototour dataset
Fixes this issue https://github.com/pytorch/vision/issues/8732
Code to test the changes
import torch
from torchvision.datasets import PhotoTour


def test_phototour_new():
    # Define the root directory where datasets will be stored
    root = "./datasets"
    # List of datasets to test
    datasets = ["trevi", "notredame", "halfdome"]
    for dataset_name in datasets:
        print(f"\nTesting dataset: {dataset_name}")
        # Initialize the dataset
        dataset = PhotoTour(
            root=root,
            name=dataset_name,
            train=True,
            transform=None,  # No need for transforms
            download=True,  # Download the datasets if not already present
        )
        # Check if the dataset has been loaded successfully
        assert len(dataset) > 0, f"Dataset {dataset_name} is empty!"
        print(f"Number of patches in {dataset_name}: {len(dataset)}")
        # Retrieve a sample
        sample = dataset[0]
        print(f"Sample type for {dataset_name}: {type(sample)}")
        if isinstance(sample, torch.Tensor):
            print(f"Sample shape: {sample.shape}")
        # Print the mean and standard deviation of the dataset
        print(f"Mean: {dataset.mean}, Std: {dataset.std}")
        # Access a few samples to verify functionality
        for i in range(min(5, len(dataset))):  # Test first 5 samples
            try:
                data = dataset[i]
                print(f"Sample {i}: {data.shape if isinstance(data, torch.Tensor) else type(data)}")
            except Exception as e:
                print(f"Error accessing sample {i}: {e}")


if __name__ == "__main__":
    test_phototour_new()
Output
Testing dataset: trevi
Downloading https://phototour.cs.washington.edu/patches/trevi.zip to ./datasets/trevi.zip
100.0%
Number of patches in trevi: 101120
Sample type for trevi: <class 'torch.Tensor'>
Sample shape: torch.Size([64, 64])
Mean: 0.4832, Std: 0.1913
Sample 0: torch.Size([64, 64])
Sample 1: torch.Size([64, 64])
Sample 2: torch.Size([64, 64])
Sample 3: torch.Size([64, 64])
Sample 4: torch.Size([64, 64])
Testing dataset: notredame
Downloading https://phototour.cs.washington.edu/patches/notredame.zip to ./datasets/notredame.zip
100.0%
Skipping invalid patch at (0, 0) in ./datasets/notredame/patches0000 (Custom).bmp: cannot reshape array of size 12288 into shape (64,64)
Skipping invalid patch at (64, 0) in ./datasets/notredame/patches0000 (Custom).bmp: cannot reshape array of size 12288 into shape (64,64)
Skipping invalid patch at (0, 64) in ./datasets/notredame/patches0000 (Custom).bmp: cannot reshape array of size 12288 into shape (64,64)
Skipping invalid patch at (64, 64) in ./datasets/notredame/patches0000 (Custom).bmp: cannot reshape array of size 12288 into shape (64,64)
Number of patches in notredame: 104192
Sample type for notredame: <class 'torch.Tensor'>
Sample shape: torch.Size([64, 64])
Mean: 0.4757, Std: 0.1931
Sample 0: torch.Size([64, 64])
Sample 1: torch.Size([64, 64])
Sample 2: torch.Size([64, 64])
Sample 3: torch.Size([64, 64])
Sample 4: torch.Size([64, 64])
Testing dataset: halfdome
Downloading https://phototour.cs.washington.edu/patches/halfdome.zip to ./datasets/halfdome.zip
100.0%
Number of patches in halfdome: 107776
Sample type for halfdome: <class 'torch.Tensor'>
Sample shape: torch.Size([64, 64])
Mean: 0.4718, Std: 0.1791
Sample 0: torch.Size([64, 64])
Sample 1: torch.Size([64, 64])
Sample 2: torch.Size([64, 64])
Sample 3: torch.Size([64, 64])
Sample 4: torch.Size([64, 64])
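A note on the "Skipping invalid patch" lines in the notredame output: 12288 equals 64 * 64 * 3, which would be consistent with a three-channel (RGB) crop being reshaped as if it were single-channel. That is an inference from the log, not something stated in this PR. The sketch below reproduces the same NumPy error on synthetic data and shows one possible way to handle it. The helper name `to_gray_patch` and the plain channel average are assumptions for illustration, not the code in this PR.

```python
import numpy as np

PATCH = 64  # patch side length, matching the (64, 64) shape in the log


def to_gray_patch(pixels: np.ndarray) -> np.ndarray:
    """Return a (64, 64) uint8 patch from flat grayscale or RGB pixel data.

    Hypothetical helper: the channel average is an assumption, not
    necessarily how the dataset loader should convert color bitmaps.
    """
    flat = np.asarray(pixels).reshape(-1)
    if flat.size == PATCH * PATCH:
        # Already single-channel: just restore the 2D layout.
        return flat.reshape(PATCH, PATCH).astype(np.uint8)
    if flat.size == PATCH * PATCH * 3:
        # Three values per pixel: restore (H, W, C) and average the channels.
        rgb = flat.reshape(PATCH, PATCH, 3).astype(np.float32)
        return rgb.mean(axis=2).round().astype(np.uint8)
    raise ValueError(f"unexpected patch size: {flat.size}")


# Synthetic RGB crop with 64 * 64 * 3 = 12288 values.
rgb_patch = np.full(PATCH * PATCH * 3, 128, dtype=np.uint8)

# Reshaping it directly to (64, 64) fails the same way as in the log.
try:
    rgb_patch.reshape(PATCH, PATCH)
    reshape_error = ""
except ValueError as err:
    reshape_error = str(err)

gray = to_gray_patch(rgb_patch)
print(reshape_error)
print(gray.shape)
```

Whether skipping these four patches or converting them is the right behavior depends on how the labels index into the patch list, which this sketch does not address.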
Pytest for download
pytest test/test_datasets_download.py -vvv -k phototour
test/test_datasets_download.py::test_url_is_accessible[PhotoTour, https://phototour.cs.washington.edu/patches/halfdome.zip] PASSED [ 33%]
test/test_datasets_download.py::test_url_is_accessible[PhotoTour, https://phototour.cs.washington.edu/patches/notredame.zip] PASSED [ 66%]
test/test_datasets_download.py::test_url_is_accessible[PhotoTour, https://phototour.cs.washington.edu/patches/trevi.zip] PASSED [100%]
Pytest for dataset
pytest test/test_datasets.py -k PhotoTour
test/test_datasets.py ......s [100%]
======================================== 6 passed, 1 skipped, 570 deselected in 0.26s ========================================
Thanks venkatram-dev
That is a different (earlier) version of the dataset than the one torchvision used. The version we had in torchvision is here: https://cmp.felk.cvut.cz/~mishkdmy/datasets/BrownPhotoTour/
The issue is that I don't have the original zips, only the images and labels themselves, so the re-packed zips have different hashes.
Here is the PR with the original data, not different scenes: https://github.com/pytorch/vision/pull/9002 @NicolasHug
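To illustrate why re-packed zips end up with different hashes even when the files inside are identical: a zip archive stores per-entry metadata such as modification timestamps, so a whole-file digest changes when that metadata changes. The sketch below is a minimal demonstration using only the standard library. The file names, contents, and timestamps are made up, and SHA-256 is used only as an example digest; the same holds for any whole-file checksum.

```python
import hashlib
import io
import zipfile


def pack(files: dict, date_time: tuple) -> bytes:
    """Build an in-memory zip with a fixed timestamp on every entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in sorted(files.items()):
            info = zipfile.ZipInfo(name, date_time=date_time)
            info.compress_type = zipfile.ZIP_DEFLATED
            zf.writestr(info, data)
    return buf.getvalue()


def archive_digest(archive: bytes) -> str:
    """Digest of the raw archive bytes (what a download checksum sees)."""
    return hashlib.sha256(archive).hexdigest()


def content_digest(archive: bytes) -> str:
    """Digest of entry names and decompressed contents only."""
    h = hashlib.sha256()
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        for name in sorted(zf.namelist()):
            h.update(name.encode("utf-8"))
            h.update(zf.read(name))
    return h.hexdigest()


# Made-up stand-ins for the images and labels.
files = {"info.txt": b"0 0\n1 0\n", "patches0000.bmp": b"\x00" * 256}

original = pack(files, (2010, 1, 1, 0, 0, 0))
repacked = pack(files, (2025, 1, 1, 0, 0, 0))

print(archive_digest(original) == archive_digest(repacked))
print(content_digest(original) == content_digest(repacked))
```

A content-level digest like the hypothetical `content_digest` above is one way to confirm that a re-packed archive carries the same data, though it is not a drop-in replacement for a checksum of the downloaded file.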