Cannot convert geojson to PASCALVOC2012 format using createDataSpaceNet.py
Hi, I followed the instructions in README and ran the following command: python spacenetutilities/scripts/createDataSpaceNet.py /home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train --srcImageryDirectory RGB-PanSharpen --outputDirectory /home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/annotations --annotationType PASCALVOC2012 --convertTo8Bit --imgSizePix 400
The traceback is like this:
fullpathImageDirectory = /home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/RGB-PanSharpen
fullpathGeoJsonDirectory = /home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/geojson/buildings
[['/home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/RGB-PanSharpen/RGB-PanSharpen_AOI_4_Shanghai_img1001.tif', 'RGB-PanSharpen']]
buildings
| 0.00, 0.00, 121.61|
| 0.00,-0.00, 31.42|
| 0.00, 0.00, 1.00|
Creating Chips: 0%| | 0/4 [00:00<?, ?it/s]Creating output file that is 400P x 400L.
Processing /home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/RGB-PanSharpen/RGB-PanSharpen_AOI_4_Shanghai_img1001.tif [1/1] : 0...10...20...30...40...50...60...70...80...90...100 - done.
Creating output file that is 400P x 400L.
Processing /home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/RGB-PanSharpen/RGB-PanSharpen_AOI_4_Shanghai_img1001.tif [1/1] : 0...10...20...30...40...50...60...70...80...90...100 - done.
Creating Chips: 50%|██████████████████████████████████████████████████████████ | 2/4 [00:00<00:00, 16.34it/s]Creating output file that is 400P x 400L.
Processing /home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/RGB-PanSharpen/RGB-PanSharpen_AOI_4_Shanghai_img1001.tif [1/1] : 0...10...20...30...40...50...60...70...80...90...100 - done.
Creating output file that is 400P x 400L.
Processing /home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/RGB-PanSharpen/RGB-PanSharpen_AOI_4_Shanghai_img1001.tif [1/1] : 0...10...20...30...40...50...60...70...80...90...100 - done.
Creating Chips: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 16.11it/s]
['/home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/annotations/geojson/buildings/buildings__121.6147392_31.4137359.geojson']
Traceback (most recent call last):
  File "spacenetutilities/scripts/createDataSpaceNet.py", line 321, in <module>
    bboxResize= args.boundingBoxResize
  File "spacenetutilities/scripts/createDataSpaceNet.py", line 88, in processChipSummaryList
    bboxResize=bboxResize
  File "/home/yuankunhao/datasets/spacenet/utilities/spacenetutilities/labeltools/pascalVOCLabel.py", line 212, in geoJsonToPASCALVOC2012
    borderValue=255
  File "/home/yuankunhao/datasets/spacenet/utilities/spacenetutilities/labeltools/pascalVOCLabel.py", line 117, in geoJsonToPASCALVOC2012SegmentCls
    source_layer = gpd.read_file(geoJson)
  File "/root/anaconda3/envs/spacenet/lib/python3.7/site-packages/geopandas/io/file.py", line 71, in read_file
    with reader(path_or_bytes, **kwargs) as features:
  File "/root/anaconda3/envs/spacenet/lib/python3.7/site-packages/fiona/env.py", line 397, in wrapper
    return f(*args, **kwargs)
  File "/root/anaconda3/envs/spacenet/lib/python3.7/site-packages/fiona/__init__.py", line 249, in open
    path = parse_path(fp)
  File "/root/anaconda3/envs/spacenet/lib/python3.7/site-packages/fiona/path.py", line 132, in parse_path
    elif path.startswith('/vsi'):
AttributeError: 'list' object has no attribute 'startswith'
The printed line [['/home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/RGB-PanSharpen/RGB-PanSharpen_AOI_4_Shanghai_img1001.tif', 'RGB-PanSharpen']] is what I saw when I inspected geoJson at line 117 of pascalVOCLabel.py, i.e. the argument in source_layer = gpd.read_file(geoJson).
Does anyone know what went wrong and how to convert correctly to PASCALVOC2012 format?
Many thanks!
From what I can see, the geoJson passed to gpd.read_file(geoJson) is a list object, but the function expects a path string or URL. Is this a bug?
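The mismatch can be reproduced without geopandas or fiona installed. The sketch below uses a hypothetical stand-in, parse_path_like, which calls startswith on its argument the way the fiona/path.py frame in the traceback does. It is not fiona's actual implementation, only an illustration of why a list fails where a string works.

```python
def parse_path_like(path):
    """Hypothetical stand-in for a path parser that assumes a str argument."""
    # str has a startswith method; list does not, so a list fails right here.
    if path.startswith('/vsi'):
        return ('vsi', path)
    return ('file', path)


def try_parse(path):
    """Return the parsed result, or the AttributeError message if parsing fails."""
    try:
        return parse_path_like(path)
    except AttributeError as exc:
        return str(exc)


string_result = try_parse('/tmp/buildings.geojson')
list_result = try_parse(['/tmp/buildings.geojson'])
print(string_result)
print(list_result)
```

So startswith here is a method looked up on the Python argument itself, not a key that has to exist inside the geojson file.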
I am getting this same error when running python spacenetutilities/scripts/createDataSpaceNet.py --srcImageryDirectory RGB-PanSharpen --outputDirectory /home/yuankunhao/datasets/spacenet/AOI_4_Shanghai_Train/annotations --annotationType PASCALVOC2012 --convertTo8Bit --imgSizePix 400
My OS: macOS Sierra, version 10.13.6
Which version of spacenetutilities are you using? We recommend using the V3 branch for working with those data.
I am on branch spacenetV3. :)
I installed the dependencies, then pulled down the repo (spacenetV3). After that I went to the directory spacenetutilities/scripts and ran the command
python createDataSpaceNet.py /AOI_2_Vegas/AOI_2_Vegas_Train/ \
    --srcImageryDirectory RGB-PanSharpen \
    --outputDirectory /AOI_2_Vegas/annotations/ \
    --annotationType PASCALVOC2012 \
    --imgSizePix 400
OK. Good to know. There are a few long-standing issues with this project that will be resolved in a related project to be announced shortly. In the meantime, I'm not entirely sure why your image path is being encapsulated in list(list(path)) rather than just list(path), but that's presumably the issue. I'd start there.
Sorry I can't be of more help!
Okay, sounds good. I will look into this, and if I find a solution I will post it here so that others who run into it have a short-term fix until the announcement comes 😄
So it seems that gpd.read_file(geoJson) is looking for an attribute named 'startswith', but there is no attribute in the JSON file (example: buildings_AOI_2_Vegas_img2364.geojson) with the name 'startswith'.
I was using the V3 branch as well. I finally gave up on this and converted the annotations myself.
I've forked and added
...
if isinstance(geoJson, list):
    geoJson = geoJson[0]
source_layer = gpd.read_file(geoJson) # this was existing
...
at lines ~116 and ~144 in labeltools/pascalVOCLabel.py. This got me past the error described in this issue, but led me to another issue:
fiona.errors.DriverError: '/qfs/projects/sgdatasc/spacenet/Vegas_processed_train/annotations/geojson/buildings/buildings__-115.3075176_36.1265426997.geojson' not recognized as a supported file format.
I'm going to look to clean up this error and if it works, create a pull request.
It looks like the second error corresponds to an empty geojson file, so this issue should be done. I'll create a pull request.
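For anyone who wants the isinstance check above in one reusable place, here is a minimal sketch. first_path is a hypothetical helper, not part of spacenetutilities; it assumes the path you want is always the first element, which matches the nested list printed earlier in this thread but may not hold for every caller.

```python
import os


def first_path(value):
    """Unwrap nested lists or tuples until a single path-like value remains.

    Hypothetical helper: assumes the wanted path is the first element
    at every nesting level.
    """
    while isinstance(value, (list, tuple)):
        if not value:
            raise ValueError("empty sequence, no path to read")
        value = value[0]
    # os.fspath returns str/bytes unchanged and raises TypeError for non-paths.
    return os.fspath(value)


nested = [['/data/buildings.geojson', 'RGB-PanSharpen']]
print(first_path(nested))
print(first_path('/data/buildings.geojson'))
```

The result could then be passed to gpd.read_file in place of the raw geoJson argument.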
@alexhagen thanks! We appreciate it.
That DriverError is a common issue for empty geojsons. I'd recommend adding the following block to catch it:
# at head of the file
from fiona.errors import DriverError
from fiona._err import CPLE_OpenFailedError  # old versions of fiona threw this error instead

try:
    source_layer = gpd.read_file(geoJson)
except (DriverError, CPLE_OpenFailedError):
    # assumed fallback: an empty GeoDataFrame (gpd.read_file() needs a path argument)
    source_layer = gpd.GeoDataFrame()
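An alternative to catching the exception is to check the file before handing it to gpd.read_file. The sketch below uses only the standard library; geojson_has_features is a hypothetical helper, and it assumes an "empty" geojson is either a zero-byte file, unparseable JSON, or a FeatureCollection with no features. Which of these the chipping script actually writes has not been confirmed in this thread.

```python
import json
import os


def geojson_has_features(path):
    """Return True only if the file parses as JSON and lists at least one feature.

    Hypothetical helper: treats zero-byte files, invalid JSON, and empty
    "features" arrays all as empty.
    """
    if os.path.getsize(path) == 0:
        return False
    try:
        with open(path) as handle:
            data = json.load(handle)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return isinstance(data, dict) and bool(data.get("features"))
```

A caller could skip the chip, or build an empty label, whenever this returns False, and only call gpd.read_file otherwise.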