Document use of java config params

Open vpipkt opened this issue 5 years ago • 3 comments

Accessing S3 buckets, even public ones, requires passing some Java options along. We can also use Java options to choose whether to prefer the GDAL reader, among other things.

Here is a quick example of using unsigned requests. AWS_NO_SIGN_REQUEST has to be passed in so that the geotrellis.raster.gdal.option configuration is set.

We should add a general discussion of this to the docs pages.

import pyrasterframes
from pyrasterframes.utils import create_rf_spark_session

spark = create_rf_spark_session(**{'spark.driver.extraJavaOptions': '-Dgeotrellis.raster.gdal.option.AWS_NO_SIGN_REQUEST=YES'})

df = spark.read.raster('s3://s22s-test-geotiffs/luray_snp/B11.jp2')
df.count()
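
For several GDAL options at once, the Java options string could be assembled from a dict. This is only a sketch: gdal_java_options is a hypothetical helper, not part of pyrasterframes, and the only thing taken from the example above is the -Dgeotrellis.raster.gdal.option. prefix. Whether a particular option actually takes effect when passed this way needs to be checked case by case.

```python
def gdal_java_options(options):
    """Build a JVM options string from a dict of GDAL config options.

    Hypothetical helper for illustration; each entry becomes a
    -Dgeotrellis.raster.gdal.option.<KEY>=<VALUE> system property.
    """
    prefix = "-Dgeotrellis.raster.gdal.option."
    return " ".join(f"{prefix}{key}={value}" for key, value in options.items())


# Same option string as in the example above, built from a dict.
java_opts = gdal_java_options({"AWS_NO_SIGN_REQUEST": "YES"})

# Keyword arguments that could be passed on as create_rf_spark_session(**spark_conf).
spark_conf = {"spark.driver.extraJavaOptions": java_opts}
```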

vpipkt avatar Jan 12 '21 15:01 vpipkt

FWIW, the exact option given above does not actually enable anonymous reads. To get those, set os.environ['AWS_NO_SIGN_REQUEST'] = 'YES' before creating the Spark session.
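
A minimal sketch of that ordering, assuming the variable only needs to be in the environment at the moment the session is created. create_session_for_anonymous_reads is a hypothetical wrapper, not a pyrasterframes function; create_rf_spark_session and the bucket path come from the example at the top of this issue.

```python
import os


def create_session_for_anonymous_reads(session_factory, **spark_conf):
    """Set AWS_NO_SIGN_REQUEST, then create the Spark session.

    Hypothetical wrapper: the environment variable is set first so that
    it is already in place when session_factory runs.
    """
    os.environ["AWS_NO_SIGN_REQUEST"] = "YES"
    return session_factory(**spark_conf)


def demo():
    # Requires a working RasterFrames installation; not called automatically.
    from pyrasterframes.utils import create_rf_spark_session

    spark = create_session_for_anonymous_reads(create_rf_spark_session)
    df = spark.read.raster("s3://s22s-test-geotiffs/luray_snp/B11.tif")
    print(df.count())
```

Calling demo() in a RasterFrames environment would run the read end to end; the wrapper itself has no dependency beyond the standard library.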

vpipkt avatar Jan 13 '21 14:01 vpipkt

Hi @vpipkt, I tried reading 's3://s22s-test-geotiffs/luray_snp/B11.tif' in a RasterFrames environment, and the earlier issue is resolved. However, when I read 's3://s22s-test-geotiffs/luray_snp/B11.jp2', executing 'df.count()' raises the error "CPLE_OpenFailed(4) "Open failed." Unable to open EPSG support file gcs.csv. Try setting the GDAL_DATA environment variable to point to the directory containing EPSG csv files."
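
A quick way to check what the error message is asking for might look like the sketch below. gdal_data_problem is a hypothetical helper, and it only assumes what the message itself states: that GDAL_DATA should point to a directory containing gcs.csv.

```python
import os


def gdal_data_problem(env=None, marker="gcs.csv"):
    """Return a description of the GDAL_DATA problem, or None if it looks fine.

    Hypothetical diagnostic: checks that GDAL_DATA is set and that the
    directory it names contains the file mentioned in the error message.
    """
    env = os.environ if env is None else env
    gdal_data = env.get("GDAL_DATA")
    if not gdal_data:
        return "GDAL_DATA is not set"
    if not os.path.isfile(os.path.join(gdal_data, marker)):
        return f"{marker} not found in {gdal_data}"
    return None
```

Running gdal_data_problem() in the same Python process before creating the Spark session would show whether the variable is visible there at all.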

JenniferYingyiWu2020 avatar Jan 14 '21 02:01 JenniferYingyiWu2020

@JenniferYingyiWu2020 in the interest of keeping this issue focused, let's try to resolve this in Gitter if possible. It seems that CPLE_OpenFailed(4) is a common GDAL configuration problem, not specific to RasterFrames itself.

vpipkt avatar Jan 14 '21 19:01 vpipkt