Issue running getDbCovariateData on snowflake
Trying to run FeatureExtraction on snowflake which requires a temp emulation schema. Currently the function only allows for oracleTempSchema, which should be updated to tempEmulationSchema so it is not db exclusive. This is what I used below
#run FE
cov <- FeatureExtraction::getDbCovariateData(
connection = con,
oracleTempSchema = tempEmulationSchema,
cdmDatabaseSchema = cdmDatabaseSchema,
cohortTable = cohortTable,
cohortDatabaseSchema = cohortDatabaseSchema,
cohortId = targetCohortIds,
covariateSettings = covSettings,
aggregated = TRUE
)
for which I get this error:
Not sure if others have got this issue or if I am setting this up wrong.
@mdlavallee92 - I don't have access to Snowflake so this will be hard to troubleshoot.
That said, we can add a proper tempEmulationSchema parameter and deprecate the oracleTempSchema so that this is handled consistently in the package.
Note of how that might look from DatabaseConnector docs: https://ohdsi.github.io/DatabaseConnector/reference/renderTranslateExecuteSql.html
Linking to #159 since we may want to do this as a major, breaking change.
@mdlavallee92 since version 3.7.0 you can pass the tempEmulationSchema in the getDbCovariateData function. Can you try if this works for you on Snowflake?
There is still an issue in 3.7.0. While tempEmulationSchema is passed in here, I'm not sure it's making its way all the way down into the java query builders.
> covariateData2 <- getDbCovariateData(
+ connectionDetails = connectionDetails,
+ cdmDatabaseSchema = executionSettings$cdmDatabaseSchema,
+ cohortDatabaseSchema = executionSettings$workDatabaseSchema,
+ cohortTable = executionSettings$cohortTable,
+ cohortIds = c(99994, 99995),
+ covariateSettings = covariateSettings,
+ aggregated = TRUE,
+ tempEmulationSchema = executionSettings$workDatabaseSchema
+ )
Connecting using Snowflake driver
Constructing features on server
|================================================================================================================================================================| 100%
Executing SQL took 1.15 mins
Fetching data from server
Error in `.createErrorReport()`:
! Error executing SQL:
net.snowflake.client.jdbc.SnowflakeSQLException: SQL compilation error:
Object 'OXENIM4HCOV_1' does not exist or not authorized.
An error report has been created at /home/sadowskk/o428/errorReportSql.txt
Run `rlang::last_trace()` to see where the error occurred.
And here is the error report - it got quite far along, so I think that the table was successfully created in the tempEmulationSchema, but then a later query is looking for it in the public schema.
DBMS:
snowflake
Error:
net.snowflake.client.jdbc.SnowflakeSQLException: SQL compilation error:
Object 'OXENIM4HCOV_1' does not exist or not authorized.
SQL:
SELECT cohort_definition_id, covariate_id, count_value, min_value, max_value, average_value, standard_deviation, median_value, p10_value, p25_value, p75_value, p90_value
FROM (
SELECT cohort_definition_id, covariate_id, count_value, min_value, max_value, average_value, standard_deviation, median_value, p10_value, p25_value, p75_value, p90_value FROM oxenim4hcov_1 UNION ALL
SELECT cohort_definition_id, covariate_id, count_value, min_value, max_value, average_value, standard_deviation, median_value, p10_value, p25_value, p75_value, p90_value FROM oxenim4hcov_2
) all_covariates;
Hey @ginberg and @katy-sadowski I believe the bug is on a line in the getDbDefaultCovariateData function when trying to extract continuous features. See here.
When running FE v3.7.1 on snowflake it works with the tempEmulationSchema setting if my covSettings are binary. Believe Katy has a continuous setting for covariates leading to the error.
hi @mdlavallee92 you are right we should use tempEmulationSchema here, thanks for the catch! @katy-sadowski I have created a branch with the updated code, could you try it out?