RumbleML: Missing support for vectors in parameters other than featuresCol
let $data := annotate(
json-file("./src/main/resources/queries/rumbleML/sample-ml-data-flat.json"),
{ "label": "integer", "binaryLabel": "integer", "name": "string", "age": "double", "weight": "double", "booleanCol": "boolean", "nullCol": "null", "stringCol": "string", "stringArrayCol": ["string"], "intArrayCol": ["integer"], "doubleArrayCol": ["double"], "doubleArrayArrayCol": [["double"]] }
)
let $est := get-estimator("BucketedRandomProjectionLSH")
let $tra := $est(
$data,
{ "inputCol": "weight" }
)
for $result in $tra(
$data,
{ }
)
return {
"label": $result.label,
"name": $result.name,
"age": $result.age,
"weight": $result.weight,
"result": $result
}
⚠️ Error [err: RBML0003]LINE:6:COLUMN:12:Invalid Param; Parameter provided to BucketedRandomProjectionLSH causes the following error: requirement failed: Column weight must be of type struct<type:tinyint,size:int,indices:array<int>,values:array<double>> but was actually double.
BucketedRandomProjectionLSH requires a vector in its 'inputCol' param, but 'weight' is a plain double column. Currently, we only support vectors through an internal trick on featuresCol.
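To make the type mismatch concrete, here is a minimal sketch in plain Python (not RumbleML or Spark code) of the struct that the error message asks for. The names `VectorStruct`, `dense_vector` and `is_vector` are hypothetical and exist only for this illustration. The dense encoding used here (type 1, with size and indices left empty) reflects my understanding of how Spark ML serializes vectors, so treat it as an assumption.

```python
from typing import List, Optional, TypedDict


# Hypothetical mirror of the struct named in the error message:
# struct<type:tinyint,size:int,indices:array<int>,values:array<double>>
class VectorStruct(TypedDict):
    type: int                     # assumed encoding: 0 = sparse, 1 = dense
    size: Optional[int]           # assumed to be set only for sparse vectors
    indices: Optional[List[int]]  # assumed to be set only for sparse vectors
    values: List[float]


def dense_vector(values: List[float]) -> VectorStruct:
    """Wrap plain doubles in the dense form of the vector struct."""
    return {
        "type": 1,
        "size": None,
        "indices": None,
        "values": [float(v) for v in values],
    }


def is_vector(cell: object) -> bool:
    """Rough stand-in for the column type check that rejected 'weight'."""
    return isinstance(cell, dict) and set(cell) == {"type", "size", "indices", "values"}


weight = 70.5
print(is_vector(weight))                  # a bare double is not a vector
print(is_vector(dense_vector([weight])))  # a one-element dense vector is
```

The point is only that a double such as `weight` would first have to be wrapped into a vector-shaped value before an 'inputCol' that expects a vector can accept it.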
The same applies to the following classes:
Estimators expecting 'inputCol' as a vector:
- BucketedRandomProjectionLSH
- IDF
- MaxAbsScaler
- MinHashLSH
- MinMaxScaler
- PCA
- StandardScaler
- VectorIndexer
Transformers expecting 'inputCol' as a vector:
Edit: These will be extracted into new issues: ~Transformers 'inputCols' as a vector (each column in inputCols can be a vector):~
~ElementwiseProduct also requires a vector in its scalingVec param.~