Liangcai Li
We are hitting this exception when running regression training repeatedly with the xgboost JVM jars built from the latest master branch using Scala 2.11, along with our Spark example...
There will be more than 100 cases. We may need multiple sub-issues for this. [Click to see the full list of type casts that CPU ORC supports.](https://github.com/apache/orc/blob/main/java/core/src/java/org/apache/orc/impl/ConvertTreeReaderFactory.java#L2258)
CPU ORC reading supports schema evolution as described in issue #135, but GPU does not. GPU will run into exceptions when users specify a reader schema which is different from...
Test to reproduce this bug. ``` @Test void testPartitionEmptyTable() { try (Table t = new Table.TestBuilder() .timestampDayColumn() .build(); ColumnVector parts = ColumnVector .fromInts(); PartitionedTable pt = t.partition(parts, 3)) { assertArrayEquals(new...
closes https://github.com/NVIDIA/spark-rapids/issues/6313 This PR adds the columnar support for the new API `mapInArrow` which is introduced in Spark 3.3.0. ***Performance*** - About 6.8 GB Parquet data in local files. -...
**Describe the bug** The GPU JSON reader cannot read a JSON string with an empty body `{}`, but Spark can read it successfully. **Steps/Code to reproduce bug** There are two...
close https://github.com/NVIDIA/spark-rapids/issues/10968 This PR adds the `MinBy` support on GPU. The GPU `MinBy` may produce different results than that of CPU when multiple rows in the ordering column have the...
Contribute to https://github.com/NVIDIA/spark-rapids/issues/10790 Fix https://github.com/NVIDIA/spark-rapids/issues/10841 This PR is trying to accelerate the normal shuffle path by partitioning and slicing tables on GPU. The sliced table is already serializable so can...
**Describe the bug** PR https://github.com/NVIDIA/spark-rapids/pull/10912 introduces the parquet support for `GpuInsertIntoHiveTable`, along with the relevant tests. In some of the tests on Databricks, the `ProjectExec` will fall back to CPU...
close https://github.com/NVIDIA/spark-rapids/issues/11646 `curXORShiftRandomSeed` is marked as `transient`, so it will be null on executors without retry-restore context, leading to this NPE. This fix removes the `transient` modifier from `curXORShiftRandomSeed`, `seed`...
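The failure mode described above can be illustrated with a generic JVM sketch. This is not the spark-rapids code: the class `RandLike` and its fields `curSeed` and `keptSeed` are hypothetical stand-ins, and plain Java serialization is assumed as the transport. It only shows why a `transient` reference field comes back as `null` after an object is serialized and deserialized, while a non-transient field keeps its value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientFieldDemo {
    // Hypothetical stand-in for an object that is serialized on one JVM
    // and deserialized on another (for example, driver to executor).
    static class RandLike implements Serializable {
        private static final long serialVersionUID = 1L;
        transient Long curSeed; // skipped by Java serialization
        Long keptSeed;          // serialized normally

        RandLike(long seed) {
            this.curSeed = seed;
            this.keptSeed = seed;
        }
    }

    // Serialize the object to bytes and read it back, as a remote JVM would.
    static RandLike roundTrip(RandLike in) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(in);
        }
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (RandLike) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        RandLike copy = roundTrip(new RandLike(42L));
        // The transient field was not written, so it is null after the round trip;
        // unboxing or dereferencing it would throw a NullPointerException.
        System.out.println("transient field:     " + copy.curSeed);
        System.out.println("non-transient field: " + copy.keptSeed);
    }
}
```

Dropping `transient` is one way to avoid the null. Another common option is to re-initialize the field lazily on first use after deserialization. Which of these fits depends on whether the value must survive the trip.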