Results 6 issues of Fan Hong

Hi, I found that the program gets stuck when calling `System.load` through Py4J on Windows. A simplified code snippet that reproduces the problem is as follows: ``` from py4j.java_gateway import launch_gateway, JavaGateway,...

Hi, I found that creating a Java array of callbacks is impossible. It seems that `_set_item` in class `JavaArray` calls `get_command_part` without providing a `python_proxy_pool`, so `get_command_part` then fails. The error stack is...

The automatic type conversion in py4j for primitive types is very convenient when calling methods. However, when a Java method takes arrays as parameters, I have to manually create the...
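For context, the manual step described above usually means allocating a Java array through the gateway and filling it element by element. The sketch below is only an illustration: `to_java_array` is a hypothetical helper name, not part of py4j, and it assumes a connected `JavaGateway` whose `new_array` method is used for allocation.

```python
def to_java_array(gateway, java_class, values):
    """Copy a Python sequence into a newly allocated Java array.

    `gateway` is assumed to be a connected py4j JavaGateway and
    `java_class` an element type such as `gateway.jvm.int`.
    This helper is hypothetical and not part of py4j itself.
    """
    values = list(values)
    # Allocate a one-dimensional Java array of the requested type.
    array = gateway.new_array(java_class, len(values))
    # Fill the array one element at a time.
    for index, value in enumerate(values):
        array[index] = value
    return array
```

With a running gateway, a call such as `to_java_array(gateway, gateway.jvm.int, [1, 2, 3])` would yield an `int[]` that can be passed to a Java method expecting an array.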

## What is the purpose of the change

Add Transformer and Estimator for GBTClassifier and GBTRegressor. Details about features compared to SparkML's implementation are as follows:

- Implemented in this...

# ❓ Questions and Help

I just found that the fused kernels in sequence parallel perform poorly in real model training. Here is a snapshot of the nsys timeline of a ColumnParallelLinear...

# 🚀 Feature

When `seqlen` is not fixed, `triton.autotune` is triggered for every unseen value of `seqlen` for the tiled matmul kernel and the matmul kernel. This makes training extremely slow...
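The issue text is truncated here, so the fix it proposes is not shown. One common way to limit re-tuning, given as an assumption rather than as the issue's own proposal, is to key the autotuner on a bucketed sequence length instead of the exact value. `bucket_seqlen` and `min_bucket` below are hypothetical names used only for illustration.

```python
def bucket_seqlen(seqlen: int, min_bucket: int = 16) -> int:
    """Round `seqlen` up to the next power-of-two multiple of `min_bucket`.

    Hypothetical helper: many distinct sequence lengths map to the same
    bucket, so a cache keyed on the bucket sees far fewer unseen values.
    """
    if seqlen <= 0:
        raise ValueError("seqlen must be positive")
    bucket = min_bucket
    # Double the bucket until it is large enough to cover seqlen.
    while bucket < seqlen:
        bucket *= 2
    return bucket
```

The bucketed value could then be passed to the kernel as a separate argument and named in the `key` list of `triton.autotune`, so that tuning reruns only when the bucket changes. Whether the configuration tuned for one bucket stays close to optimal for every length inside it would need to be measured.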