Maosheng Liao issues

Results: 10 issues of Maosheng Liao

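To get your problem resolved quickly, before opening an issue please first search for similar problems in the following places: [past issues](https://github.com/PaddlePaddle/Paddle-Lite/issues), [FAQ](https://paddle-lite.readthedocs.io/zh/develop/quick_start/faq.html), [official documentation](https://paddle-lite.readthedocs.io/zh/develop/guide/introduction.html). When opening an issue, to help resolve the problem quickly, please provide the following information as applicable: - Title: a concise, precise description of your problem, e.g. "ssd model conversion error" - Version and environment information: 1) Paddle Lite version: v2.11 2) Host environment: macOS Monterey - Model information: 1) Model name: [3x3s2_dw.onnx.zip](https://github.com/PaddlePaddle/Paddle-Lite/files/9598723/3x3s2_dw.onnx.zip) Reproduction: ``` from x2paddle.convert...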

The test website is down -.-

[Q&A] Why can DeepSpeed Ulysses support long sequence lengths?

Sorry to post the question here. According to the paper, after `all_to_all`, every device will hold `1/P` of the heads, and then it will be sent to perform local attention...

enhancement

Why is there no memory access error here?

https://github.com/Dao-AILab/flash-attention/blob/3669b25206d5938e3cc74a5f7860e31c38af8204/csrc/flash_attn/flash_api.cpp#L314-L319 For example, `out_accum` could be freed once this code block/function exits, yet we still use the pointer `oaccum_ptr`. Is this valid?

[Feature]: Could you please publish some docs like the CUDA programming guide?

### Suggestion Description I am going mad trying to find tutorials/docs for programming on AMD devices. I COULDN'T find any docs like: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html This really makes users mad when they want...

documentation

[HIPIFY][feature] Support for `fp8` data types

### Problem Description For example, when I hipify a .cu file that uses an fp8 data type, after running the `hipify-clang` command, the fp8 data type isn't converted into the HIP fp8 type. For...

feature

[QST] Is it possible for a kernel's perf to differ between a unit test and a real model?

I encountered a problem when using the int8 GEMM CUTLASS kernel: https://github.com/NVIDIA/TensorRT-LLM/issues/2351 For shape [16,6144,4096], I got a perf of `14us` in my unit test benchmark, but in real models, I got `25us`....

question

? - Needs Triage

inactive-30d

inactive-90d

Enable flashinfer for dsv2.

## Motivation Enable flashinfer backend for deepseekv2. ## Modifications Only one line. ## Checklist - [x] Format your code according to the [Code Formatting with Pre-Commit](https://docs.sglang.ai/references/contribution_guide.html#code-formatting-with-pre-commit). - [x] Add unit...


A question for everyone: how is this press-time coefficient determined?

Can this library be called on Android?