Yubo Wang comments

Results: 33 comments by Yubo Wang

Split access token into half and store to avoid "securecookie: the value is too long" error

Let me take another look at this issue.

[BUG] Error creating secure cookie, caused by: securecookie: the value is too long

@In0ut you can remove some claims from your ID token to reduce its size. In my case, I removed groups from the ID token. @davidmirror-ops yes, this is consistent
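As a rough illustration of why dropping a claim helps, the sketch below builds a hypothetical ID token payload and compares its base64url-encoded size with and without a `groups` claim. The claim names and values are made up for the example, and a real token also carries a header and a signature on top of the payload.

```python
import base64
import json


def encoded_size(claims: dict) -> int:
    """Return the length of the base64url-encoded JSON payload, as in a JWT."""
    raw = json.dumps(claims, separators=(",", ":")).encode()
    return len(base64.urlsafe_b64encode(raw).rstrip(b"="))


# Hypothetical claims; the large `groups` list stands in for a real directory.
claims = {
    "sub": "user-123",
    "email": "user@example.com",
    "groups": [f"team-{i:04d}" for i in range(300)],
}

# The same payload with the `groups` claim removed.
slim = {k: v for k, v in claims.items() if k != "groups"}

print(encoded_size(claims), encoded_size(slim))
```

The payload without `groups` is much smaller, which leaves more room under the cookie size limit once the token is wrapped in a cookie.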

[BUG] Error creating secure cookie, caused by: securecookie: the value is too long

I am currently working on a fix in our internal fork, but would like to hear what @wild-endeavor thinks about the solution. Essentially, we are splitting up the access tokens into...

[BUG] Error creating secure cookie, caused by: securecookie: the value is too long

Hey guys, the cause of this issue is that the hash function in "github.com/gorilla/securecookie" generates a hashed token that exceeds the browsers' cookie size limit of 4096, which means...
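To make the splitting idea concrete, here is a minimal sketch in Python (not the Go code used in Flyte, and not the code from the actual fix). The 4096 limit is the one mentioned above; the helper names `split_token` and `join_token` and the cookie names are hypothetical.

```python
MAX_COOKIE_BYTES = 4096  # browser cookie size limit mentioned in the comment above


def split_token(token: str) -> tuple[str, str]:
    """Split a token into two halves to be stored in two separate cookies."""
    mid = (len(token) + 1) // 2
    return token[:mid], token[mid:]


def join_token(first: str, second: str) -> str:
    """Rebuild the original token from the two stored halves."""
    return first + second


# Hypothetical oversized access token (for example, a JWT with many claims).
token = "a" * 3000 + "b" * 3000
first, second = split_token(token)

# Hypothetical cookie names, for illustration only.
cookies = {"access_token_1": first, "access_token_2": second}

# Each half fits under the limit, and joining them restores the token.
assert all(len(v.encode()) < MAX_COOKIE_BYTES for v in cookies.values())
assert join_token(cookies["access_token_1"], cookies["access_token_2"]) == token
```

Note that the encoded cookie is larger than the raw value it carries, so each half needs some headroom below the limit rather than sitting right at it.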

[BUG] Error creating secure cookie, caused by: securecookie: the value is too long

Sure, let me see if I can create a PR in a day or two. You can copy the code from the PR and apply it locally. Should not be too...

[BUG] Error creating secure cookie, caused by: securecookie: the value is too long

@gdabisias sorry for responding a little late due to work. Please refer to https://github.com/flyteorg/flyte/pull/4863 for a sample implementation.

jamba liger fused linear+xentropy

I added the following additional monkey patch for Jamba.

```python
from transformers.models.jamba import modeling_jamba

if rms_norm:
    # https://github.com/huggingface/transformers/blob/v4.44.2/src/transformers/models/gemma/modeling_gemma.py#L109
    modeling_jamba.JambaRMSNorm = LigerRMSNorm
if cross_entropy:
    modeling_jamba.CrossEntropyLoss = LigerCrossEntropyLoss
if swiglu:
    modeling_jamba.JambaMLP =...
```

jamba liger fused linear+xentropy

Hi @winglian, I created a PR against the main branch of your fork. Do you want to merge it first and then update this PR to be based on that? https://github.com/winglian/Liger-Kernel/pull/1 Or I...

Add support for jamba model with Liger Kernel

`pip install . '[dev]'` fails for this PR after adding [mamba-ssm](https://github.com/state-spaces/mamba) to the dependencies. The reason is that mamba-ssm has a bug in its setup.py that makes it not PEP 517...

Support FlashAttention3 page_size > 1 and topk > 1 case with paged attn and spec decode

Latest testing result:

```
python3 benchmark/mtbench/bench_sglang_eagle.py --num-questions 80 --parallel 10 --question-file /shared/user/mtbench/question.jsonl
```

**without CUDA graph**

page_size = 4
#questions: 80, Throughput: 1168.43 token/s, Acceptance length: 2.66

page_size = 1...