Yubo Wang
Let me take another look at this issue.
@In0ut you can remove some claims from your ID token to reduce its size. In my case, I removed groups from the ID token. @davidmirror-ops yes, this is consistent
I am currently working on a fix in our internal fork but would like to hear what @wild-endeavor thinks of the solution. Essentially, we are splitting up the access tokens into...
Hey guys, the cause of this issue is that the hash function in "github.com/gorilla/securecookie" generates a hashed token that exceeds the browsers' cookie size limit of 4096 bytes, which means...
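To make the size problem described above concrete, here is a minimal Python sketch. It is not the code from the fork or from any PR. It only assumes that the token is base64-encoded before being written to the cookie, and it shows one hypothetical way to spread an oversized value across several cookies. The names `encoded_size`, `split_into_cookies`, `join_cookies`, and the 3800-character chunk size are illustrative assumptions.

```python
import base64

# Per-cookie size limit cited in the comment above.
COOKIE_LIMIT = 4096


def encoded_size(raw: bytes) -> int:
    """Length of the value after URL-safe base64 encoding (about 4/3 of the input)."""
    return len(base64.urlsafe_b64encode(raw))


def split_into_cookies(name: str, value: str, chunk_size: int = 3800) -> dict[str, str]:
    """Split one oversized value into numbered cookies (hypothetical naming scheme)."""
    chunks = [value[i:i + chunk_size] for i in range(0, len(value), chunk_size)]
    return {f"{name}_{i}": chunk for i, chunk in enumerate(chunks)}


def join_cookies(name: str, cookies: dict[str, str]) -> str:
    """Reassemble the original value by reading the numbered cookies in order."""
    parts = []
    index = 0
    while f"{name}_{index}" in cookies:
        parts.append(cookies[f"{name}_{index}"])
        index += 1
    return "".join(parts)


if __name__ == "__main__":
    raw_token = b"x" * 3500  # fits under the limit before encoding
    print(encoded_size(raw_token) > COOKIE_LIMIT)  # encoding alone pushes it over

    value = base64.urlsafe_b64encode(raw_token).decode()
    cookies = split_into_cookies("token", value)
    print(all(len(v) <= COOKIE_LIMIT for v in cookies.values()))
    print(join_cookies("token", cookies) == value)
```

The chunk size is kept below 4096 to leave room for the cookie name and attributes, since browsers typically count those toward the limit as well.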
Sure, let me see if I can create a PR in a day or two. You can copy the code from the PR and apply it locally. Should not be too...
@gdabisias sorry for responding a little late due to work. Please refer to https://github.com/flyteorg/flyte/pull/4863 for a sample implementation
I added the following additional monkey patch for Jamba. ```python from transformers.models.jamba import modeling_jamba if rms_norm: # https://github.com/huggingface/transformers/blob/v4.44.2/src/transformers/models/gemma/modeling_gemma.py#L109 modeling_jamba.JambaRMSNorm = LigerRMSNorm if cross_entropy: modeling_jamba.CrossEntropyLoss = LigerCrossEntropyLoss if swiglu: modeling_jamba.JambaMLP =...
Hi @winglian, I created a PR against the main branch of your fork. Do you want to merge it first and then update this PR to be based on that? https://github.com/winglian/Liger-Kernel/pull/1 Or I...
`pip install . '[dev]'` fails for this PR after adding [mamba-ssm](https://github.com/state-spaces/mamba) to the dependencies. The reason is that mamba-ssm has a bug in its setup.py that makes it not PEP 517...
Latest testing result: ``` python3 benchmark/mtbench/bench_sglang_eagle.py --num-questions 80 --parallel 10 --question-file /shared/user/mtbench/question.jsonl ``` **without CUDA graph** page_size = 4 #questions: 80, Throughput: 1168.43 token/s, Acceptance length: 2.66 page_size = 1...