pass-lin issues

Results: 19 issues by pass-lin

About Multi-Backend Implementation of Gradient Checkpointing question

I tried to implement multi-backend gradient checkpointing in https://github.com/pass-lin/bert4keras3, but I ran into some problems. For example, when I implement it in the tf backend:

```python
class ScaleOffset(Layer):
    def __init__(
        self,
        scale=True,...
```

type:Bug

Why does the sequence length of vectors affect the calculation results of dense under bf16?

import os
os.environ['KERAS_BACKEND'] = 'torch'
os.environ['OPS_KERNAL'] = '1'
import keras
keras.config.set_floatx('bfloat16')
from keras import ops
import numpy as np
initial_dim = 2048
finally_dim = 64
z = ops.convert_to_tensor(np.random.random([1,36,initial_dim]))
dense =...

stat:awaiting keras-eng

type:Bug
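The truncated preview above does not show the cause of the bf16 discrepancy. As a general illustration only, and as an assumption rather than a confirmed diagnosis of this issue, the sketch below shows that sums rounded to bfloat16 depend on accumulation order, so a backend that picks a different reduction order for a different input shape could in principle return different values. The helpers `to_bf16` and `sum_bf16` are hypothetical names, and bfloat16 is simulated in NumPy by rounding float32 bit patterns.

```python
import numpy as np


def to_bf16(x):
    """Round float32 values to the nearest bfloat16 (ties to even).

    The result is returned as float32 so plain NumPy can keep using it.
    Hypothetical helper; NaN/inf handling is intentionally omitted.
    """
    x = np.asarray(x, dtype=np.float32)
    bits = x.view(np.uint32)
    # bfloat16 keeps the top 16 bits of a float32; round the lower 16 away.
    lsb = (bits >> np.uint32(16)) & np.uint32(1)
    rounded = (bits + lsb + np.uint32(0x7FFF)) & np.uint32(0xFFFF0000)
    return rounded.view(np.float32)


def sum_bf16(values):
    """Sum left to right, rounding the running total to bfloat16 each step."""
    total = np.zeros(1, dtype=np.float32)
    for v in values:
        total = to_bf16(total + np.float32(v))
    return float(total[0])


values = [256.0, 1.0, 1.0, 1.0, 1.0]
print(sum_bf16(values))        # 256.0: each +1 is rounded away next to 256
print(sum_bf16(values[::-1]))  # 260.0: the small terms add up before 256 arrives
```

The exact sum is 260 in both cases; only the order of the roundings differs.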

Add qwen2 support

The Qwen2 model is the state of the art on the Hugging Face leaderboard. Compared with the Llama model, it only adds a bias to the QKV dense layers of the multi-head attention. Therefore, only...

type:feature

help wanted

stat:contributions welcome
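The difference described in the Qwen2 request above can be sketched in plain NumPy. This is a minimal illustration under assumptions: `qkv_projection` and its parameter names are hypothetical and are not the Keras or bert4keras3 API. Passing no biases corresponds to the Llama-style projection described in the issue, and passing biases corresponds to the Qwen2-style one.

```python
import numpy as np


def qkv_projection(x, w_q, w_k, w_v, b_q=None, b_k=None, b_v=None):
    """Project hidden states to queries, keys, and values.

    Hypothetical helper for illustration. With all biases left as None this
    is a bias-free projection; supplying them adds the extra QKV bias.
    """
    q = x @ w_q
    k = x @ w_k
    v = x @ w_v
    if b_q is not None:
        q = q + b_q
    if b_k is not None:
        k = k + b_k
    if b_v is not None:
        v = v + b_v
    return q, k, v


# Toy shapes: 2 tokens with hidden size 4, identity weights for readability.
x = np.ones((2, 4))
w = np.eye(4)
bias = np.arange(4.0)

q_no_bias, _, _ = qkv_projection(x, w, w, w)
q_bias, _, _ = qkv_projection(x, w, w, w, b_q=bias, b_k=bias, b_v=bias)
print(q_bias - q_no_bias)  # each row equals the bias vector
```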

Any plans for more Llama type models?

Is there a plan to support more Llama-like models that need only simple modifications, or even no modifications, to the existing Llama backbone to achieve compatibility? For example...

type:feature

Discovered a magic bug in keras, which is caused by incoming data

Here is some example code; bert4keras3 is my own LLM library, which can be installed through pip:

```python
import json
config = {
    "type_vocab_size": 2,
    "use_bias": 0,
    "o_bias": 0,...
```

type:Bug

backend:tensorflow

backend:jax

Videoswin cannot load from preset

I followed the instructions in the documentation at https://keras.io/api/keras_cv/models/tasks/deeplab_v3_segmentation/ and encountered an error when I ran the following code:

```python
from keras_cv.models import DeepLabV3Plus
model = DeepLabV3Plus.from_preset("videoswin_tiny_kinetics400")
```

The error is...

type:Bug

type:docs

stat:awaiting response from contributor

stale

Special optimization for torch backend

PyTorch provides specialized cuDNN operators for many common implementations, such as F.group_norm. In the keras-team's past speed benchmarks, we can also see that the torch backend of...

stat:contributions welcome

type:feature


When will the complete dataset be released?

The mawps dataset can be downloaded from the qwen math repository

Documentation update guide required
