pass-lin
I tried to implement multi-backend gradient checkpointing in https://github.com/pass-lin/bert4keras3, but I encountered some problems, for example when implementing it in the tf backend ```python class ScaleOffset(Layer): def __init__( self, scale=True,...
import os os.environ['KERAS_BACKEND'] = 'torch' os.environ['OPS_KERNAL'] = '1' import keras keras.config.set_floatx('bfloat16') from keras import ops import numpy as np initial_dim = 2048 finally_dim = 64 z = ops.convert_to_tensor(np.random.random([1,36,initial_dim])) dense =...
The Qwen2 model is the SOTA on the HF leaderboard. Compared with the Llama model, it differs only by one extra bias in the QKV dense layers of MHA. Therefore, only...
Is there a plan to support more models similar to Llama, which need only simple modifications, or even none, to the existing Llama backbone to achieve compatibility? For example...
Here is some example code, where bert4keras3 is my own LLM library, which can be installed via pip ```python import json config = { "type_vocab_size": 2, "use_bias": 0, "o_bias": 0,...
I followed the instructions in the documentation at https://keras.io/api/keras_cv/models/tasks/deeplab_v3_segmentation/ and encountered an error when I ran the following code: ```python from keras_cv.models import DeepLabV3Plus model = DeepLabV3Plus.from_preset("videoswin_tiny_kinetics400") ``` The error is...
https://github.com/QwenLM/Qwen2.5-Math/tree/main/evaluation/data
Can keras-team add a guide on how to update the documentation? So far, I have contributed RoformerV2, the ModelScope download source, and the Muon optimizer to the keras and keras_hub projects. But...
PyTorch provides specialized cuDNN operators for many common implementations, such as F.group_norm. In keras-team's past speed benchmarks, we can also see that the torch backend of...