Alexandre Marques

Results 12 issues of Alexandre Marques

This PR creates a base class for knowledge distillation modifiers and changes the existing knowledge distillation modifier to inherit from the base class. The goal of this change is to...

Created NMTrainingArguments, which inherits from HF's TrainingArguments. This class allows one to add arguments to the training script and handle potential conflicts with other arguments. In particular, added "total_train_batch_size"...

Add ultrachat200k for perplexity eval

Replace quant_min/max values with INT8/UINT8 ranges so ONNX export is supported for other bit widths (e.g., 4).
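To make the ranges involved concrete, here is a minimal sketch of the integer range for a given bit width. It assumes the export uses an 8-bit integer container, which is only one reading of the description above. The helper names `int_range` and `fits_in_int8` are hypothetical and are not taken from the PR.

```python
from typing import Tuple


def int_range(num_bits: int, signed: bool = True) -> Tuple[int, int]:
    """Return (min, max) of the integer grid for the given bit width."""
    if signed:
        return -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    return 0, 2 ** num_bits - 1


def fits_in_int8(num_bits: int, signed: bool = True) -> bool:
    """Check that every value of the narrower grid fits in the 8-bit container."""
    lo, hi = int_range(num_bits, signed)
    container_lo, container_hi = int_range(8, signed)
    return container_lo <= lo and hi <= container_hi


if __name__ == "__main__":
    print(int_range(4))                # signed 4-bit grid
    print(int_range(8, signed=False))  # UINT8 container range
    print(fits_in_int8(4))
```

Because the 4-bit grid is a subset of the 8-bit one, values quantized to 4 bits can be stored in an INT8 or UINT8 tensor without clipping.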

This PR adds a transformation that quantizes weights for weights-only quantization. It was tested on a Llama2 model.

This PR adds support for per-token dynamic quantization. Quantization scales and zero points are computed "on-the-fly" for each new tensor. Each token has its own quantization scale and zero-point (one...
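The idea of per-token scales and zero points can be sketched in plain Python. This is an illustrative sketch only, assuming asymmetric quantization to an unsigned 8-bit grid with min/max calibration per token; the function names are hypothetical and the PR's actual implementation may differ.

```python
from typing import List, Tuple


def quantize_per_token(
    tokens: List[List[float]], num_bits: int = 8
) -> Tuple[List[List[int]], List[float], List[int]]:
    """Quantize each token (row) with its own scale and zero point."""
    qmin, qmax = 0, 2 ** num_bits - 1
    quantized, scales, zero_points = [], [], []
    for row in tokens:
        # Include 0.0 in the range so that zero is exactly representable.
        lo = min(min(row), 0.0)
        hi = max(max(row), 0.0)
        # Fall back to 1.0 for an all-zero token to avoid dividing by zero.
        scale = (hi - lo) / (qmax - qmin) or 1.0
        zero_point = max(qmin, min(qmax, int(round(qmin - lo / scale))))
        q_row = [
            max(qmin, min(qmax, int(round(x / scale)) + zero_point))
            for x in row
        ]
        quantized.append(q_row)
        scales.append(scale)
        zero_points.append(zero_point)
    return quantized, scales, zero_points


def dequantize_per_token(
    quantized: List[List[int]], scales: List[float], zero_points: List[int]
) -> List[List[float]]:
    """Map integers back to floats using each token's own parameters."""
    return [
        [(q - zp) * s for q in row]
        for row, s, zp in zip(quantized, scales, zero_points)
    ]


if __name__ == "__main__":
    activations = [[-1.0, 0.0, 1.0], [0.0, 10.0, 20.0]]
    q, s, zp = quantize_per_token(activations)
    print(s, zp)
    print(dequantize_per_token(q, s, zp))
```

Since the parameters are recomputed from each incoming tensor, no calibration statistics need to be stored ahead of time, at the cost of extra work at inference.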

The file_path was being joined to the file names twice, leading to wrong paths (lines 663 and 683). This PR removes the duplicate join.
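The failure mode can be reproduced with a small sketch. The function names and the directory layout below are hypothetical; only the double-join pattern is taken from the description above.

```python
import os
from typing import List


def list_paths_buggy(file_path: str, names: List[str]) -> List[str]:
    """Joins the directory twice, as in the bug described above."""
    joined = [os.path.join(file_path, name) for name in names]
    # Second join: prepends the directory again for relative paths.
    return [os.path.join(file_path, path) for path in joined]


def list_paths_fixed(file_path: str, names: List[str]) -> List[str]:
    """Joins the directory exactly once."""
    return [os.path.join(file_path, name) for name in names]


if __name__ == "__main__":
    print(list_paths_buggy("data", ["a.txt"]))
    print(list_paths_fixed("data", ["a.txt"]))
```

Note that `os.path.join` discards earlier components when a later one is absolute, so a bug of this kind may only surface when `file_path` is relative.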

Define separate AddInput classes for the different branches of the bottleneck. This allows control of quantization via module class in addition to name.

Ultrachat200k has two training splits, one for SFT and another for DPO. As a result, it doesn't have a "train" split per se. This PR allows for a train_sft...

There's no need for a period between the question and the line break, since the question will contain its own punctuation (normally a question mark). The period also doesn't match the...