Shira Guskin

Results: 5 issues by Shira Guskin

Hello, should I expect a high F1 score when training only the first step (intermediate-layer distillation) on SQuAD1.1? Thanks

Hello, could you please elaborate on the data-augmentation procedure you used for the SQuAD1.1 task? Thank you, Shira

Below are my results when running the speculative-sampling notebook.

**Device:** GPU

**Models and drafts:**

**Phi-3 pair:**
draft_model_id = "OpenVINO/Phi-3-mini-FastDraft-50M-int8-ov"
target_model_id = "OpenVINO/Phi-3-mini-4k-instruct-int4-ov"

**Llama 3.1 pair:**
draft_model_id = "OpenVINO/Llama-3.1-8B-Instruct-FastDraft-150M-int8-ov"
target_model_id = "fakezeta/Meta-Llama-3.1-8B-Instruct-ov-int4"

**Results:**...

PSE
category: GPU

We want to use speculative decoding where one model runs on the XPU and another, significantly smaller model (the draft) runs on the CPU. We installed the XPU build and ran the script...
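For context on what the draft/target split is computing, here is a minimal, device-agnostic sketch of greedy speculative decoding with toy stand-in "models" (the function names `target_next`, `draft_next`, and `speculative_decode` are hypothetical, not from IPEX or the notebook above): the cheap draft proposes `k` tokens, the target verifies them, and the longest agreeing prefix is kept. With greedy (exact-match) verification the output is identical to decoding with the target alone.

```python
import random

# Toy "models" over an integer token vocabulary of size 100.
# The target defines the ground-truth greedy next token; the draft is a
# cheaper approximation that agrees with it most of the time.
def target_next(ctx):
    return (sum(ctx) * 31 + 7) % 100

def draft_next(ctx):
    t = target_next(ctx)
    return t if random.random() < 0.8 else (t + 1) % 100  # ~20% disagreement

def speculative_decode(prompt, num_tokens, k=4):
    """Greedy speculative decoding: draft proposes k tokens per round,
    the target accepts the longest matching prefix and corrects the
    first mismatch (or emits one bonus token if all k are accepted)."""
    out = list(prompt)
    while len(out) - len(prompt) < num_tokens:
        # 1. Draft proposes k tokens autoregressively (cheap model).
        ctx, proposal = list(out), []
        for _ in range(k):
            t = draft_next(ctx)
            proposal.append(t)
            ctx.append(t)
        # 2. Target verifies the proposal position by position.
        for tok in proposal:
            expected = target_next(out)
            if tok == expected:
                out.append(tok)        # accepted draft token
            else:
                out.append(expected)   # correct the mismatch, stop round
                break
        else:
            out.append(target_next(out))  # all accepted: one bonus token
    return out[len(prompt):][:num_tokens]

if __name__ == "__main__":
    random.seed(0)
    print(speculative_decode([1, 2, 3], 8))
```

The speedup in the real setting comes from step 2 verifying all `k` proposed tokens in a single batched forward pass on the fast device, while step 1 runs the small draft on the CPU.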

CPU
XPU/GPU
Functionality

### Describe the issue

I tried the example in https://github.com/intel/intel-extension-for-pytorch/tree/release/xpu/2.5.10/examples/gpu/llm/inference#learn-to-quantize-llm-and-save-quantized-model-then-run-inference-with-quantized-model, using the `microsoft/Phi-3-mini-4k-instruct` model. It fails with:

```
File "C:\Users\sdp\.cache\huggingface\modules\transformers_modules\0a67737cc96d2554230f90338b163bc6380a2a85\modeling_phi3.py", line 1305, in prepare_inputs_for_generation
    elif past_length < input_ids.shape[1]:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError:...
```
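The traceback above is truncated before the TypeError message, so the exact cause is unknown; one common way that specific comparison fails is `past_length` being `None` rather than an int when the KV cache has not been populated. A minimal sketch of that failure mode (purely an illustration, not a confirmed diagnosis of this issue):

```python
# Hypothetical repro: comparing an uninitialized (None) cache length
# against a tensor dimension raises TypeError in Python 3.
past_length = None   # e.g., cache length never set before generation
input_len = 5        # stands in for input_ids.shape[1]

err = None
try:
    past_length < input_len
except TypeError as e:
    err = e

print(type(err).__name__)  # TypeError
```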