Shira Guskin
Hello, should I expect a high F1 score when training only the first step (intermediate-layer distillation) on SQuAD1.1? Thanks
Hello, could you please elaborate on the data-augmentation procedure you used for the SQuAD1.1 task? Thank you, Shira
Below are my results when running the speculative-sampling notebook.

**Device:** GPU

**Models and drafts:**

**Phi-3 pair:**
draft_model_id = "OpenVINO/Phi-3-mini-FastDraft-50M-int8-ov"
target_model_id = "OpenVINO/Phi-3-mini-4k-instruct-int4-ov"

**Llama 3.1 pair:**
draft_model_id = "OpenVINO/Llama-3.1-8B-Instruct-FastDraft-150M-int8-ov"
target_model_id = "fakezeta/Meta-Llama-3.1-8B-Instruct-ov-int4"

**Results:** ...
We want to use speculative decoding where one model runs on the XPU and another, significantly smaller model runs on the CPU. We installed the XPU build and ran the script...
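For context, the accept/reject loop at the heart of speculative decoding can be sketched in plain Python with toy stand-in "models". This is a simplified, hypothetical illustration (the function names, toy vocabulary, and distributions are invented, and the residual-resampling step used on rejection in full implementations is omitted), not the OpenVINO GenAI or IPEX API:

```python
import random

# Toy sketch of speculative decoding: a cheap draft model proposes k tokens;
# the expensive target model verifies them, accepting each proposal with
# probability min(1, p_target(token) / p_draft(token)).

VOCAB = [0, 1, 2, 3]  # hypothetical 4-token vocabulary

def draft_probs(prefix):
    # Hypothetical small draft model: uniform over the vocabulary.
    return {t: 1.0 / len(VOCAB) for t in VOCAB}

def target_probs(prefix):
    # Hypothetical large target model: strongly prefers token 0.
    return {0: 0.7, 1: 0.1, 2: 0.1, 3: 0.1}

def speculative_step(prefix, k, rng):
    """Draft k tokens, then accept/reject them against the target model."""
    # Phase 1: the draft model proposes k tokens autoregressively.
    proposals = []
    p = list(prefix)
    for _ in range(k):
        probs = draft_probs(p)
        tok = rng.choices(list(probs), weights=list(probs.values()))[0]
        proposals.append(tok)
        p.append(tok)

    # Phase 2: the target model verifies the proposals in order.
    accepted = []
    p = list(prefix)
    for tok in proposals:
        q, t = draft_probs(p)[tok], target_probs(p)[tok]
        if rng.random() < min(1.0, t / q):
            accepted.append(tok)
            p.append(tok)
        else:
            break  # first rejection ends the speculative run
    return accepted

rng = random.Random(0)
print(speculative_step([42], k=4, rng=rng))
```

In a device-split setup like the one described above, `draft_probs` would correspond to the small CPU model and `target_probs` to the large XPU model; the speedup comes from the target model scoring all k proposals in a single batched forward pass.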
### Describe the issue
I tried the example in https://github.com/intel/intel-extension-for-pytorch/tree/release/xpu/2.5.10/examples/gpu/llm/inference#learn-to-quantize-llm-and-save-quantized-model-then-run-inference-with-quantized-model using the `microsoft/Phi-3-mini-4k-instruct` model. It fails with:
```
File "C:\Users\sdp\.cache\huggingface\modules\transformers_modules\0a67737cc96d2554230f90338b163bc6380a2a85\modeling_phi3.py", line 1305, in prepare_inputs_for_generation
    elif past_length < input_ids.shape[1]:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: ...
```