Shira Guskin

Results: 10 comments of Shira Guskin

It happened to me when I had out-of-vocabulary words that were assigned a -1 value; it also happens when you set the vocab size to a smaller value than the...
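
A minimal sketch of why both cases fail (plain Python with a toy lookup table; the `embedding_lookup` helper is hypothetical, not from any library): an id of -1 used as an OOV marker, or an id beyond a too-small vocab size, is outside the embedding table's valid index range.

```python
# Illustrative only: a toy embedding table showing why a -1 (OOV marker)
# or an id >= vocab_size blows up at lookup time.
def embedding_lookup(table, token_id):
    # Reject ids outside [0, vocab_size). In Python, table[-1] would
    # "work" silently but return the wrong row, so check explicitly.
    if not 0 <= token_id < len(table):
        raise IndexError(f"token id {token_id} outside vocab of size {len(table)}")
    return table[token_id]

vocab_size = 4
table = [[float(i)] * 2 for i in range(vocab_size)]  # 4 rows, dim 2

print(embedding_lookup(table, 2))      # valid id -> [2.0, 2.0]
try:
    embedding_lookup(table, -1)        # OOV words mapped to -1
except IndexError as e:
    print("OOV id rejected:", e)
try:
    embedding_lookup(table, 5)         # id from a larger tokenizer vocab
except IndexError as e:
    print("vocab-size too small:", e)
```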

TinyBERT. I noticed the network is very sensitive to initialization. If I change seeds, I can end up with very poor F1 results when training the intermediate-layer distillation.

I mean: what is the expected score for **intermediate-layer** distillation (TinyBERT6, SQuAD1.1; currently I'm getting F1 ~ 30, is that OK?), and should I continue with **best** or **last**...
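
The seed sensitivity described above can be made concrete with a minimal, stdlib-only sketch (the `init_weights` helper is hypothetical): the same seed reproduces an initialization exactly, while a different seed yields different starting weights, which is one reason distillation runs can land at very different F1 scores.

```python
import random

# Illustrative only: two seeds give two different random initializations,
# which is why results can vary run to run when training is seed-sensitive.
def init_weights(seed, n=5):
    rng = random.Random(seed)          # isolated RNG, seeded explicitly
    return [rng.uniform(-0.1, 0.1) for _ in range(n)]

w_a = init_weights(42)
w_b = init_weights(1234)
assert init_weights(42) == w_a         # same seed -> reproducible init
assert w_a != w_b                      # different seed -> different init
```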

I'm running on Windows 11, Intel(R) Core(TM) Ultra 5. transformers version: 4.44.2. GPU: Intel(R) Arc(TM) 130V GPU (16GB), driver version: 32.0.101.6325. I set up a fresh conda environment with...

Hi, thank you for the reference. I used the following script: https://github.com/intel/intel-extension-for-pytorch/tree/release/xpu/2.5.10/examples/gpu/llm/inference#learn-to-quantize-llm-and-save-quantized-model-then-run-inference-with-quantized-model, and on line 10 it sets `use_hf_code = True`. This is the cause of my failure when using...

Which logs do you need? The following is the error I get when setting `use_hf_code = True`:
```
File "C:\Users\sdp\.cache\huggingface\modules\transformers_modules\0a67737cc96d2554230f90338b163bc6380a2a85\modeling_phi3.py", line 1305, in prepare_inputs_for_generation
    elif past_length < input_ids.shape[1]:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError:...
```
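
A `TypeError` on that comparison is the classic Python 3 failure mode of comparing `None` (or another non-int) against an int. A minimal sketch of the guard that avoids it, assuming `past_length` can be `None` when there is no cache yet (the `should_trim` helper is illustrative, not the actual `modeling_phi3.py` fix):

```python
def should_trim(past_length, seq_len):
    # Guard first: `None < int` raises TypeError in Python 3, which is
    # the failure mode seen in prepare_inputs_for_generation above.
    if past_length is None:
        return False                   # no cache yet -> nothing to trim
    return past_length < seq_len

try:
    None < 7                           # reproduces the unguarded TypeError
except TypeError as e:
    print("unguarded comparison fails:", e)

print(should_trim(None, 7))            # guarded path -> False
print(should_trim(3, 7))               # normal path  -> True
```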

I'm not sure there should be such slowness with FastDraft as well. To reproduce the results I ran the [speculative-sampling notebook](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/speculative-sampling/speculative-sampling.ipynb), setting device = GPU and setting the prompt to...
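
For context on what the notebook benchmarks, here is a toy, greedy speculative-decoding loop in plain Python (all model functions are hypothetical stand-ins, not the OpenVINO implementation): the draft model proposes `k` tokens cheaply, the target model verifies them, and the longest agreeing prefix is kept, with the target's own token substituted at the first disagreement.

```python
# Illustrative only: toy greedy speculative decoding with deterministic
# stand-in "models". Tokens are ints; real systems compare distributions.
def draft_next(ctx):
    return (ctx[-1] + 1) % 10          # cheap draft: "count upward"

def target_next(ctx):
    # Target agrees with the draft except right after token 4.
    return (ctx[-1] + 1) % 10 if ctx[-1] != 4 else 0

def speculative_step(ctx, k=4):
    proposed, tmp = [], list(ctx)
    for _ in range(k):                 # draft k tokens autoregressively
        t = draft_next(tmp)
        proposed.append(t)
        tmp.append(t)
    accepted, tmp = [], list(ctx)
    for t in proposed:                 # target verifies each draft token
        if target_next(tmp) == t:
            accepted.append(t)         # agreement: keep the draft token
            tmp.append(t)
        else:
            accepted.append(target_next(tmp))  # substitute target's token
            break
    return ctx + accepted

print(speculative_step([0]))   # full agreement -> all 4 drafts accepted
print(speculative_step([3]))   # disagreement after 4 -> early stop
```

The speedup comes from accepting several draft tokens per single target-model verification pass; when draft and target disagree often, acceptance drops and the overhead can dominate.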

I use the master branch, on an LNL system: Windows 11, Intel(R) Core(TM) Ultra 5 238V 2.10GHz, GPU: Intel(R) Arc(TM) 130V GPU, gfx-driver-ci-master-17368 DCH RI (16GB). It might reproduce...