Mycat

36 comments of Mycat

The product of ten years spent honing a single sword is naturally in a class of its own.

Latest (nightly) torch 2.0 gives the same error, but `--per_device_train_batch_size 2 --gradient_accumulation_steps 1` works; with `--per_device_train_batch_size` set to 3, "an illegal memory access was encountered" #82

RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging...
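As the truncated message hints, CUDA errors surface asynchronously, so the reported stack trace can point at the wrong op. A usual first debugging step (my suggestion, not something from the original thread) is to rerun with synchronous kernel launches so the trace lands on the failing call:

```shell
# Force synchronous CUDA kernel launches so the Python traceback points at
# the actual failing op. The script path and flag value are illustrative,
# matching the batch-size-3 repro described above.
CUDA_LAUNCH_BLOCKING=1 python qlora.py --per_device_train_batch_size 3
```

This makes training slower, so it is only for reproducing the crash, not for real runs.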

When I remove the following code the error goes away, but I don't know whether that's correct:

```python
# Tokenizer
tokenizer = AutoTokenizer.from_pretrained(
    args.model_name_or_path,
    cache_dir=args.cache_dir,
    padding_side="right",
    use_fast=True,
)
# if tokenizer.pad_token is None:...
```

```
TypeError: pad_sequence(): argument 'padding_value' (position 3) must be float, not NoneType

/wzh/qlora/qlora.py:417 in __call__
  414 │ │ │ else:
  415 │ │...
```
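The `TypeError` above is downstream of an unset pad token: `tokenizer.pad_token_id` resolves to `None`, and that `None` flows into `pad_sequence` as `padding_value`. A minimal pure-Python sketch of that failure mode (`TinyTokenizer` is a hypothetical stand-in that only mimics the relevant Hugging Face behaviour, not the real class):

```python
# Hypothetical stand-in showing how padding_value ends up as None:
# pad_token was never configured, so pad_token_id resolves to None, and
# pad_sequence(..., padding_value=None) then raises the TypeError above.
class TinyTokenizer:
    def __init__(self):
        self.vocab = {"<unk>": 0, "hello": 1}
        self.pad_token = None  # never set -> the bug

    @property
    def pad_token_id(self):
        # Mirrors HF behaviour: an unset pad_token yields a None id.
        return self.vocab.get(self.pad_token)

tok = TinyTokenizer()
print(tok.pad_token_id)  # None -> invalid as pad_sequence's padding_value
```

So the fix is not to delete the tokenizer setup but to make sure `pad_token` is actually set before the collator runs.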

If I change the code to the following:

```python
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.unk_token
    # smart_tokenizer_and_embedding_resize(
    #     special_tokens_dict=dict(pad_token=DEFAULT_PAD_TOKEN),
    #     tokenizer=tokenizer,
    #     model=model,
    # )
```

then I get "maximum recursion depth exceeded while getting...

Changing the code to the following:

```python
if tokenizer.pad_token is None:
    # tokenizer.pad_token = tokenizer.unk_token
    tokenizer.add_special_tokens(dict(pad_token=DEFAULT_PAD_TOKEN))
    # smart_tokenizer_and_embedding_resize(
    #     special_tokens_dict=dict(pad_token=DEFAULT_PAD_TOKEN),
    #     tokenizer=tokenizer,
    #     model=model,
    # )
```

gives the error:

```
../aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [68,0,0], thread: [26,0,0] Assertion...
```

The code already has the [PAD]-adding logic, as follows:

```python
if tokenizer.pad_token is None:
    smart_tokenizer_and_embedding_resize(
        special_tokens_dict=dict(pad_token=DEFAULT_PAD_TOKEN),
        tokenizer=tokenizer,
        model=model,
    )
```

and inside `smart_tokenizer_and_embedding_resize`:

```python
print("token ..." + str(special_tokens_dict))
num_new_tokens = tokenizer.add_special_tokens(special_tokens_dict)
model.resize_token_embeddings(len(tokenizer))
```

Runtime output:

```
loaded model
Using pad_token,...
```
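The `indexSelectLargeIndex` assertion in the earlier comment and the resize call here point at one invariant: adding a special token grows the vocabulary, so the embedding table must be resized to match, or the new token's id indexes past the last embedding row. A pure-Python sketch of that invariant (`TinyTokenizer` and `TinyModel` are hypothetical stand-ins, not the Hugging Face classes):

```python
# Hypothetical stand-ins showing why add_special_tokens must be paired with
# resize_token_embeddings: a token id >= the number of embedding rows is
# exactly the out-of-bounds lookup the CUDA indexing assertion reports.
class TinyTokenizer:
    def __init__(self):
        self.vocab = {"<unk>": 0, "a": 1, "b": 2}

    def add_special_tokens(self, special_tokens_dict):
        added = 0
        for token in special_tokens_dict.values():
            if token not in self.vocab:
                self.vocab[token] = len(self.vocab)
                added += 1
        return added  # number of newly added tokens

    def __len__(self):
        return len(self.vocab)

class TinyModel:
    def __init__(self, vocab_size, dim=4):
        # One embedding row per vocabulary entry.
        self.embed = [[0.0] * dim for _ in range(vocab_size)]

    def resize_token_embeddings(self, new_size):
        dim = len(self.embed[0])
        while len(self.embed) < new_size:
            self.embed.append([0.0] * dim)

tokenizer = TinyTokenizer()
model = TinyModel(len(tokenizer))

num_new_tokens = tokenizer.add_special_tokens(dict(pad_token="[PAD]"))
# Without the next line, tokenizer.vocab["[PAD]"] == 3 while model.embed has
# only 3 rows (ids 0-2), so looking up [PAD] would be out of bounds.
model.resize_token_embeddings(len(tokenizer))
assert len(model.embed) == len(tokenizer)
```

That is why `add_special_tokens` alone (the variant in the earlier comment) triggers the CUDA assertion, while `smart_tokenizer_and_embedding_resize`, which also calls `resize_token_embeddings`, keeps the two in sync.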