Yu (Brian) Yao
I tried to convert an LRCN caffemodel to a TensorFlow model, but got the following error: google.protobuf.text_format.ParseError: 260:3 : Message type "caffe.LayerParameter" has no field named "recurrent_param" Does it...
Hi, thanks for the great work! While reading the code, I noticed that you have used self-implemented versions of BLIP, BERT, etc., as opposed to directly importing the corresponding...
Thanks for the great work! I'm looking at the IPO loss and DPO losses here: ``` pi_logratios = policy_chosen_logps - policy_rejected_logps ref_logratios = reference_chosen_logps - reference_rejected_logps if reference_free: ref_logratios =...
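For readers following the question, here is a minimal sketch of how the two losses are commonly computed from the log-ratios quoted above, for a single preference pair. The function name `preference_loss`, the `beta` default, the `ipo` flag, and the handling of `reference_free` (dropping the reference term) are assumptions for illustration, not necessarily what the repository in question does.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def preference_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    reference_chosen_logp: float,
    reference_rejected_logp: float,
    beta: float = 0.1,
    ipo: bool = False,
    reference_free: bool = False,
) -> float:
    """Hypothetical helper: DPO or IPO loss for one (chosen, rejected) pair."""
    pi_logratio = policy_chosen_logp - policy_rejected_logp
    ref_logratio = reference_chosen_logp - reference_rejected_logp
    if reference_free:
        # Assumption: "reference free" means ignoring the reference model.
        ref_logratio = 0.0

    logits = pi_logratio - ref_logratio
    if ipo:
        # IPO: squared distance between the log-ratio gap and 1 / (2 * beta).
        return (logits - 1.0 / (2.0 * beta)) ** 2
    # DPO: negative log-sigmoid of the scaled log-ratio gap.
    return -log_sigmoid(beta * logits)
```

In this sketch, `beta` scales the logits inside the sigmoid for DPO, whereas for IPO it only sets the regression target `1 / (2 * beta)`, which is one reason the two losses react differently to the same `beta`.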
Not sure if this is an original llama3 issue or an issue caused by INT4 AWQ. When I type in "exit", it keeps generating without ending the sentence: