Farzad Abdolhosseini comments

Results: 22 comments of Farzad Abdolhosseini

`onSelected`: Keep default behavior of events

No, I didn't have many clues to go on really.

there is a code defect need to fix

Yes, if you disable `learn_augmentation`, you should also change the loss to `cross_entropy` and `stats.checkpoint_metric` to `top1` (instead of `top1.logits`).

Training process of Mask R-CNN crashes when workers parameter is greater than 0

Original question: it seems to me like you're running out of memory (RAM) and getting OOM-killed. I'd check the following: a) RAM size (free), b) the dataset implementation...

About Dataset collate fn

Hi, I'm really confused about what the question is here. Please say: 1) what you did (e.g. a code snippet showing what is registered), 2) what you expected, 3) what...

while git clone the submodule, always denied by the gihub.I wonder why is this happen..

It seems like the problem is that one of the submodules (i.e. the one that has the MuJoCo Cassie) is hosted on GitLab. A quick way to fix it on...

Problem about mirror_inds["sideneg_obs_inds"] / mirror_inds["sideneg_act_inds"] in sym_envs.py

Hmm, I haven't thought about this in a long time but you might be right. I did a quick check to see if this was handled in code (i.e. you...

Problem about mirror_inds["sideneg_obs_inds"] / mirror_inds["sideneg_act_inds"] in sym_envs.py

Yeah, I'll still need to spend more time on it; this is when I really wish I had written better comments in the first place 😅 The short answer is that...

Fail to load fixie-ai/ultravox-v0_4_1-llama-3_1-70b with device_map 'auto'

Hi there, I've looked into this before and wasn't able to get good performance (if any at all) out of it, so currently for 70B inference we use vLLM...

Image + Audio + Text input using Llama 3.2 [DO NOT MERGE]

@kadirnar this was simply a proof of concept. Unfortunately, combining vision into Ultravox is not part of our roadmap. What is your use-case?

Image + Audio + Text input using Llama 3.2 [DO NOT MERGE]

The model in this PR will only be able to output text (not speech), but on the input side, yes it does allow for a combination of all three modalities:...