Allen Nie

Results 12 comments of Allen Nie

Hey @iassael did you have different mean/variance for each timestep? Or a shared mean/variance over all timesteps of one batch? The paper said "Consequently, we recommend using separate statistics...
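
To make the two options in the question above concrete, here is a minimal sketch in plain Python. It is not taken from the paper or from anyone's implementation; the toy `batch` values and the helper name `mean_var` are assumptions made up for illustration.

```python
def mean_var(values):
    """Return the (population) mean and variance of a list of floats."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var

# Hypothetical toy batch: 3 sequences, 2 timesteps each (batch[i][t]).
batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
num_steps = len(batch[0])

# Option 1: separate statistics, one (mean, var) pair per timestep,
# each computed across the batch dimension only.
per_step = [mean_var([seq[t] for seq in batch]) for t in range(num_steps)]

# Option 2: shared statistics, a single (mean, var) pair computed
# over every timestep of every sequence in the batch.
shared = mean_var([x for seq in batch for x in seq])

print(per_step)
print(shared)
```

With separate statistics each timestep is normalized against its own distribution, while the shared variant mixes activations from all timesteps into one estimate.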

I literally encountered the same issue LOL. I think it may be because he didn't use any smoothing technique.

Hi Ryan, I'm trying to implement the algorithm in PyTorch :) But the hashing :( Does your Java implementation have GPU-based hashing? Since you mentioned that implementing this in PyTorch/TF...

If for now I ignore the inefficiency that occurs during hashing of my weight matrices (for every several batches), I should write a CUDA C++ file that does SRP (which...
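
For reference, the idea behind SRP (signed random projection) can be sketched on the CPU in a few lines of plain Python. This is only an illustration of the general technique, not the CUDA kernel discussed above and not the Java implementation; the function names `make_srp_planes` and `srp_hash` and all parameter values are hypothetical.

```python
import random

def make_srp_planes(dim, num_bits, seed=0):
    """Draw num_bits random Gaussian hyperplanes of dimension dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def srp_hash(vector, planes):
    """Hash a vector to an integer: one sign bit per random hyperplane."""
    code = 0
    for plane in planes:
        dot = sum(p * v for p, v in zip(plane, vector))
        # The bit records which side of the hyperplane the vector falls on.
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code

planes = make_srp_planes(dim=4, num_bits=8, seed=0)
w = [0.5, -1.0, 2.0, 0.1]  # hypothetical row of a weight matrix

h = srp_hash(w, planes)
# Scaling by a positive constant does not change any sign, so the hash is
# identical: SRP depends only on the direction of the vector.
same = srp_hash([2 * x for x in w], planes) == h
print(h, same)
```

A GPU version would batch the dot products as one matrix multiplication and take the signs of the result, which is the part a custom kernel would need to cover.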

Hello, thank you for bringing up this issue. Did you download the model and try to use the code provided in README to load it? Can you describe a bit...

We are also working on a community contribution guideline to allow easier iteration.

@microsoft-github-policy-service agree

The merge changed optoprime's signature to match the one in the experimental branch

Ohhh I believe the current AutoGen model specification is through: `AutoGenLLM(filter_dict={"model": [model]})`. Then if this is the case, a simple modification would work -- just rewrite the...