Dan Fu
Actually, I remembered that this is not how this code works: self.L = L creates a kernel of length L that gets implicitly padded up to 2L later on. self.L...
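To illustrate why the kernel is padded up to 2L: an FFT of size 2L turns circular convolution into a linear (causal) one, so no wraparound from the end of the sequence leaks into the output. This is a minimal sketch of that pattern, not the repo's actual implementation; the function name `fft_conv` and the use of NumPy here are assumptions for illustration.

```python
import numpy as np

def fft_conv(u, k):
    """Causal convolution of input u (length L) with a kernel k (length L).

    The kernel is stored at length L; zero-padding both signals to 2L
    (via the `n=` argument of the FFT) makes the circular convolution
    equivalent to a linear one, so the first L outputs are exactly the
    causal convolution.  Hypothetical sketch, not the repo's code.
    """
    L = u.shape[-1]
    fft_size = 2 * L  # implicit zero-padding up to 2L
    u_f = np.fft.rfft(u, n=fft_size)
    k_f = np.fft.rfft(k, n=fft_size)
    y = np.fft.irfft(u_f * k_f, n=fft_size)
    return y[..., :L]  # keep only the first L (causal) outputs

# Sanity check against direct convolution truncated to length L:
u = np.random.randn(8)
k = np.random.randn(8)
ref = np.convolve(u, k)[:8]
assert np.allclose(fft_conv(u, k), ref)
```

With an FFT size of only L, the tail of the convolution would wrap around and corrupt the early outputs; 2L is the smallest power-friendly size that avoids this for two length-L signals.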
Code here: https://github.com/HazyResearch/safari. We don't have a config for fine-tuning, but will look to add it soon!
Thank you for the kind words!

> If you do not have the time to cobble it together, can you provide some hints on how a fine-tuning harness can be...
Releasing the full training script is on our roadmap - will post an update here when we have more details about timing.
These are available here now: https://github.com/HazyResearch/safari
Thanks for your interest! We plan to update the arXiv with the full evaluations soon. For now, we have the perplexity (PPL) of the 2.7B model against GPT-Neo-2.7B on the Pile:...
This is updated in the arXiv now: https://arxiv.org/abs/2212.14052
Yes, we plan to release the synthetics next week. Will post here with an update when we do!
These are available here now: https://github.com/HazyResearch/safari
Pushed now! Let us know if you run into any other problems.