KBLaM

Release artifacts (models, dataset) on Hugging Face

Open NielsRogge opened this issue 11 months ago • 8 comments

Hi @ti250 🤗

Niels here from the open-source team at Hugging Face. I discovered your work on arXiv and was wondering whether you would like to submit it to hf.co/papers to improve its discoverability. If you are one of the authors, you can submit it at https://huggingface.co/papers/submit.

The paper page lets people discuss your paper and find artifacts related to it (your models, datasets, or demo, for instance). You can also claim the paper as yours, which will show it on your public profile at HF.

It'd be great to make the checkpoints and dataset available on the 🤗 hub, to improve their discoverability/visibility. We can add tags so that people find them when filtering https://huggingface.co/models and https://huggingface.co/datasets.

Uploading models

See here for a guide: https://huggingface.co/docs/hub/models-uploading.

In this case, we could leverage the PyTorchModelHubMixin class, which adds from_pretrained and push_to_hub to any custom nn.Module. Alternatively, one can leverage the hf_hub_download one-liner to download a checkpoint from the hub.

We encourage researchers to push each model checkpoint to a separate model repository, so that things like download stats also work. We can then also link the checkpoints to the paper page.

Uploading dataset

It would be awesome to make the dataset available on 🤗, so that people can do:

from datasets import load_dataset

dataset = load_dataset("your-hf-org-or-username/your-dataset")

See here for a guide: https://huggingface.co/docs/datasets/loading.

Besides that, there's the dataset viewer which allows people to quickly explore the first few rows of the data in the browser.

Let me know if you're interested/need any help regarding this!

Cheers,

Niels
ML Engineer @ HF 🤗

NielsRogge avatar Feb 17 '25 09:02 NielsRogge

Hi KBLaM team!

With the official announcement, would this be a suitable moment to reconsider releasing the models and datasets on HuggingFace or somewhere else in the open? It would help the open scientific community tremendously in adopting this tech!

EwoutH avatar Mar 27 '25 08:03 EwoutH

I pinged them on Slack, but the invitation expired :/

NielsRogge avatar Mar 27 '25 09:03 NielsRogge

Sorry about dropping the ball on this, could you please ping me again? :)

ti250 avatar Apr 10 '25 09:04 ti250

Sure, could you give me your email address so I can invite you?

NielsRogge avatar Apr 10 '25 09:04 NielsRogge

[email protected] :)

ti250 avatar Apr 14 '25 12:04 ti250

Thanks, I sent an invite

NielsRogge avatar Apr 14 '25 13:04 NielsRogge

Hi KBLaM team!

Just wanted to echo support for this request - having the models released on HuggingFace (or somewhere else) would be incredibly helpful for the community. It would make experimenting and building on top of your work much more straightforward.

Seeing recent replies - is there any rough timeline you can share on when the models might be released?

Thanks again for the great work!

yolandal avatar Apr 22 '25 15:04 yolandal

Second the above

piercelamb avatar Apr 22 '25 15:04 piercelamb