Dataset: CAMEL Math and Physics
50K+ math, physics, and code input/output pairs. Sounds tasty, but the data is hidden in a zip file.
https://github.com/lightaime/camel#data-hosted-on-hugging-face
via https://twitter.com/hammh0a/status/1646524135538065409
Looks like a really good dataset. I can probably convert it to OA format.
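A rough sketch of what the conversion could look like. This is only a sketch under assumptions: I'm assuming the CAMEL records expose the problem and solution as `message_1` and `message_2`, and that the OA target is a table with `INSTRUCTION`, `RESPONSE`, `SOURCE`, and `METADATA` columns. Neither is confirmed in this thread, so check both against the actual data before relying on it. The `camel_to_oa` helper and the default `source` value are hypothetical names for illustration.

```python
import json


def camel_to_oa(record, source="camel-ai"):
    """Map one CAMEL-style record to an instruction/response row.

    Assumption: the record stores the problem in "message_1" and the
    solution in "message_2". Adjust the keys if the real data differs.
    """
    instruction = record["message_1"].strip()
    response = record["message_2"].strip()

    # Keep every remaining field as metadata, serialised as a JSON string.
    metadata = {
        key: value
        for key, value in record.items()
        if key not in ("message_1", "message_2")
    }

    return {
        "INSTRUCTION": instruction,
        "RESPONSE": response,
        "SOURCE": source,  # hypothetical label, not an official identifier
        "METADATA": json.dumps(metadata, sort_keys=True),
    }


# Made-up example record, not taken from the dataset.
example = {
    "message_1": " What is 2 + 2? ",
    "message_2": "2 + 2 = 4.",
    "topic": "Arithmetic",
}
row = camel_to_oa(example)
```

Keeping the leftover fields in a JSON metadata string means nothing from the original record gets dropped during conversion.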
More sciences have been added, apparently in a more accessible format:
https://huggingface.co/datasets/camel-ai/chemistry https://huggingface.co/datasets/camel-ai/biology
via https://github.com/lightaime/camel#data-hosted-on-hugging-face
On another look, we might have to reconsider these datasets: the license says they are for research use only, not commercial use, and some of the data was generated with GPT-4.
Edit: all good, HF shows an okay license.
Looks like this is completed. I'm going to close it out.