
AI Feynman datasets

Open aminravanbakhsh opened this issue 1 year ago • 4 comments

I am trying to fetch a dataset from AI Feynman, but I receive the following error:

from pmlb import fetch_data

name = "feynman_III_12_43"
dataset = fetch_data(name)

ValueError: Dataset not found in PMLB.
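One way to narrow down this error is to check whether the installed release lists the dataset name at all. The following is a minimal sketch, assuming the installed pmlb exposes `dataset_names` at package level; `is_listed` and `listed_in_installed_pmlb` are hypothetical helper names used only for illustration.

```python
def is_listed(name, known_names):
    """Return True if `name` appears in the given collection of dataset names."""
    return name in set(known_names)


def listed_in_installed_pmlb(name):
    """Look the name up in the dataset list bundled with the installed pmlb.

    Assumption: `pmlb.dataset_names` exists in the installed version.
    The import is done lazily so the helper above works without pmlb.
    """
    from pmlb import dataset_names
    return is_listed(name, dataset_names)
```

If `listed_in_installed_pmlb("feynman_III_12_43")` returns `False`, the installed release probably predates the dataset.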

aminravanbakhsh, Sep 11 '24 16:09

Hi @aminravanbakhsh, which version of PMLB are you running? I managed to fetch this dataset without problems. I'm using python==3.8.19 and pmlb==1.0.2a.

Two possible solutions:

  1. Install pmlb from source: clone this repo and run pip install . from its root. That's how I installed it here; I use a conda environment specifically for building PMLB at its latest version.
  2. Download the dataset folder from this repo (https://github.com/EpistasisLab/pmlb/tree/master/datasets/feynman_III_12_43), put it into a local folder, and use fetch_data(name, local_cache_dir='<path to the folder>'). It should work as long as the folder and the .tsv.gz file have the same name. I tried creating a local copy manually and it worked:
from pmlb import fetch_data

name = "feynman_III_12_43_copy"
dataset = fetch_data(name, local_cache_dir="./datasets/")
dataset
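The local-copy approach can be sketched a bit more defensively. This is only an illustration, assuming the layout described above (a folder named after the dataset containing a .tsv.gz file of the same name, placed inside the cache directory); `expected_local_path` and `load_local` are hypothetical helpers, not part of pmlb.

```python
from pathlib import Path


def expected_local_path(cache_dir, name):
    """Path where the local copy is assumed to live: <cache_dir>/<name>/<name>.tsv.gz."""
    return Path(cache_dir) / name / f"{name}.tsv.gz"


def load_local(name, cache_dir="./datasets/"):
    """Load a dataset from a local folder, failing early if the file is missing."""
    path = expected_local_path(cache_dir, name)
    if not path.is_file():
        raise FileNotFoundError(f"Expected a local copy at {path}")
    # Imported lazily so the path helper can be used without pmlb installed.
    from pmlb import fetch_data
    return fetch_data(name, local_cache_dir=str(cache_dir))
```

Checking the path first gives a clearer message than "Dataset not found in PMLB." when the folder or file name does not match the dataset name.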

gAldeia, Sep 13 '24 10:09

Hi @gAldeia, thank you for your reply. I am using:

pmlb==1.0.1.post3
Python 3.12.4

aminravanbakhsh, Sep 13 '24 18:09

@aminravanbakhsh Did you try downloading the dataset locally and using local_cache_dir to load it? It seems that your version, 1.0.1.post3, was released on Sep 10, 2020, and the Feynman datasets were only added after July 2021. Installing it locally by cloning the repo and running pip install . should also solve your problem.
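To confirm which release is actually installed in the active environment, the standard library can be queried directly. A minimal sketch using `importlib.metadata` (Python 3.8+); `installed_version` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed pmlb version, or None if pmlb is not installed.
print("pmlb:", installed_version("pmlb"))
```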

While this may be a workaround, ideally PMLB should be updated on PyPI to its latest version.

Right now I am trying to submit new datasets, and there is this GitHub Actions issue that is keeping me from actually doing it. If the local cache works, I think we can close this issue and open a new one to update the PyPI package to its latest version.

gAldeia, Sep 16 '24 00:09

Hi Guilherme, thank you for your email. I fixed the problem by downloading the data to my local computer. I think we can close the issue, as you suggested. Please let me know if anything else is needed.

Sincerely, Amin

aminravanbakhsh, Sep 16 '24 15:09