list_tasks() returns fewer tasks than shown on the website
Description
I'm trying to list all the tasks in the OpenML database. I tried task_list = openml.tasks.list_tasks(), but it only returns a list of length 46779, while the official OpenML website shows 261.0k tasks. Is there an API that can help me get all of these tasks?
I also tried filtering by task type, e.g. task_list = openml.tasks.list_tasks(openml.tasks.TaskType.SUPERVISED_REGRESSION), but the number of returned tasks is still lower than the filtered result on the website. I only get 3939 supervised classification tasks, but the website shows 4345. I only get 2600 supervised regression tasks, but the website shows 19459.
Steps/Code to Reproduce
import openml
task_list = openml.tasks.list_tasks()
print(task_list)
Expected Results
task_list contains all 261.0k task IDs and their information.
Actual Results
It only contains 46779 tasks.
Heyho,
Thanks for pointing this out.
This might be a server problem, or the numbers on the website might be wrong.
The API endpoint that list_tasks calls also returns only 2600 entries for supervised regression, i.e. task type 2 (https://api.openml.org/api/v1/json/task/list/type/2).
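For anyone who wants to check the raw count without going through openml-python, here is a minimal sketch using only the standard library. It assumes the JSON listing is shaped like {"tasks": {"task": [...]}}; that layout is an assumption on my side, so adjust the keys if the server responds differently. The helper names count_tasks and fetch_task_listing are made up for this example and are not part of openml-python.

```python
import json
import urllib.request


def count_tasks(payload: dict) -> int:
    """Count entries in a task listing payload.

    Assumes (not verified here) that the listing looks like
    {"tasks": {"task": [ {...}, {...} ]}}.
    """
    tasks = payload.get("tasks", {}).get("task", [])
    # Be defensive: a lone entry might be serialized as an object, not a list.
    if isinstance(tasks, dict):
        return 1
    return len(tasks)


def fetch_task_listing(url: str) -> dict:
    """Download a listing URL and decode the JSON body."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


# Example call (needs network access, so it is left commented out):
# url = "https://api.openml.org/api/v1/json/task/list/type/2"
# print(count_tasks(fetch_task_listing(url)))
```

If the number printed this way matches what list_tasks returns, the gap is between the server-side listing and the website rather than inside the Python client.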
@PGijsbers do you know more about this?
No, I was under the impression that the website internally uses the same API to get its data (plus Elasticsearch), so based on that I can't explain the discrepancy. @joaquinvanschoren ?