David Tulga
This page may be helpful in the future, as it documents pyarrow's support for reading JSON: https://arrow.apache.org/docs/python/generated/pyarrow.json.read_json.html
Originally, I was using this script to run the tests with distributed workers: https://github.com/iterative/dvcx-server/blob/b232559d773dcee8cadc9f1ac8730c0856b94ff8/clickhouse-db-adapter/scripts/run_with_distributed_workers.py. It has been changed a few times since then, though. Now they are supposed to be...
Yes, I have been using that script for local debugging. I don't see an obvious fix for this particular kind of Celery issue either, but I'll think of possible...
This article may be helpful for a future structured export function: https://huggingface.co/docs/datasets/en/repository_structure
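To make the idea a bit more concrete, here is a minimal sketch of what such an export could look like, assuming the simple layout from that page where each split is a file under data/ named after the split. The function name export_splits, the JSON Lines format, and the sample rows are all hypothetical choices for illustration, not an existing API.

```python
import json
import tempfile
from pathlib import Path


def export_splits(splits, out_dir):
    """Write each split to <out_dir>/data/<split>.jsonl.

    Hypothetical helper: `splits` maps a split name (e.g. "train")
    to a list of JSON-serializable row dicts.
    """
    data_dir = Path(out_dir) / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for split_name, rows in splits.items():
        path = data_dir / f"{split_name}.jsonl"
        with path.open("w", encoding="utf-8") as f:
            for row in rows:
                # One JSON object per line (JSON Lines).
                f.write(json.dumps(row, ensure_ascii=False) + "\n")
        written.append(path)
    return written


if __name__ == "__main__":
    # Made-up sample rows, only to show the resulting layout.
    sample = {
        "train": [{"id": 1, "text": "a"}, {"id": 2, "text": "b"}],
        "test": [{"id": 3, "text": "c"}],
    }
    with tempfile.TemporaryDirectory() as tmp:
        for p in export_splits(sample, tmp):
            print(p.relative_to(tmp))
```

Putting the split name in the file name keeps the export self-describing, so no separate config file should be needed for the simple case. Sharded or multi-config exports would need more than this sketch covers.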