join-order-benchmark
Join Order Benchmark (JOB)

The current dataset is only 2-3 GB. Is there a larger dataset (around 2 TB), or are there plans to make one?
We execute the join order benchmark as one of our "default" benchmarks in [Hyrise](https://git.io/hyrise). We recently found that several queries yield empty results and wondered if this might be a...
I downloaded the CSV tarball. Uploading it to SQL Azure with bcp proved really hard because the CSVs are malformed. Sample row in aka_name.csv: 220222,538021,"\"Borolas\", Joaquín García Vargas",,B6425,J2526,B642,6526774f1ce04414f56476409ce59060 CSV...
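The sample row above escapes embedded quotes with a backslash (`\"`) rather than doubling them, which is what many CSV loaders expect. A minimal sketch of one possible workaround, assuming the whole tarball follows the same convention as that row: re-encode each file with Python's standard `csv` module. The helper name `normalize_rows` is hypothetical, and whether bcp or any other loader accepts the result has not been verified here.

```python
import csv
import io


def normalize_rows(src, dst):
    """Re-encode backslash-escaped CSV so embedded quotes are doubled instead."""
    # Assumption: the input escapes quotes as \" (as in the aka_name.csv sample).
    reader = csv.reader(src, escapechar="\\", doublequote=False)
    # The default writer dialect quotes fields only when needed and doubles quotes.
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        writer.writerow(row)


# The sample row from aka_name.csv quoted in the issue.
sample = (
    '220222,538021,"\\"Borolas\\", Joaquín García Vargas",,'
    "B6425,J2526,B642,6526774f1ce04414f56476409ce59060\n"
)

out = io.StringIO()
normalize_rows(io.StringIO(sample), out)
print(out.getvalue(), end="")
```

For real files, open both input and output with `newline=""` and an explicit encoding such as `encoding="utf-8"`, as the `csv` module documentation recommends.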
My code has been running at step 4: "transform *gz files to relational schema (takes a while)" for 12 consecutive hours. Is this normal and expected? Thank you
When I run imdbpy2sql.py, I get the following error: AttributeError: 'module' object has no attribute 'getLogger'
These final files can be useful for JMeter and other benchmarking tools. I also have around 2 GB of data for each table, which I can contribute as a zip file if needed.
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.9/bin/imdbpy2sql.py", line 505, in <module>
    conn = setConnection(URI, DB_TABLES)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/imdb/parser/sql/alchemyadapter.py", line 455, in setConnection
    engine = create_engine(uri, **params)
  File "", line 2, in create_engine...
Short PR to make sure people are aware that the queries assume the "original data set". It was quite a pain for us to realize this, and others had similar...
I only see CedarDB: https://cedardb.com/docs/example_datasets/job/. However, it is not a comparison with another database.