join-order-benchmark issues

cannot download file from step 2

3

![image](https://user-images.githubusercontent.com/37615060/160579786-c2d0d04f-88ee-4592-bdab-c761261f98ed.png)

CrystalGuo0312

Availability of a Bigger Dataset

The current dataset is only 2-3 GB. Is there a bigger dataset (around 2 TB), or are there plans to make a new one?

jaystarshot

Data set mismatch: empty queries results and reproducibility issues

7

We execute the join order benchmark as one of our "default" benchmarks in [Hyrise](https://git.io/hyrise). We recently found that several queries yield empty results and wondered if this might be a...

Bouncner

Malformed CSVs

1

Downloaded the CSV tarball. Uploading to SQL Azure using bcp proved to be really hard because the CSVs are malformed. Sample CSV row in aka_name.csv:
220222,538021,"\"Borolas\", Joaquín García Vargas",,B6425,J2526,B642,6526774f1ce04414f56476409ce59060
CSV...

chsalgado
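
The sample row quoted in the report above escapes embedded quotes with a backslash (`\"`), whereas strict CSV importers usually expect doubled quotes (`""`). A minimal sketch of one possible workaround, assuming the whole file follows the same backslash-escaping convention as that single row (the function name `normalize_csv` is hypothetical and not part of the benchmark repository):

```python
import csv
import io


def normalize_csv(src, dst):
    """Copy rows from src to dst, turning backslash-escaped quotes into doubled quotes.

    Assumption: the input escapes embedded quotes as \\" (as in the sample
    row from aka_name.csv). The output uses the csv module's default
    dialect, which doubles embedded quotes instead.
    """
    reader = csv.reader(src, escapechar="\\", doublequote=False)
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        writer.writerow(row)


if __name__ == "__main__":
    sample = (
        '220222,538021,"\\"Borolas\\", Joaquín García Vargas",,'
        "B6425,J2526,B642,6526774f1ce04414f56476409ce59060\n"
    )
    out = io.StringIO()
    normalize_csv(io.StringIO(sample), out)
    print(out.getvalue(), end="")
```

For real files, open both sides with `newline=""` and an explicit encoding, for example `open("aka_name.csv", newline="", encoding="utf-8")`. Whether the rewritten file then loads cleanly with bcp has not been verified here.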

Step 4 taking too long

2

My code has been running at step 4: "transform *gz files to relational schema (takes a while)" for 12 consecutive hours. Is this normal and expected? Thank you

minhduchoang301

Help

When I run imdbpy2sql.py, I get the following error: AttributeError: 'module' object has no attribute 'getLogger' ![image](https://user-images.githubusercontent.com/33775417/114302440-9a3ba100-9afb-11eb-8155-f50f3d657412.png)

haihsjjxsa

Merge All Individual Queries into final files

These final files can be useful for JMeter and other benchmarking tools. I also have around 2 GB of data for each table, which I can contribute as a zip if needed.

jaystarshot
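
Merging the individual query files, as proposed above, could be sketched roughly as follows. This is an illustration under assumptions, not the contributor's actual change: it assumes the queries live as separate `*.sql` files in one directory with names that start with a number (such as `1a.sql`, `10b.sql`), and the names `merge_queries` and `natural_key` are hypothetical.

```python
import re
from pathlib import Path


def natural_key(path):
    """Sort '2b.sql' before '10a.sql' by comparing the leading number numerically."""
    match = re.match(r"(\d+)(.*)", path.stem)
    if match:
        return (int(match.group(1)), match.group(2))
    # Files without a leading number sort after all numbered ones.
    return (float("inf"), path.stem)


def merge_queries(query_dir, out_file):
    """Concatenate every *.sql file in query_dir into out_file; return the file count."""
    out_path = Path(out_file).resolve()
    parts = []
    for path in sorted(Path(query_dir).glob("*.sql"), key=natural_key):
        if path.resolve() == out_path:
            continue  # do not merge the output file into itself
        text = path.read_text(encoding="utf-8").strip()
        if not text.endswith(";"):
            text += ";"  # make sure each statement is terminated
        # A comment header records which file each query came from.
        parts.append(f"-- {path.name}\n{text}\n")
    out_path.write_text("\n".join(parts), encoding="utf-8")
    return len(parts)
```

A call such as `merge_queries("queries", "all_queries.sql")` would then produce one file that a load-testing tool can read in a single pass; both paths are placeholders.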

Can't transform *gz files to postgresql

3

Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.9/bin/imdbpy2sql.py", line 505, in <module>
conn = setConnection(URI, DB_TABLES)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/imdb/parser/sql/alchemyadapter.py", line 455, in setConnection
engine = create_engine(uri, **params)
File "<string>", line 2, in create_engine...

Merzouk-Ilyes

Update readme to note about potential issues with "frozen data set"

Short PR to make sure people are aware that the queries assume the "original data set". It's been quite a pain for us to realize this, and others had similar...

Bouncner

Is there a list of database projects/products that perform a join order benchmark?

2

I only see CedarDB (https://cedardb.com/docs/example_datasets/job/). However, it is not a comparison to another database.

alberttwong
