holoclean
A Machine Learning System for Data Enrichment.
Bumps [numpy](https://github.com/numpy/numpy) from 1.16.1 to 1.22.0. Release notes Sourced from numpy's releases. v1.22.0 NumPy 1.22.0 Release Notes NumPy 1.22.0 is a big release featuring the work of 153 contributors spread...
I have read the HoloClean paper, and I want to run an example on the flight dataset as my baseline. I found that the flight dataset is something different...
Hi, this code base contains a linear model for the repair task, while the paper mentions a factor graph model. Does the linear model achieve the same performance as the factor...
It's not compatible with Windows. Will you add Windows support?
Hi, this might be a stupid question, but during a project that I am working on, I used HoloClean for a data repair task. I found that, however, there's no...
Hi, pos_values is used in the get_infer_dataframes() method in repair.py, and its values are fetched with a SQL query rather than from the DataFrame. Since the size of pos_values is large, we can get...
Hi, according to the code, the tensor created is `tensor[0][init_idx][attr_idx] = 1.0`. Can you please elaborate on why it's not `tensor[0][attr_idx][init_idx] = 1.0` (axes 1 and 2 swapped)? File: repair/featurize/initattfeat.py@[12](https://github.com/HoloClean/holoclean/blob/bf0287bf14d3a798c405908b5354ca7e06b5e56c/repair/featurize/initattfeat.py#L12)...
Improves performance when saving large datasets to PostgreSQL. The to_sql() method is slow and takes time to save data in Postgres. It is replaced with copy_expert() to...
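For readers unfamiliar with the approach, here is a minimal sketch of bulk loading through psycopg2's `cursor.copy_expert()`, which appears to be the method the description above refers to. It is not the code from the pull request: the helper names `rows_to_csv_buffer` and `bulk_copy`, and the table and column names in the usage note, are hypothetical. It also assumes the table and column names are trusted, since they are interpolated into the SQL unquoted.

```python
import csv
import io


def rows_to_csv_buffer(rows):
    """Serialize an iterable of row tuples into an in-memory CSV buffer."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    buf.seek(0)  # rewind so the consumer reads from the start
    return buf


def bulk_copy(cursor, table, columns, rows):
    """Load rows with a single COPY ... FROM STDIN statement.

    `cursor` is assumed to be a psycopg2 cursor (anything exposing
    copy_expert(sql, file) works). `table` and `columns` are hypothetical,
    trusted identifiers; they are not escaped here.
    """
    sql = "COPY {} ({}) FROM STDIN WITH (FORMAT csv)".format(
        table, ", ".join(columns)
    )
    cursor.copy_expert(sql, rows_to_csv_buffer(rows))
```

With a real connection this would be called as `bulk_copy(conn.cursor(), "my_table", ["id", "name"], rows)` followed by `conn.commit()`. Sending every row in one COPY statement avoids the per-row INSERT overhead that can make `to_sql()` slow on large tables.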