data-diff icon indicating copy to clipboard operation
data-diff copied to clipboard

Snowflake connection timeout after successfull loading diff results into table

Open lokofoko opened this issue 1 year ago • 5 comments

Describe the bug I am using windows 11 last version and have data-diff v0.10.1. I am doing data-diff using a conf file between two tables in one schema in one db in Snowflake. It works perfectly up to the point, where it doesn't stop working after the successful creation of a table with results in Snowflake and instead throws an error message and does multiple retries to restore connection. To return control of the terminal I have to use Ctrl + C and wait from 30 sec to 2 minutes.

I had a chat with Sung in chat from the docs.datafold.com site and he advised me to create an issue here and mention @dlawin.

Make sure to include the following (minus sensitive information): 22:53:40 DEBUG Applied run configuration: {'verbose': False, 'database1': {'driver': 'snowflake', 'user': '', 'password': '', 'account': '', 'database': 'DBT_DEVELOPMENT', 'schema': 'DATA_INTEGRATIONS', 'warehouse': 'TEST_WH_XL', 'role': ''}, 'table1': 'COMPARING_UNUSEDDATA_PROD_TABLE', main.py:310 'database2': {'driver': 'snowflake', 'user': '', 'password': '', 'account': '', 'database': 'DBT_DEVELOPMENT', 'schema': 'DATA_INTEGRATIONS', 'warehouse': 'TEST_WH_XL', 'role': ''}, 'table2': 'COMPARING_UNUSEDDATA_DEV_TABLE'} 22:53:44 DEBUG Running SQL (Snowflake): base.py:962 ALTER SESSION SET TIMEZONE = 'UTC' DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_PROD_TABLE',) base.py:960 SELECT column_name, data_type, datetime_precision, numeric_precision, numeric_scale FROM information_schema.columns WHERE table_name = 'COMPARING_UNUSEDDATA_PROD_TABLE' AND table_schema = 'DATA_INTEGRATIONS' DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_DEV_TABLE',) base.py:960 SELECT column_name, data_type, datetime_precision, numeric_precision, numeric_scale FROM information_schema.columns WHERE table_name = 'COMPARING_UNUSEDDATA_DEV_TABLE' AND table_schema = 'DATA_INTEGRATIONS' Using algorithm 'joindiff'. main.py:519 DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_PROD_TABLE',) base.py:960 SELECT * FROM (SELECT TRIM("APN_UNFORMATTED"), TRIM("UNIQUE_ID"), TRIM("APN_SEQUENCE_NUMBER"), TRIM("FIRST_MTG_LENDER_NAME"), TRIM("FIRST_MTG_ARM_CHANGE_PERCENT_LIMIT"), TRIM("PROPERTY_CITY"), TRIM("JUNIOR_MTG_ASSIGNED_LENDER"), TRIM("FIRST_MTG_REFINANCE_LOAN_IND"), TRIM("FIRST_MTG_ARM_CHANGE_INTERVAL"), TRIM("FIRST_MTG_INTEREST_RATE_TYPE"), TRIM("ESTIMATED_EQUITY"), TRIM("DATAPROVIDERID"), TRIM("FIRST_MTG_ARM_NEXT_RESET_DATE"), TRIM("FIRST_MTG_AMOUNT"), TRIM("MAILING_CITY"), TRIM("PROPERTY_ADDRESS_FULL"), TRIM("JUNIOR_MTG_ORIGINATION_DATE"), TRIM("FIRST_MTG_ORIGINATION_DATE"), TRIM("OWNER_1_LAST_NAME"), TRIM("FIRST_MTG_ARM_MAXIMUM_INTEREST_RATE"), TRIM("FIRST_MTG_LOAN_PURPOSE"), TRIM("FIRST_MTG_ASSIGNED_LENDER"), TRIM("FIRST_MTG_FIXED_RATE_LOAN_IND"), TRIM("FIRST_MTG_ARM_FIRST_CHANGE_MAX_PCT"), TRIM("FIRST_MTG_LTV"), TRIM("DOWNPMTASSISTFGRANT_IND"), TRIM("FIRST_MTG_ARM_INITIAL_RESET_DATE"), TRIM("JUNIOR_MTG_LENDER_NAME"), TRIM("FIRST_MTG_LOAN_TYPE"), TRIM("MAILING_ZIP_CODE"), TRIM("PROPERTY_ZIP_CODE"), TRIM("MAILING_ADDRESS_FULL"), TRIM("FIRST_MTG_ARM_CALCULATION_CHANGE"), TRIM("OWNER_2_FIRST_NAME"), TRIM("FIRST_MTG_MODIFIED_IND"), TRIM("FIRST_MTG_TERM"), TRIM("MAILING_STATE"), TRIM("REFERENCENR"), TRIM("FIRST_MTG_INTEREST_RATE"), TRIM("FIRST_MTG_ARM_INDEX_TYPE"), TRIM("FIRST_MTG_ARM_CHANGE_FREQ"), TRIM("FIRST_MTG_VARIABLE_RATE_LOAN_IND"), TRIM("JUNIOR_MTG_SUBORDINATE_TYPE"), TRIM("LANDUSEDESCRIPTIONID"), TRIM("LASTUSED"), TRIM("COUNTYID"), TRIM("TAX_AMOUNT"), TRIM("PROPERTY_STATE"), TRIM("FIPS"), TRIM("OWNER_2_LAST_NAME"), TRIM("JUNIOR_MTG_EQUITY_LOAN_IND"),
TRIM("COUNTY_NAME"), TRIM("JUNIOR_MTG_AMOUNT"), TRIM("OWNER_OCCUPIED_INDICATOR"), TRIM("OWNER_1_FIRST_NAME") FROM "COMPARING_UNUSEDDATA_PROD_TABLE") AS LIMITED_SELECT LIMIT 64 22:53:47 WARNING Mixed UUID/Non-UUID values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.UNIQUE_ID, disabling UUID support. base.py:1102 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.FIRST_MTG_LENDER_NAME. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.FIRST_MTG_ARM_NEXT_RESET_DATE. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.PROPERTY_ADDRESS_FULL. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.JUNIOR_MTG_ORIGINATION_DATE. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.FIRST_MTG_ASSIGNED_LENDER. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.FIRST_MTG_ARM_INITIAL_RESET_DATE. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.JUNIOR_MTG_LENDER_NAME. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.MAILING_ADDRESS_FULL. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.OWNER_2_FIRST_NAME. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.FIRST_MTG_ARM_INDEX_TYPE. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.FIRST_MTG_ARM_CHANGE_FREQ. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.JUNIOR_MTG_SUBORDINATE_TYPE. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.OWNER_2_LAST_NAME. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.COUNTY_NAME. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.JUNIOR_MTG_AMOUNT. It cannot be used as a key. base.py:1114 INFO [Snowflake] Schema = {'APN_UNFORMATTED': String_VaryingAlphanum(_notes=[]), 'UNIQUE_ID': String_VaryingAlphanum(_notes=[]), 'APN_SEQUENCE_NUMBER': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_LENDER_NAME': Text(_notes=[]), 'FIRST_MTG_ARM_CHANGE_PERCENT_LIMIT': Text(_notes=[]), 'PROPERTY_CITY': String_VaryingAlphanum(_notes=[]), schema.py:12 'JUNIOR_MTG_ASSIGNED_LENDER': Text(_notes=[]), 'MLG_SALE_PRICE': Decimal(_notes=[], precision=0), 'FIRST_MTG_REFINANCE_LOAN_IND': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_ARM_CHANGE_INTERVAL': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_INTEREST_RATE_TYPE': String_VaryingAlphanum(_notes=[]), 'ESTIMATED_EQUITY': String_VaryingAlphanum(_notes=[]), 'DATAPROVIDERID': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_ARM_NEXT_RESET_DATE': Text(_notes=[]), 'FIRST_MTG_AMOUNT': String_VaryingAlphanum(_notes=[]), 'MAILING_CITY': String_VaryingAlphanum(_notes=[]), 'PROPERTY_ADDRESS_FULL': Text(_notes=[]), 'JUNIOR_MTG_ORIGINATION_DATE': Text(_notes=[]),
'FIRST_MTG_ORIGINATION_DATE': String_VaryingAlphanum(_notes=[]), 'OWNER_1_LAST_NAME': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_ARM_MAXIMUM_INTEREST_RATE': Text(_notes=[]), 'FIRST_MTG_LOAN_PURPOSE': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_ASSIGNED_LENDER': Text(_notes=[]), 'FIRST_MTG_FIXED_RATE_LOAN_IND': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_ARM_FIRST_CHANGE_MAX_PCT': Text(_notes=[]), 'FIRST_MTG_LTV': Text(_notes=[]), 'DOWNPMTASSISTFGRANT_IND': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_ARM_INITIAL_RESET_DATE': Text(_notes=[]), 'JUNIOR_MTG_LENDER_NAME': Text(_notes=[]), 'FIRST_MTG_LOAN_TYPE': String_VaryingAlphanum(_notes=[]), 'MAILING_ZIP_CODE': String_VaryingAlphanum(_notes=[]), 'PROPERTY_ZIP_CODE': String_VaryingAlphanum(_notes=[]), 'MAILING_ADDRESS_FULL': Text(_notes=[]), 'FIRST_MTG_ARM_CALCULATION_CHANGE': Text(_notes=[]), 'OWNER_2_FIRST_NAME': Text(_notes=[]), 'FIRST_MTG_MODIFIED_IND': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_TERM': String_VaryingAlphanum(_notes=[]), 'MAILING_STATE': String_VaryingAlphanum(_notes=[]), 'REFERENCENR': Text(_notes=[]), 'FIRST_MTG_INTEREST_RATE': Text(_notes=[]), 'FIRST_MTG_ARM_INDEX_TYPE': Text(_notes=[]), 'FIRST_MTG_ARM_CHANGE_FREQ': Text(_notes=[]), 'FIRST_MTG_VARIABLE_RATE_LOAN_IND': String_VaryingAlphanum(_notes=[]), 'JUNIOR_MTG_SUBORDINATE_TYPE': Text(_notes=[]), 'LANDUSEDESCRIPTIONID': String_VaryingAlphanum(_notes=[]), 'LASTUSED': Text(_notes=[]), 'COUNTYID': Text(_notes=[]), 'TAX_AMOUNT': String_VaryingAlphanum(_notes=[]), 'PROPERTY_STATE': String_VaryingAlphanum(_notes=[]),
'FIPS': String_VaryingAlphanum(_notes=[]), 'OWNER_2_LAST_NAME': Text(_notes=[]), 'JUNIOR_MTG_EQUITY_LOAN_IND': String_VaryingAlphanum(_notes=[]), 'COUNTY_NAME': Text(_notes=[]), 'JUNIOR_MTG_AMOUNT': Text(_notes=[]), 'OWNER_OCCUPIED_INDICATOR': String_VaryingAlphanum(_notes=[]), 'OWNER_1_FIRST_NAME': String_VaryingAlphanum(_notes=[])}
DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_DEV_TABLE',) base.py:960 SELECT * FROM (SELECT TRIM("JUNIOR_MTG_SUBORDINATE_TYPE"), TRIM("REFERENCENR"), TRIM("PROPERTY_CITY"), TRIM("OWNER_2_LAST_NAME"), TRIM("JUNIOR_MTG_ASSIGNED_LENDER"), TRIM("UNIQUE_ID"), TRIM("JUNIOR_MTG_LENDER_NAME"), TRIM("FIRST_MTG_FIXED_RATE_LOAN_IND"), TRIM("OWNER_OCCUPIED_INDICATOR"), TRIM("PROPERTY_ZIP_CODE"), TRIM("FIRST_MTG_MODIFIED_IND"), TRIM("PROPERTY_ADDRESS_FULL"), TRIM("FIRST_MTG_ARM_CHANGE_FREQ"), TRIM("FIRST_MTG_LOAN_TYPE"), TRIM("ESTIMATED_EQUITY"), TRIM("DOWNPMTASSISTFGRANT_IND"), TRIM("MAILING_ZIP_CODE"), TRIM("MAILING_ADDRESS_FULL"), TRIM("MAILING_STATE"), TRIM("OWNER_2_FIRST_NAME"), TRIM("OWNER_1_LAST_NAME"), TRIM("LASTUSED"),
TRIM("MAILING_CITY"), TRIM("APN_UNFORMATTED"), TRIM("FIRST_MTG_LOAN_PURPOSE"), TRIM("FIRST_MTG_VARIABLE_RATE_LOAN_IND"), TRIM("FIRST_MTG_ARM_NEXT_RESET_DATE"), TRIM("FIRST_MTG_ASSIGNED_LENDER"), TRIM("FIPS"), TRIM("OWNER_1_FIRST_NAME"), TRIM("FIRST_MTG_ARM_INDEX_TYPE"), TRIM("JUNIOR_MTG_EQUITY_LOAN_IND"), TRIM("FIRST_MTG_LENDER_NAME"),
TRIM("COUNTY_NAME"), TRIM("APN_SEQUENCE_NUMBER"), TRIM("PROPERTY_STATE"), TRIM("FIRST_MTG_INTEREST_RATE_TYPE"), TRIM("FIRST_MTG_ARM_INITIAL_RESET_DATE"), TRIM("FIRST_MTG_REFINANCE_LOAN_IND") FROM "COMPARING_UNUSEDDATA_DEV_TABLE") AS LIMITED_SELECT LIMIT 64 22:53:49 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.JUNIOR_MTG_SUBORDINATE_TYPE. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.OWNER_2_LAST_NAME. It cannot be used as a key. base.py:1114 WARNING Mixed UUID/Non-UUID values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.UNIQUE_ID, disabling UUID support. base.py:1102 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.JUNIOR_MTG_LENDER_NAME. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.PROPERTY_ADDRESS_FULL. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.FIRST_MTG_ARM_CHANGE_FREQ. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.MAILING_ADDRESS_FULL. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.OWNER_2_FIRST_NAME. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.FIRST_MTG_LOAN_PURPOSE. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.FIRST_MTG_ARM_NEXT_RESET_DATE. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.FIRST_MTG_ASSIGNED_LENDER. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.FIRST_MTG_ARM_INDEX_TYPE. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.FIRST_MTG_LENDER_NAME. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.COUNTY_NAME. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.FIRST_MTG_INTEREST_RATE_TYPE. It cannot be used as a key. base.py:1114 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.FIRST_MTG_ARM_INITIAL_RESET_DATE. It cannot be used as a key. base.py:1114 INFO [Snowflake] Schema = {'JUNIOR_MTG_SUBORDINATE_TYPE': Text(_notes=[]), 'JUNIOR_MTG_ORIGINATION_DATE': Date(_notes=[], precision=6, rounds=False), 'REFERENCENR': Text(_notes=[]), 'PROPERTY_CITY': String_VaryingAlphanum(_notes=[]), 'OWNER_2_LAST_NAME': Text(_notes=[]), 'FIRST_MTG_ARM_CALCULATION_CHANGE': Decimal(_notes=[], precision=5), schema.py:12 'JUNIOR_MTG_ASSIGNED_LENDER': Text(_notes=[]), 'FIRST_MTG_ARM_CHANGE_PERCENT_LIMIT': Decimal(_notes=[], precision=5), 'FIRST_MTG_TERM': Decimal(_notes=[], precision=5), 'UNIQUE_ID': String_VaryingAlphanum(_notes=[]), 'LANDUSEDESCRIPTIONID': Decimal(_notes=[], precision=0), 'FIRST_MTG_ORIGINATION_DATE': Date(_notes=[], precision=6,
rounds=False), 'JUNIOR_MTG_LENDER_NAME': Text(_notes=[]), 'FIRST_MTG_FIXED_RATE_LOAN_IND': String_VaryingAlphanum(_notes=[]), 'OWNER_OCCUPIED_INDICATOR': String_VaryingAlphanum(_notes=[]), 'PROPERTY_ZIP_CODE': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_MODIFIED_IND': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_INTEREST_RATE':
Decimal(_notes=[], precision=0), 'PROPERTY_ADDRESS_FULL': Text(_notes=[]), 'FIRST_MTG_ARM_CHANGE_FREQ': Text(_notes=[]), 'FIRST_MTG_LOAN_TYPE': String_VaryingAlphanum(_notes=[]), 'ESTIMATED_EQUITY': String_VaryingAlphanum(_notes=[]), 'DOWNPMTASSISTFGRANT_IND': String_VaryingAlphanum(_notes=[]), 'MAILING_ZIP_CODE': String_VaryingAlphanum(_notes=[]), 'MAILING_ADDRESS_FULL': Text(_notes=[]), 'DATAPROVIDERID': Decimal(_notes=[], precision=0), 'MAILING_STATE': String_VaryingAlphanum(_notes=[]), 'COUNTYID': Decimal(_notes=[], precision=0), 'OWNER_2_FIRST_NAME': Text(_notes=[]), 'OWNER_1_LAST_NAME': String_VaryingAlphanum(_notes=[]), 'LASTUSED': Text(_notes=[]), 'MAILING_CITY': String_VaryingAlphanum(_notes=[]), 'APN_UNFORMATTED': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_LOAN_PURPOSE': Text(_notes=[]), 'FIRST_MTG_VARIABLE_RATE_LOAN_IND': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_ARM_NEXT_RESET_DATE': Text(_notes=[]), 'FIRST_MTG_ARM_CHANGE_INTERVAL': Decimal(_notes=[],
precision=5), 'FIRST_MTG_LTV': Decimal(_notes=[], precision=5), 'FIRST_MTG_ASSIGNED_LENDER': Text(_notes=[]), 'TAX_AMOUNT': Float(_notes=[], precision=7), 'FIPS': String_VaryingAlphanum(_notes=[]), 'OWNER_1_FIRST_NAME': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_ARM_INDEX_TYPE': Text(_notes=[]), 'JUNIOR_MTG_EQUITY_LOAN_IND': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_LENDER_NAME': Text(_notes=[]), 'FIRST_MTG_AMOUNT': Decimal(_notes=[], precision=0), 'COUNTY_NAME': Text(_notes=[]), 'APN_SEQUENCE_NUMBER': String_VaryingAlphanum(_notes=[]), 'PROPERTY_STATE': String_VaryingAlphanum(_notes=[]), 'MLG_SALE_PRICE': Decimal(_notes=[], precision=0), 'JUNIOR_MTG_AMOUNT': Decimal(_notes=[], precision=0), 'FIRST_MTG_ARM_MAXIMUM_INTEREST_RATE': Decimal(_notes=[], precision=5), 'FIRST_MTG_INTEREST_RATE_TYPE': Text(_notes=[]), 'FIRST_MTG_ARM_INITIAL_RESET_DATE': Text(_notes=[]), 'FIRST_MTG_REFINANCE_LOAN_IND': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_ARM_FIRST_CHANGE_MAX_PCT':
Decimal(_notes=[], precision=5)} DEBUG Running SQL (Snowflake-TL): base.py:179 DROP TABLE IF EXISTS "test_results_2024-01-31_22_53_44" DEBUG Running SQL (Snowflake-TL): base.py:179 SKIP DEBUG Testing for duplicate keys: ('COMPARING_UNUSEDDATA_PROD_TABLE',) <> ('COMPARING_UNUSEDDATA_DEV_TABLE',) joindiff_tables.py:230 INFO Validating that the are no duplicate keys in columns: ['unique_id'] for ('COMPARING_UNUSEDDATA_PROD_TABLE',) joindiff_tables.py:243 DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_PROD_TABLE',) base.py:960 SELECT count() AS "total", count(distinct coalesce(cast("UNIQUE_ID" as string), '')) AS "total_distinct" FROM "COMPARING_UNUSEDDATA_PROD_TABLE" DEBUG Collecting stats for table #1: ('COMPARING_UNUSEDDATA_PROD_TABLE',) joindiff_tables.py:270 DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_PROD_TABLE',) base.py:960 SELECT sum("MLG_SALE_PRICE") AS "sum_mlg_sale_price", count() AS "count" FROM "COMPARING_UNUSEDDATA_PROD_TABLE" DEBUG Collecting stats for table #2: ('COMPARING_UNUSEDDATA_DEV_TABLE',) joindiff_tables.py:270 DEBUG Querying for different rows: ('COMPARING_UNUSEDDATA_PROD_TABLE',) joindiff_tables.py:208 DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_DEV_TABLE',) base.py:960 SELECT sum("MLG_SALE_PRICE") AS "sum_mlg_sale_price", sum("DATAPROVIDERID") AS "sum_dataproviderid", sum("FIRST_MTG_ARM_FIRST_CHANGE_MAX_PCT") AS "sum_first_mtg_arm_first_change_max_pct", sum("FIRST_MTG_TERM") AS "sum_first_mtg_term", sum("COUNTYID") AS "sum_countyid", sum("FIRST_MTG_ARM_CALCULATION_CHANGE") AS "sum_first_mtg_arm_calculation_change", sum("FIRST_MTG_ARM_MAXIMUM_INTEREST_RATE") AS "sum_first_mtg_arm_maximum_interest_rate", sum("LANDUSEDESCRIPTIONID") AS "sum_landusedescriptionid", sum("TAX_AMOUNT") AS "sum_tax_amount", sum("FIRST_MTG_INTEREST_RATE") AS "sum_first_mtg_interest_rate", sum("FIRST_MTG_ARM_CHANGE_INTERVAL") AS "sum_first_mtg_arm_change_interval", sum("FIRST_MTG_LTV") AS "sum_first_mtg_ltv", sum("FIRST_MTG_ARM_CHANGE_PERCENT_LIMIT") AS "sum_first_mtg_arm_change_percent_limit", sum("JUNIOR_MTG_AMOUNT") AS "sum_junior_mtg_amount", sum("FIRST_MTG_AMOUNT") AS "sum_first_mtg_amount", count() AS "count" FROM "COMPARING_UNUSEDDATA_DEV_TABLE" DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_PROD_TABLE',) base.py:960 SELECT * FROM .... INFO Validating that the are no duplicate keys in columns: ['unique_id'] for ('COMPARING_UNUSEDDATA_DEV_TABLE',) joindiff_tables.py:243 DEBUG Done collecting stats for table #1: ('COMPARING_UNUSEDDATA_PROD_TABLE',) joindiff_tables.py:306 DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_DEV_TABLE',) base.py:960 SELECT count() AS "total".... "COMPARING_UNUSEDDATA_DEV_TABLE" DEBUG Testing for null keys: ('COMPARING_UNUSEDDATA_PROD_TABLE',) <> ('COMPARING_UNUSEDDATA_DEV_TABLE',) joindiff_tables.py:252 DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_PROD_TABLE',) base.py:960 SELECT "UNIQUE_ID" FROM "COMPARING_UNUSEDDATA_PROD_TABLE" WHERE ("UNIQUE_ID" IS NULL) DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_DEV_TABLE',) base.py:960 SELECT "UNIQUE_ID" FROM "COMPARING_UNUSEDDATA_DEV_TABLE" WHERE ("UNIQUE_ID" IS NULL) 22:53:50 DEBUG Counting exclusive rows: ('COMPARING_UNUSEDDATA_PROD_TABLE',) <> ('COMPARING_UNUSEDDATA_DEV_TABLE',) joindiff_tables.py:372 DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_PROD_TABLE',) <> ('COMPARING_UNUSEDDATA_DEV_TABLE',) base.py:960 SELECT count(*) FROM .... 22:53:52 DEBUG Done collecting stats for table #2: ('COMPARING_UNUSEDDATA_DEV_TABLE',) joindiff_tables.py:306 DEBUG Counting differences per column: ('COMPARING_UNUSEDDATA_PROD_TABLE',) <> ('COMPARING_UNUSEDDATA_DEV_TABLE',) joindiff_tables.py:346 DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_PROD_TABLE',) <> ('COMPARING_UNUSEDDATA_DEV_TABLE',) base.py:960 SELECT .... 22:55:26 DEBUG Running SQL (Snowflake-TL): base.py:179 CREATE TABLE IF NOT EXISTS "test_results_2024-01-31_22_53_44" ..... DEBUG Running SQL (Snowflake-TL): base.py:179 SKIP DEBUG Running SQL (Snowflake-TL): base.py:179 INSERT INTO "test_results_2024-01-31_22_53_44" .... AS LIMITED_SELECT LIMIT 50000000 22:56:46 DEBUG Running SQL (Snowflake-TL): base.py:179 SKIP [1] 99 segmentation fault data-diff --conf /c/Users/Vitalii/WorkRepos/data_diff_configs/datadiff.toml If possible, please paste these as text, and not a screenshot.

Describe the environment

Describe which OS you're using, which data-diff version, and any other information that might be relevant to this bug.

lokofoko avatar Feb 01 '24 10:02 lokofoko

I provided a bit shortened version of logs, because it goes beyond character limit

lokofoko avatar Feb 01 '24 10:02 lokofoko

I have to add that comparing tables are around 50mil rows each, it works as expected with smaller size

lokofoko avatar Feb 01 '24 16:02 lokofoko

Hey There!

Thanks again for providing the context. I want to apologize upfront that I thought @dlawin had capacity to investigate this but he doesn't. I don't have enough context to help you further either. I'm happy to evolve this conversation with you if you're interested in trying out Datafold Cloud as we use a different algorithm and don't have this hanging timeout issue like you're experiencing.

OR we welcome community pull requests if you have capacity to dig deeper on your own.

sungchun12 avatar Feb 01 '24 20:02 sungchun12

I am now thinking that this is due to the fact, that it was trying to put all the diff data in my terminal too, I think with the stats flag it should be fixed but I have no opportunity to test it right now.

lokofoko avatar Feb 02 '24 10:02 lokofoko

It works with stats flag just fine. Hope you can add it somewhere in documentation.

lokofoko avatar Feb 21 '24 10:02 lokofoko

This issue has been marked as stale because it has been open for 60 days with no activity. If you would like the issue to remain open, please comment on the issue and it will be added to the triage queue. Otherwise, it will be closed in 7 days.

github-actions[bot] avatar Apr 22 '24 06:04 github-actions[bot]

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment and it will be reopened for triage.

github-actions[bot] avatar Apr 29 '24 06:04 github-actions[bot]