
vdk-oracle: infer case-insensitive keys from columns when ingesting payloads

Open DeltaMichael opened this issue 1 year ago • 1 comments

Follow-up from https://github.com/vmware/versatile-data-kit/pull/3194

Overview

The implementation in https://github.com/vmware/versatile-data-kit/pull/3194 fetches all the column names and looks for them in the payload keys. The problem is that Oracle treats unquoted identifiers as case-insensitive and returns their column names in uppercase. We end up having to call lower() on the returned column names, which is not ideal.

For example, we could have a payload like

    payload = {
        "iD": "5",
        "Str_Data": "string",
        "INT_data": 12,
    }

But the columns we'd get when querying would be

ID,
STR_DATA,
INT_DATA

So we have to find a way to match between the two. Even worse, we could have something like this

    payload = {
        "iD": "5",
        "int_DATA": 11,
        "INT_data": 12,
    }

These should technically be two different columns, yet "int_DATA" and "INT_data" both match INT_DATA once case is ignored.
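
To illustrate the ambiguity, here is a minimal Python sketch that groups payload keys by their uppercase form, which is how Oracle reports unquoted column names. The helper names are hypothetical and are not part of vdk-oracle.

```python
from collections import defaultdict


def group_keys_by_uppercase(payload):
    """Group payload keys by the uppercase name Oracle would report."""
    groups = defaultdict(list)
    for key in payload:
        groups[key.upper()].append(key)
    return dict(groups)


def find_collisions(payload):
    """Return only the uppercase names that more than one payload key maps to."""
    return {
        column: keys
        for column, keys in group_keys_by_uppercase(payload).items()
        if len(keys) > 1
    }


payload = {
    "iD": "5",
    "int_DATA": 11,
    "INT_data": 12,
}
print(find_collisions(payload))
# prints {'INT_DATA': ['int_DATA', 'INT_data']}
```

Any non-empty result means that a case-insensitive match cannot decide which payload value belongs in the column.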

One possible solution is to make all column names case-sensitive by putting them in double quotes. We already do this to escape special characters, and we can extend it to all column names. This should be covered by functional tests.
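
A rough sketch of the quoting approach, assuming payload keys are used verbatim as column names. The function names are hypothetical and this is not the vdk-oracle implementation. The table name is left unquoted here, and the positional bind placeholders (:1, :2, ...) are an assumption about how the statement would be executed.

```python
def quote_identifier(name):
    """Wrap a column name in double quotes so Oracle treats it as case-sensitive."""
    # Oracle identifiers cannot contain double quotes or the null character.
    if '"' in name or "\0" in name:
        raise ValueError(f"Invalid character in identifier: {name!r}")
    return f'"{name}"'


def build_insert(table, payload):
    """Build an INSERT statement with every column name quoted."""
    columns = ", ".join(quote_identifier(key) for key in payload)
    binds = ", ".join(f":{i}" for i in range(1, len(payload) + 1))
    return f"INSERT INTO {table} ({columns}) VALUES ({binds})"


payload = {
    "iD": "5",
    "int_DATA": 11,
    "INT_data": 12,
}
print(build_insert("test_table", payload))
# prints INSERT INTO test_table ("iD", "int_DATA", "INT_data") VALUES (:1, :2, :3)
```

With quoting, "int_DATA" and "INT_data" stay distinct, but the table must have been created with the same quoted names for the statement to succeed.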

Acceptance criteria

  1. Support the above scenario
  2. Functional tests added for above scenario

DeltaMichael avatar Mar 18 '24 08:03 DeltaMichael

Quick fix: Throw an error when a payload contains a non-lowercase key.
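
A minimal sketch of that check, assuming it runs before the payload is matched against the column names. The function name and the choice of ValueError are hypothetical; the plugin may use a different exception type.

```python
def validate_lowercase_keys(payload):
    """Raise if any payload key contains uppercase characters."""
    offending = [key for key in payload if key != key.lower()]
    if offending:
        raise ValueError(
            f"Payload keys must be lowercase, got: {', '.join(offending)}"
        )


validate_lowercase_keys({"id": "5", "str_data": "string"})  # passes silently

try:
    validate_lowercase_keys({"iD": "5", "int_data": 12})
except ValueError as error:
    print(error)
```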

DeltaMichael avatar Mar 19 '24 10:03 DeltaMichael