redshift_plugin

s3_to_redshift: fix JSON schema iteration

Open szeevs opened this issue 7 years ago • 0 comments

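The keys and values of the JSON schema are accessed incorrectly. The schema should be iterated as a mapping of column name to column type: loop over its keys and read each type with schema[item]. Change create_if_not_exists to: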

def create_if_not_exists(self, schema, pg_hook, temp=False):
    output = ''
    # schema is treated as a mapping of column name -> column type,
    # so iterating it yields the column names
    for item in schema:
        k = '"{}"'.format(item)  # double-quote the column name
        field = ' '.join([k, schema[item]])
        if isinstance(self.sortkey, str) and self.sortkey == item:
            field += ' sortkey'
        output += field
        output += ', '
        ....
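For illustration only, here is a minimal standalone sketch of the same iteration, assuming the schema is a plain dict of column name to column type. The helper name build_column_defs and the sample schema are hypothetical and not part of the plugin.

```python
def build_column_defs(schema, sortkey=None):
    """Hypothetical helper: build a column list from a {name: type} mapping."""
    fields = []
    for name in schema:  # iterating a dict yields its keys
        field = '"{}" {}'.format(name, schema[name])
        if isinstance(sortkey, str) and sortkey == name:
            field += ' sortkey'
        fields.append(field)
    # Joining avoids a trailing separator after the last column
    return ', '.join(fields)


example_schema = {'id': 'int', 'name': 'varchar'}  # made-up sample data
column_defs = build_column_defs(example_schema, sortkey='id')
```

This sketch joins the fields with ', ' instead of appending a separator after every field, so there is no trailing comma to trim. Whether the plugin trims it in the elided part of the method is not shown here.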

The same change is needed in reconcile_schemas:

def reconcile_schemas(self, schema, pg_hook):
    # Look up the columns that already exist in the target table
    pg_query = \
        """
        SELECT column_name, udt_name
        FROM information_schema.columns
        WHERE table_schema = '{0}' AND table_name = '{1}';
        """.format(self.redshift_schema, self.table)
    pg_schema = dict(pg_hook.get_records(pg_query))
    # Iterating the schema yields its keys (the incoming column names)
    incoming_keys = [column for column in schema]
    diff = list(set(incoming_keys) - set(pg_schema.keys()))
    print('diff {}'.format(diff))
    ....
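A small standalone sketch of the key diff above, again assuming the schema is a dict of column name to column type. The helper name missing_columns and both sample dicts are hypothetical; the existing dict only stands in for the result of dict(pg_hook.get_records(pg_query)).

```python
def missing_columns(schema, existing_columns):
    """Hypothetical helper: incoming column names absent from the existing table."""
    incoming_keys = list(schema)  # keys of the {name: type} mapping
    # Sorted only to make the result deterministic; sets are unordered
    return sorted(set(incoming_keys) - set(existing_columns))


incoming = {'id': 'int', 'name': 'varchar', 'email': 'varchar'}  # made-up sample data
existing = {'id': 'int4', 'name': 'varchar'}  # stands in for the queried table columns
diff = missing_columns(incoming, existing)
```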

szeevs, Mar 22 '18 08:03