dbt_shopify_source icon indicating copy to clipboard operation
dbt_shopify_source copied to clipboard

[Feature] Adding new schema dynamically

Open advolut-team opened this issue 1 year ago • 4 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

Describe the issue

Hi Fivetran team,

First off, thank you for the amazing work in maintaining this repo - it has been very useful :)

Problem:

I'd like to run a DBT transformation for more than 50 Shopify shops using the same dbt_shopify repo, which uses this repo as a "source" package. I understand that this feature is already available, but every time a new Shopify store signs up via the Fivetran Connect Card flow, I'll need to append the shopify_union_schemas variable, which is already a long list.

So, I'd like to read the shopify_schema dynamically whenever a new Shopify store signs up to my website.

Attempted Workaround:

  1. For each Shopify store/connector, I have created a DBT project using dbt_shopify repo with an additional environment variable DBT_SCHEMA_NAME=shopify_store_name
  2. In my dbt_project.yml, I modified shopify_schema to read the environment variable DBT_SCHEMA_NAME, like below:
vars: 
    shopify_database: dev
    shopify_schema: "{{ env_var('DBT_SCHEMA_NAME') }}"
  1. I found out that dbt only reads project variables literally, so jinja macros don't work (I'd love to be proven wrong on this).

Request:

Ideal: Modify dbt_packages/shopify_source/models/src_shopify.yml in your code. I'm happy to send a PR.

sources:
  - name: shopify # This source will only be used if you are using a single Shopify source connector. If multiple sources are being unioned, their tables will be directly referenced via adatper.get_relation
    schema: "{{ env_var('DBT_SCHEMA_NAME') or var('shopify_schema', 'shopify')  }}"

I'm also keen for any recommendations if you think my approach can be improved.

Relevant error log or model output

08:11:09  Running with dbt=1.7.17
08:11:09  Registered adapter: redshift=1.7.7
08:11:09  Unable to do partial parsing because a project config has changed
08:11:13  Found 92 models, 98 tests, 34 sources, 0 exposures, 0 metrics, 869 macros, 0 groups, 0 semantic models
08:11:13  
08:11:16  Concurrency: 1 threads (target='dev')
08:11:16  
08:11:16  1 of 81 START sql table model env_var2_shopify.shopify__calendar ............... [RUN]
08:11:18  1 of 81 OK created sql table model env_var2_shopify.shopify__calendar .......... [SUCCESS in 2.13s]
08:11:18  2 of 81 START sql view model env_var2_stg_shopify.stg_shopify__abandoned_checkout_discount_code_tmp  [RUN]
08:11:18  

Please be aware: The ABANDONED_CHECKOUT_DISCOUNT_CODE table was not found in your SHOPIFY schema(s). The Fivetran dbt package will create a completely empty ABANDONED_CHECKOUT_DISCOUNT_CODE staging model as to not break downstream transformations. To turn off these warnings, set the `fivetran__remove_empty_table_warnings` variable to TRUE (see https://github.com/fivetran/dbt_fivetran_utils/tree/releases/v0.4.latest#union_data-source for details).

Expected behavior

Expected end result in Redshift: Screenshot 2024-07-29 at 6 05 05 PM

Assuming the Shopify store is env_var2, this repo should generate env_var2_stg_shopify, and the dbt_shopify repo will generate env_var2_shopify. The env_var2_final schema is the "final" serving layer that has the derived tables from both repos. There will be 4 schemas in total.

dbt Project configurations


# Name your project! Project names should contain only lowercase characters
# and underscores. A good package name should reflect your organization's
# name or the intended use of these models
name: 'dbt_tutorial'
version: '1.0.0'
config-version: 2

# This setting configures which "profile" dbt uses for this project.
profile: 'dbt_tutorial'

# These configurations specify where dbt should look for different types of files.
# The `model-paths` config, for example, states that models in this project can be
# found in the "models/" directory. You probably won't need to change these!
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

clean-targets:         # directories to be removed by `dbt clean`
  - "target"
  - "dbt_packages"


# Configuring models
# Full documentation: https://docs.getdbt.com/docs/configuring-models

vars: # must add
    shopify_database: dev
    shopify_schema: dummy_value # input schema, overrides schema field in profiles.yml

# In this example config, we tell dbt to build all models in the example/
# directory as views. These settings can be overridden in the individual model
# files using the `{{ config(...) }}` macro.
models:
  dbt_tutorial:
    # Config indicated by + and applies to all files under models/example/
    example:
      +schema: final
      +materialized: view

Package versions

packages:
- package: fivetran/shopify
  version: [">=0.12.0", "<0.13.0"] 

What database are you using dbt with?

redshift

dbt Version

Core:
  - installed: 1.7.17
  - latest:    1.8.4  - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - redshift: 1.7.7  - Update available!
  - postgres: 1.7.17 - Update available!

Additional Context

No response

Are you willing to open a PR to help address this issue?

  • [ ] Yes.
  • [X] Yes, but I will need assistance and will schedule time during our office hours for guidance
  • [ ] No.

advolut-team avatar Jul 29 '24 08:07 advolut-team