singer-python icon indicating copy to clipboard operation
singer-python copied to clipboard

Support for inline configuration strings

Open dcereijodo opened this issue 6 years ago • 6 comments

Sometimes when productionalizing taps and targets executions it gets inconvenient having to rely on an actual configuration file stored in the system, and instead it would be much easier to be able to pass such config as a JSON string in the command line parameters. So something like this

tap-mysql --config '{"host": "mysql-host.com", "port": "3306", "user": "$USR_PROD", "password": "$PWD_PROD"}'

One simple hack for supporting that would be changing this line to something like this

def load_json(path):
  try:
      inline_config = json.loads(myjson)
  except ValueError as e:
    with open(path) as fil:
      return json.load(fil)
  return inline_config

dcereijodo avatar Sep 21 '19 15:09 dcereijodo

Couldn't you do:

echo '{"host": "mysql-host.com", \                                                                                                                                                                                   
       "port": "3306", \                                                                                                                                                                                             
       "user": "$USR_PROD", \                                                                                                                                                                                        
       "password": "$PWD_PROD"}'                                                                                                                                                                                     
     > /tmp/my_config; \                                                                                                                                                                                             
tap-mysql --config /tmp/my_config

luandy64 avatar Sep 23 '19 13:09 luandy64

I totally can (at least in my use case). My suggestion was more about the convenience of it. When I have to do that inside an Airflow BashOperator for example, it starts looking odd

BashOperator(
  task_id='syn_animals',
  bash_command="""
    echo '{"host": "mysql-host.com", \                                                                                                                                                                                   
       "port": "3306", \                                                                                                                                                                                             
       "user": "$USR_PROD", \                                                                                                                                                                                        
       "password": "$PWD_PROD"}'                                                                                                                                                                                     
     > /tmp/my_config;                                                                                                                                          
   tap-mysql --config /tmp/my_config
  """
)

Also I might not have a good place (or a safe place) to put that temporal config file.

In this custom tap I experimented with an optional parameter --overrides which accepts a JSON string that gets merged into a default (or empty otherwise) configuration file. That would also work here.

BashOperator(
  task_id='syn_animals',
  bash_command="""
    tap-mysql --overrides '{"host": "mysql-host.com", "port": "3306", "user": "$USR_PROD", "password": "$PWD_PROD"}'
  """
)

But again it's just a matter of convenience :)

dcereijodo avatar Sep 23 '19 15:09 dcereijodo

Understood, feel free to open a PR for the feature 👍

On another note, how do you handle state files? Or do you not use those?

luandy64 avatar Sep 23 '19 20:09 luandy64

Great. Will do that :)

No, I do not need state files. But if I were to implement that, what's wrong with using the same approach? So

BashOperator(
  task_id='syn_animals',
  bash_command="""
    tap-mysql \
      --config ~/.singer.io/tap_mysql_defaults.json \ # staging and prod configuration is identical, only the host changes
      --overrides '{"host": "$MYSQL_HOST"}' \
      --state '{"type": "STATE",  "value": {"bookmarks": {"db-animals": {"version": 1509135204169, "replication_key_value": "{{ ts }}", "replication_key": "updated_at"}}, "currently_syncing": null}}' \
    | target-redshift ...
  """
)

dcereijodo avatar Sep 24 '19 06:09 dcereijodo

Nothing wrong with it 😄 It occurred to me that either you are not using state at all or we would have to also implement this for state

luandy64 avatar Sep 24 '19 14:09 luandy64

Right. I sent a PR that implements this behavior in the load_json function from the utils, so it should work for config, state and properties arguments.

dcereijodo avatar Sep 24 '19 15:09 dcereijodo