Sqlite support for online_store using Google Storage remote path

Open EdwardCuiPeacock opened this issue 2 years ago • 2 comments

Is your feature request related to a problem? Please describe.

I am a first-time user of Feast. I am trying to set up my feature_store.yaml file with the following settings:

...
registry: gs://my_bucket/feast/data/registry.db
provider: gcp
online_store:
  type: sqlite
  path: gs://my_bucket/feast/data/online_store.db
...

When running feast apply, I got the following error:

Traceback (most recent call last):
  File "/Users/edwardcui/opt/anaconda3/envs/feast/bin/feast", line 8, in <module>
    sys.exit(cli())
  File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/cli.py", line 532, in apply_total_command
    apply_total(repo_config, repo, skip_source_validation)
  File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/usage.py", line 288, in wrapper
    return func(*args, **kwargs)
  File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/repo_operations.py", line 335, in apply_total
    apply_total_with_repo_instance(
  File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/repo_operations.py", line 312, in apply_total_with_repo_instance
    store.apply(all_to_apply, objects_to_delete=all_to_delete, partial=False)
  File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/usage.py", line 288, in wrapper
    return func(*args, **kwargs)
  File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/feature_store.py", line 991, in apply
    self._get_provider().update_infra(
  File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/infra/passthrough_provider.py", line 121, in update_infra
    self.online_store.update(
  File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/usage.py", line 299, in wrapper
    raise exc.with_traceback(traceback)
  File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/usage.py", line 288, in wrapper
    return func(*args, **kwargs)
  File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/infra/online_stores/sqlite.py", line 200, in update
    conn = self._get_conn(config)
  File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/infra/online_stores/sqlite.py", line 77, in _get_conn
    self._conn = _initialize_conn(db_path)
  File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/infra/online_stores/sqlite.py", line 245, in _initialize_conn
    Path(db_path).parent.mkdir(exist_ok=True)
  File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/pathlib.py", line 1323, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/Users/edwardcui/Documents/Scripts/mlads_praime_common/feast/feature_repo/gs:/my_bucket/feast/data'

This error seems to be related to this line of the code: https://github.com/feast-dev/feast/blob/6aa54aa76f1db52a710aad0caceb3531a88fe802/sdk/python/feast/infra/online_stores/sqlite.py#L68

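It appears pathlib does not recognize the Google Storage path I specified as an absolute path, so it is treated as relative to my feature repo directory (note the collapsed gs:/ in the error message).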
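
To illustrate, here is a minimal sketch of how pathlib handles a gs:// string. It assumes that a non-absolute path gets joined onto the repo directory, which is what the error message suggests; the repo directory below is a hypothetical placeholder, and PurePosixPath is used so the result does not depend on the operating system.

```python
from pathlib import PurePosixPath

remote = "gs://my_bucket/feast/data/online_store.db"
repo_dir = PurePosixPath("/repo/feature_repo")  # hypothetical repo location

p = PurePosixPath(remote)
# pathlib has no notion of URL schemes: "gs:" is just the first path
# component, and the empty component produced by "//" is dropped.
print(p)                # gs:/my_bucket/feast/data/online_store.db
print(p.is_absolute())  # False

# Assumption: a non-absolute path is resolved against the repo directory.
resolved = repo_dir / p
print(resolved.parent)  # /repo/feature_repo/gs:/my_bucket/feast/data
```

The last line has the same shape as the directory named in the FileNotFoundError above.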

I tried to "hack" the source code by forcing db_path to be the Google Storage path, but the subsequent step still tried to create the file via pathlib and could not create the sqlite database on Google Storage. However, specifying the registry as a remote GS path successfully created the sqlite database.

Describe the solution you'd like

I would like the Feast online_store to support a sqlite database on GCP or another remote location, rather than only a local path. A remote sqlite registry.db is already supported. This would allow faster initial setup and debugging before moving to more scalable solutions like BigTable or Redis. A simple shared remote location for the online_store would also let other contributors on our team explore Feast quickly.

Describe alternatives you've considered

Additional context

EdwardCuiPeacock, Apr 26 '23 14:04

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot], Sep 17 '23 14:09

sqlite itself doesn't support storing the database file in remote storage. I guess you could use workarounds like mounting the bucket somehow or copying the file to local disk before querying, but that's too much hassle and not worth the effort.

tokoko, Apr 10 '24 09:04
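
For reference, the "copy the file to local first" workaround mentioned in the last comment could look roughly like the sketch below. This is not something Feast provides: local_sqlite_copy and the download/upload callables are hypothetical names, and a local file stands in for the bucket so the example runs anywhere. With the google-cloud-storage client, the callables would presumably wrap Blob.download_to_filename and Blob.upload_from_filename.

```python
import shutil
import sqlite3
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Callable, Iterator


@contextmanager
def local_sqlite_copy(
    download: Callable[[Path], None],
    upload: Callable[[Path], None],
) -> Iterator[sqlite3.Connection]:
    """Hypothetical helper: fetch the db file, open it locally, push it back."""
    with tempfile.TemporaryDirectory() as tmp:
        local_path = Path(tmp) / "online_store.db"
        download(local_path)  # may leave no file on first use
        conn = sqlite3.connect(local_path)
        try:
            yield conn
            conn.commit()
        finally:
            conn.close()
        # Reached only if the block above finished without an exception.
        upload(local_path)


# Stand-in for the remote object: a file in another temporary directory.
remote = Path(tempfile.mkdtemp()) / "remote_online_store.db"


def download(dst: Path) -> None:
    if remote.exists():
        shutil.copy(remote, dst)


def upload(src: Path) -> None:
    shutil.copy(src, remote)


with local_sqlite_copy(download, upload) as conn:
    conn.execute("CREATE TABLE IF NOT EXISTS t (k TEXT PRIMARY KEY, v TEXT)")
    conn.execute("INSERT OR REPLACE INTO t VALUES ('a', '1')")
```

Nothing here coordinates concurrent users: two people writing at the same time would overwrite each other's upload, which is part of the hassle described above.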