SQLite support for online_store using a Google Storage remote path
Is your feature request related to a problem? Please describe.
I am a first-time user of Feast. I am trying to set up my feature_store.yaml file with the following settings:
...
registry: gs://my_bucket/feast/data/registry.db
provider: gcp
online_store:
    type: sqlite
    path: gs://my_bucket/feast/data/online_store.db
...
When running feast apply, I got the following error:
Traceback (most recent call last):
File "/Users/edwardcui/opt/anaconda3/envs/feast/bin/feast", line 8, in <module>
sys.exit(cli())
File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
return self.main(*args, **kwargs)
File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/click/core.py", line 1055, in main
rv = self.invoke(ctx)
File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/click/core.py", line 760, in invoke
return __callback(*args, **kwargs)
File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func
return f(get_current_context(), *args, **kwargs)
File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/cli.py", line 532, in apply_total_command
apply_total(repo_config, repo, skip_source_validation)
File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/usage.py", line 288, in wrapper
return func(*args, **kwargs)
File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/repo_operations.py", line 335, in apply_total
apply_total_with_repo_instance(
File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/repo_operations.py", line 312, in apply_total_with_repo_instance
store.apply(all_to_apply, objects_to_delete=all_to_delete, partial=False)
File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/usage.py", line 288, in wrapper
return func(*args, **kwargs)
File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/feature_store.py", line 991, in apply
self._get_provider().update_infra(
File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/infra/passthrough_provider.py", line 121, in update_infra
self.online_store.update(
File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/usage.py", line 299, in wrapper
raise exc.with_traceback(traceback)
File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/usage.py", line 288, in wrapper
return func(*args, **kwargs)
File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/infra/online_stores/sqlite.py", line 200, in update
conn = self._get_conn(config)
File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/infra/online_stores/sqlite.py", line 77, in _get_conn
self._conn = _initialize_conn(db_path)
File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/site-packages/feast/infra/online_stores/sqlite.py", line 245, in _initialize_conn
Path(db_path).parent.mkdir(exist_ok=True)
File "/Users/edwardcui/opt/anaconda3/envs/feast/lib/python3.9/pathlib.py", line 1323, in mkdir
self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/Users/edwardcui/Documents/Scripts/mlads_praime_common/feast/feature_repo/gs:/my_bucket/feast/data'
This error seems to be related to this line of the code: https://github.com/feast-dev/feast/blob/6aa54aa76f1db52a710aad0caceb3531a88fe802/sdk/python/feast/infra/online_stores/sqlite.py#L68
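It appears pathlib does not recognize the Google Storage path as an absolute path. The path is therefore treated as relative and resolved against the feature repo directory, and gs:// is collapsed to gs:/, as the path in the error message shows.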
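The behavior can be reproduced outside Feast with a minimal sketch. The helper resolve_like_local_path and the /repo directory are hypothetical and only imitate the "join with the repo directory unless absolute" logic described above; they are not Feast's actual code.

```python
import tempfile
from pathlib import Path, PurePosixPath

remote = "gs://my_bucket/feast/data/online_store.db"


def resolve_like_local_path(repo_dir: str, configured: str) -> PurePosixPath:
    """Hypothetical helper: treat `configured` as a filesystem path and
    join it with the repo directory when it is not absolute."""
    path = PurePosixPath(configured)
    return path if path.is_absolute() else PurePosixPath(repo_dir) / path


# pathlib has no notion of URL schemes: "gs:" becomes an ordinary path
# component and the double slash is collapsed to a single one.
is_absolute = PurePosixPath(remote).is_absolute()
resolved = resolve_like_local_path("/repo", remote)
print(is_absolute)
print(resolved)

# mkdir(exist_ok=True) without parents=True fails because the intermediate
# "gs:/my_bucket/feast" directories do not exist locally.
with tempfile.TemporaryDirectory() as repo_dir:
    db_path = Path(repo_dir) / remote
    try:
        db_path.parent.mkdir(exist_ok=True)
        mkdir_error = None
    except FileNotFoundError as exc:
        mkdir_error = exc
print(type(mkdir_error).__name__)
```

The resolved path has the same gs:/ shape as the one in the FileNotFoundError above.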
I tried to "hack" the source code by forcing db_path to the Google Storage path, but the subsequent step still tried to create the file via pathlib and could not create the sqlite database on Google Storage. By contrast, specifying the registry as a remote GS path worked, and the registry database was created successfully.
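Until remote paths are supported, a fail-fast check would at least give a clearer error than the FileNotFoundError above. The following is only a sketch: check_local_sqlite_path and REMOTE_SCHEMES are hypothetical names, not part of Feast, and the list of schemes is an assumption.

```python
from urllib.parse import urlparse

# Assumed set of remote schemes to reject; extend as needed.
REMOTE_SCHEMES = {"gs", "s3"}


def check_local_sqlite_path(path: str) -> str:
    """Hypothetical validator: return `path` unchanged if it looks local,
    raise ValueError if it uses a known remote storage scheme."""
    scheme = urlparse(path).scheme
    if scheme in REMOTE_SCHEMES:
        raise ValueError(
            f"sqlite online_store path must be a local file, got '{scheme}://' path: {path}"
        )
    return path


local_ok = check_local_sqlite_path("data/online_store.db")

try:
    check_local_sqlite_path("gs://my_bucket/feast/data/online_store.db")
    remote_error = None
except ValueError as exc:
    remote_error = exc
print(remote_error)
```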
Describe the solution you'd like
I would like the Feast online_store to support a sqlite database on GCP or another remote location, not only a local one. A remote registry.db is already supported. This would allow faster initial setup and debugging before moving to more scalable solutions like BigTable or Redis. A simple shared remote location for the online_store would also let other contributors on our team explore Feast quickly.
Describe alternatives you've considered
Additional context
sqlite itself doesn't support storing the database file in remote storage. I guess you could use a workaround, like mounting the remote location somehow or copying the file to local disk before querying, but that's too much hassle and not worth the effort.
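For illustration, the "copy the file to local first" workaround could look roughly like the sketch below. open_local_copy and the fetch hook are hypothetical; a real fetch would download from the bucket with a storage client, while this demo replaces it with a local file copy so it runs anywhere. Note that writes go to the local copy only and are never pushed back, which is part of why the approach is a hassle.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path
from typing import Callable
from urllib.parse import urlparse


def open_local_copy(
    db_uri: str, fetch: Callable[[str, Path], None], cache_dir: Path
) -> sqlite3.Connection:
    """Hypothetical helper: copy a remote database file into `cache_dir`
    via `fetch(uri, dest)`, then open the local copy with sqlite3."""
    local_path = cache_dir / Path(urlparse(db_uri).path).name
    fetch(db_uri, local_path)
    return sqlite3.connect(local_path)


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the remote object: a small database on local disk.
    fake_remote = Path(tmp) / "remote.db"
    seed = sqlite3.connect(fake_remote)
    seed.execute("CREATE TABLE features (entity_key TEXT, value REAL)")
    seed.execute("INSERT INTO features VALUES ('driver_1', 0.5)")
    seed.commit()
    seed.close()

    def fake_fetch(uri: str, dest: Path) -> None:
        # A real implementation would download `uri` from remote storage.
        shutil.copyfile(fake_remote, dest)

    cache = Path(tmp) / "cache"
    cache.mkdir()
    conn = open_local_copy(
        "gs://my_bucket/feast/data/online_store.db", fake_fetch, cache
    )
    rows = conn.execute("SELECT entity_key, value FROM features").fetchall()
    conn.close()

print(rows)
```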