api icon indicating copy to clipboard operation
api copied to clipboard

Query crawler

Open kompotkot opened this issue 3 years ago • 3 comments

Changes

Fetch query from journal, validate query, execute query and push to bucket if required.

How to test these changes?

Tested locally

Related issues

kompotkot avatar Oct 25 '22 09:10 kompotkot

@bugout-dev check

  • [ ] Add env MOONSTREAM_S3_DATA_BUCKET
  • [ ] Add env MOONSTREAM_S3_DATA_BUCKET_PREFIX

kompotkot avatar Oct 25 '22 10:10 kompotkot

We need to cherry pick from this PR later.

kompotkot avatar Oct 25 '22 11:10 kompotkot

Although we will not use queries_crawler to crawl public data, we will use it as a replacement for the existing Query API data producer (which is currently in mooncrawl/stats_worker/queries.py.

The queries_crawler should be a CLI which exposes the same functionality but in a more modular way (and should be invoked from systemd on prod).

We will revisit this after our current batch of urgent work.

zomglings avatar Oct 25 '22 11:10 zomglings