Implement Log Aggregation & Searching

Open dgarnitz opened this issue 2 years ago • 2 comments

VectorFlow has many logs spread over different containers. We need these logs aggregated into a searchable form.

One option could be to use Kibana with Elasticsearch. If the logs have metrics in them, we may want to put those into Prometheus.

dgarnitz avatar Oct 31 '23 05:10 dgarnitz

Scope this out. We need this upgrade to solve this issue.

What to Build

  • Add a database table that stores error messages. Do this by adding a new model object. Make sure that the job_id and batch_id are tracked.
  • For now this can store the entire stack trace, as long as it's not too long.
  • Alter the code in the api, worker and extractor so that whenever an error is logged, it is also saved to the database. Add a method, save_error(), to a utils file that does this, and use the util method in each file.
  • Add an endpoint that returns all the errors for a given batch_id. Return a JSON object with a field errors that is an array of stack traces.
  • Add an endpoint that returns all the errors for a given job_id. Return a JSON object with a field errors that is a dictionary with batch_id as the key and an array of stack traces as the value.
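
A rough sketch of how the pieces above could fit together, to make the data shapes concrete. It is not VectorFlow's actual code: it uses the standard-library sqlite3 module in place of the project's real model layer, and the table name errors, the column names, the MAX_STACKTRACE_CHARS cap, and the helper names errors_for_batch() and errors_for_job() are all assumptions made for illustration. Only save_error(), job_id, batch_id and the two JSON shapes come from the list above. Truncating long traces is one possible reading of "not too long", not a decided behaviour.

```python
import json
import sqlite3
import traceback
from collections import defaultdict

# Hypothetical cap: the issue only says the stack trace should not be "too long".
MAX_STACKTRACE_CHARS = 10_000


def init_db(conn):
    """Create the (assumed) errors table that tracks job_id and batch_id."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS errors (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               job_id INTEGER NOT NULL,
               batch_id INTEGER NOT NULL,
               stacktrace TEXT NOT NULL
           )"""
    )
    conn.commit()


def save_error(conn, job_id, batch_id, stacktrace=None):
    """Persist one error; call this wherever an error is logged.

    If no stack trace is passed, the exception currently being handled is used.
    """
    if stacktrace is None:
        stacktrace = traceback.format_exc()
    conn.execute(
        "INSERT INTO errors (job_id, batch_id, stacktrace) VALUES (?, ?, ?)",
        (job_id, batch_id, stacktrace[:MAX_STACKTRACE_CHARS]),
    )
    conn.commit()


def errors_for_batch(conn, batch_id):
    """Body for the per-batch endpoint: {"errors": [stack traces]}."""
    rows = conn.execute(
        "SELECT stacktrace FROM errors WHERE batch_id = ? ORDER BY id",
        (batch_id,),
    ).fetchall()
    return {"errors": [row[0] for row in rows]}


def errors_for_job(conn, job_id):
    """Body for the per-job endpoint: {"errors": {batch_id: [stack traces]}}."""
    rows = conn.execute(
        "SELECT batch_id, stacktrace FROM errors WHERE job_id = ? ORDER BY id",
        (job_id,),
    ).fetchall()
    grouped = defaultdict(list)
    for batch_id, stacktrace in rows:
        # JSON object keys are strings, so the batch_id is stringified here.
        grouped[str(batch_id)].append(stacktrace)
    return {"errors": dict(grouped)}


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    try:
        1 / 0
    except ZeroDivisionError:
        save_error(conn, job_id=1, batch_id=7)
    print(json.dumps(errors_for_job(conn, 1), indent=2))
```

In the real services, the two query helpers would sit behind the new endpoints and their return values would be serialized as the JSON response; the api, worker and extractor would each call save_error() from their existing error-logging paths.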

dgarnitz avatar Apr 04 '24 18:04 dgarnitz

No need for Kibana or Prometheus, just store the errors in the DB.

dgarnitz avatar Apr 08 '24 20:04 dgarnitz