vectorflow
Implement Log Aggregation & Searching
VectorFlow has many logs spread across different containers. We need these logs to be aggregated into a searchable form.
One option could be to use Kibana with Elasticsearch. If the logs have metrics in them, we may want to put those into Prometheus.
Scope this out. We need this upgrade to solve this issue.
What to Build
- Add an additional database table that stores error messages. Do this by adding a new model object. Make sure that the `job_id` and `batch_id` are tracked.
- For now this can store the entire stack trace, as long as it's not too long.
- Alter the code in the `api`, `worker` and `extractor` so that whenever an error is logged, it is also saved to the database. Add a method to a utils file, `save_error()`, that does this, and use the util method in each file.
- Add an endpoint that returns all the errors for a given `batch_id`. Return a JSON object with a field `errors` that is an array of stack traces.
- Add an endpoint that returns all the errors for a given `job_id`. Return a JSON object with a field `errors` that is a dictionary featuring `batch_id` as the key and an array of stack traces as the value.
No need for Kibana or Prometheus; just store the errors in the DB.
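
The items under "What to Build" could be sketched roughly as follows. This is only an illustration using Python's standard-library `sqlite3`, not VectorFlow's actual model layer or web framework. The table name `errors`, the integer column types, the `MAX_TRACE_CHARS` cap, and the helpers `init_db()`, `errors_for_batch()` and `errors_for_job()` are all assumptions made for the example; only `save_error()`, `job_id`, `batch_id` and the `errors` response field come from the description above.

```python
import json
import sqlite3
import traceback
from collections import defaultdict

# Assumed cap; the issue only says the stack trace should not be "too long".
MAX_TRACE_CHARS = 10_000


def init_db(conn):
    """Create the (hypothetical) errors table if it does not exist yet."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS errors (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               job_id INTEGER NOT NULL,
               batch_id INTEGER NOT NULL,
               stacktrace TEXT NOT NULL
           )"""
    )


def save_error(conn, job_id, batch_id, stacktrace):
    """Store one stack trace, truncated to the assumed maximum length."""
    conn.execute(
        "INSERT INTO errors (job_id, batch_id, stacktrace) VALUES (?, ?, ?)",
        (job_id, batch_id, stacktrace[:MAX_TRACE_CHARS]),
    )
    conn.commit()


def errors_for_batch(conn, batch_id):
    """Response body for the per-batch endpoint: {"errors": [trace, ...]}."""
    rows = conn.execute(
        "SELECT stacktrace FROM errors WHERE batch_id = ? ORDER BY id",
        (batch_id,),
    ).fetchall()
    return {"errors": [row[0] for row in rows]}


def errors_for_job(conn, job_id):
    """Response body for the per-job endpoint: {"errors": {batch_id: [trace, ...]}}."""
    rows = conn.execute(
        "SELECT batch_id, stacktrace FROM errors WHERE job_id = ? ORDER BY id",
        (job_id,),
    ).fetchall()
    grouped = defaultdict(list)
    for batch_id, trace in rows:
        # JSON object keys are strings, so the batch_id is stringified here.
        grouped[str(batch_id)].append(trace)
    return {"errors": dict(grouped)}


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    try:
        1 / 0  # stand-in for a failure inside the api, worker or extractor
    except ZeroDivisionError:
        # Wherever the error is logged, also persist it.
        save_error(conn, job_id=1, batch_id=10, stacktrace=traceback.format_exc())
    print(json.dumps(errors_for_job(conn, 1), indent=2))
```

In the real code, the two query helpers would sit behind the new endpoints, and `save_error()` would live in the shared utils file so that the `api`, `worker` and `extractor` all call the same function.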