
Develop preprocessing pipeline

Open chaoran-chen opened this issue 2 years ago • 3 comments

For:

  • [ ] SC2 open - it's almost finished but a few minor adjustments are needed
  • [x] GISAID

chaoran-chen avatar Sep 13 '23 07:09 chaoran-chen

  • The pipeline downloads the data from the servers.
  • The output is an NDJSON file containing the data.
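
For readers unfamiliar with the format: NDJSON stores one JSON object per line. The following is only a minimal sketch of writing and reading such a file, not the pipeline's actual code. The function names `write_ndjson` and `read_ndjson` are hypothetical, and the record fields are whatever the upstream metadata provides.

```python
import io
import json
from typing import Iterable, Iterator, TextIO


def write_ndjson(records: Iterable[dict], stream: TextIO) -> int:
    """Write each record as one JSON object per line; return the record count."""
    count = 0
    for record in records:
        # json.dumps never emits raw newlines, so one record stays on one line.
        stream.write(json.dumps(record, ensure_ascii=False) + "\n")
        count += 1
    return count


def read_ndjson(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed record per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


if __name__ == "__main__":
    # Hypothetical sample records, for illustration only.
    buffer = io.StringIO()
    write_ndjson([{"strain": "sample-1"}, {"strain": "sample-2"}], buffer)
    buffer.seek(0)
    for record in read_ndjson(buffer):
        print(record)
```

Because each line is an independent JSON document, the file can be streamed record by record without loading the whole dataset into memory.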

JonasKellerer avatar Sep 13 '23 07:09 JonasKellerer

Bug detected by @Taepper: there is one sequence in the open dataset where strain is null. I need to fix it.
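
One possible workaround on the preprocessing side, sketched under assumptions: skip NDJSON records whose `strain` field is null or missing, and count how many were dropped. The helper name `drop_missing_strain` is hypothetical and this is not necessarily how the fix was implemented.

```python
import json
from typing import Iterable, List, Tuple


def drop_missing_strain(lines: Iterable[str]) -> Tuple[List[dict], int]:
    """Parse NDJSON lines, keeping only records with a non-null `strain`.

    Returns the kept records and the number of records that were dropped.
    """
    kept: List[dict] = []
    dropped = 0
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        record = json.loads(line)
        # .get() returns None both for an explicit null and for a missing key.
        if record.get("strain") is None:
            dropped += 1
            continue
        kept.append(record)
    return kept, dropped


if __name__ == "__main__":
    # Hypothetical sample input, for illustration only.
    sample = ['{"strain": "sample-1"}', '{"strain": null}']
    records, skipped = drop_missing_strain(sample)
    print(f"kept {len(records)} record(s), dropped {skipped}")
```

Reporting the dropped count rather than failing silently makes it easier to notice if more than the one known record is affected.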

chaoran-chen avatar Sep 18 '23 14:09 chaoran-chen

Sorry, this is a stupid bug on our end; we should just exclude that line from the metadata in ncov ingest. But yeah, this won't happen quickly.

corneliusroemer avatar Sep 19 '23 12:09 corneliusroemer

We have ingest pipelines for both datasets.

fengelniederhammer avatar May 29 '24 09:05 fengelniederhammer