Serve Remote urlTemplates for Organism

Open ctcncgr opened this issue 6 years ago • 10 comments

I would like to serve block-compressed FASTA from remote URLs.

A pretty minimal config like:

{
    "refSeqs": "https://legumeinfo.org/data/public/Medicago_truncatula/jemalong_A17.gnm5.MVZ2/medtr.jemalong_A17.gnm5.MVZ2.genome_main.fna.gz.fai",
    "tracks": [
        {
            "storeClass": "JBrowse/Store/SeqFeature/IndexedFasta",
            "type": "Sequence",
            "label": "test remote fasta",
            "urlTemplate": "https://legumeinfo.org/data/public/Medicago_truncatula/jemalong_A17.gnm5.MVZ2/medtr.jemalong_A17.gnm5.MVZ2.genome_main.fna.gz"
        }
    ]
}

This seems to work OK in JBrowse (with some minor oddities decoding the actual sequence bases). However, when I point Apollo at the trackList.json file, I get the following error in the log:

at java.lang.Thread.run(Thread.java:748)
2019-08-14 19:47:51,764 [http-nio-8080-exec-3] ERROR apollo.OrganismController - problem saving organism: java.lang.NullPointerException: Cannot get property 'storeClass' on null object

This is public data and a gzi file exists for the block compressed fasta.
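Since the .gzi index only makes sense for a block-compressed (BGZF) file, not a plain gzip one, a quick local check of the file header can rule that out as a cause. The sketch below is an illustration based on the BGZF header layout (gzip magic, the FEXTRA flag, and a "BC" extra subfield of length 2); the function name `is_bgzf` is hypothetical and not part of JBrowse or Apollo.

```python
import struct


def is_bgzf(header: bytes) -> bool:
    """Return True if the given bytes start with a BGZF block header.

    Pass at least the first 18 bytes of the file; more is fine.
    """
    if len(header) < 18:
        return False
    # gzip magic (1f 8b) and compression method 8 (deflate)
    if header[0:3] != b"\x1f\x8b\x08":
        return False
    # FLG bit 2 (FEXTRA) must be set, because BGZF stores its block size there
    if not header[3] & 0x04:
        return False
    # XLEN is a little-endian 16-bit length of the extra field at offset 10
    xlen = struct.unpack("<H", header[10:12])[0]
    extra = header[12:12 + xlen]
    # Walk the extra subfields looking for identifier "BC" with a 2-byte payload
    pos = 0
    while pos + 4 <= len(extra):
        ident = extra[pos:pos + 2]
        slen = struct.unpack("<H", extra[pos + 2:pos + 4])[0]
        if ident == b"BC" and slen == 2:
            return True
        pos += 4 + slen
    return False
```

To use it on a local copy of the FASTA, read the first bytes of the file in binary mode and pass them in, for example `is_bgzf(open(path, "rb").read(64))` where `path` is wherever you saved the file.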

Can remote urlTemplates be used with Apollo? Am I just screwing something up?

Thank you

ctcncgr avatar Aug 14 '19 20:08 ctcncgr

If you are using bgzip-compressed indexed FASTA, there is a dedicated storeClass:

"storeClass": "JBrowse/Store/SeqFeature/BgzipIndexedFasta"

This might help with the oddities...let me know if it does

cmdcolin avatar Aug 14 '19 20:08 cmdcolin

@ctcncgr My guess is that it's a bug on my part, but please try what @cmdcolin suggested above first and see what you get. Thanks.

nathandunn avatar Aug 14 '19 23:08 nathandunn

Hey Colin,

Thanks for the rapid response. Setting the above storeClass did indeed fix my jbrowse sequence "oddities".

It additionally allowed me to locate this:

https://github.com/elsiklab/gccontent/issues/8

Which outlines a nice approach for serving the gzi, fai and bgz fasta.

Sorry I had missed this when searching earlier.

The config below seems to work; it loads and renders the sequence:

{
  "refSeqs": "https://legumeinfo.org/data/public/Medicago_truncatula/jemalong_A17.gnm5.MVZ2/medtr.jemalong_A17.gnm5.MVZ2.genome_main.fna.gz.fai",
  "tracks": [
    {
         "category" : "Reference sequence",
         "faiUrlTemplate" : "https://legumeinfo.org/data/public/Medicago_truncatula/jemalong_A17.gnm5.MVZ2/medtr.jemalong_A17.gnm5.MVZ2.genome_main.fna.gz.fai",
         "gziUrlTemplate" : "https://legumeinfo.org/data/public/Medicago_truncatula/jemalong_A17.gnm5.MVZ2/medtr.jemalong_A17.gnm5.MVZ2.genome_main.fna.gz.gzi",
         "key" : "Reference sequence remote",
         "label" : "DNA",
         "seqType" : "dna",
         "storeClass" : "JBrowse/Store/SeqFeature/BgzipIndexedFasta",
         "type" : "SequenceTrack",
         "urlTemplate" : "https://legumeinfo.org/data/public/Medicago_truncatula/jemalong_A17.gnm5.MVZ2/medtr.jemalong_A17.gnm5.MVZ2.genome_main.fna.gz",
         "useAsRefSeqStore" : 1
      }
  ]
}
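The config above repeats one base URL three times, with the .fai and .gzi files sitting next to the FASTA. As a minimal sketch, assuming the indexes always live at the FASTA URL plus ".fai" and ".gzi" (true for the files in this thread, but not guaranteed elsewhere), the same structure could be generated rather than hand-edited. The function name `bgzip_fasta_tracklist` is hypothetical.

```python
import json


def bgzip_fasta_tracklist(fasta_url: str) -> dict:
    """Build a trackList dict like the one posted above from a single FASTA URL.

    Assumes the .fai and .gzi indexes are served alongside the FASTA.
    """
    fai_url = fasta_url + ".fai"
    gzi_url = fasta_url + ".gzi"
    return {
        "refSeqs": fai_url,
        "tracks": [
            {
                "category": "Reference sequence",
                "faiUrlTemplate": fai_url,
                "gziUrlTemplate": gzi_url,
                "key": "Reference sequence remote",
                "label": "DNA",
                "seqType": "dna",
                "storeClass": "JBrowse/Store/SeqFeature/BgzipIndexedFasta",
                "type": "SequenceTrack",
                "urlTemplate": fasta_url,
                "useAsRefSeqStore": 1,
            }
        ],
    }


def to_tracklist_json(fasta_url: str) -> str:
    """Serialize the generated config as indented JSON text."""
    return json.dumps(bgzip_fasta_tracklist(fasta_url), indent=2)
```

Writing the result of `to_tracklist_json(...)` to trackList.json gives a config with the same keys and values as the one above, which makes it harder to mistype one of the three URLs.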

Thank you for your help, everything seems to be working.

ctcncgr avatar Aug 14 '19 23:08 ctcncgr

OK, I may have spoken too soon. I can load everything successfully in Apollo (it gives the correct number of reference sequences and all the lengths, and I get no errors in the Apollo log), but when I try to switch chromosomes the reference sequence track display breaks (it disappears completely). Also, when I go to the "tracks" tab I get an exception from annotator-0.js:3602, 'Cannot read property "a" of null', and no tracks appear. The drop-down for the reference works and is populated with the correct ids. The same error occurs when I try to double-click a reference sequence from the "Ref Sequence" tab.

The behavior is the same if the files are served locally from the directory containing trackList.json.

annotator-0.js:3602 Uncaught TypeError: Cannot read property 'a' of null
    at USb (annotator-0.js:3602)
    at VSb (annotator-0.js:2013)
    at Xh (annotator-0.js:1711)
    at $h (annotator-0.js:2298)
    at eval (annotator-0.js:3029)
    at VM1247 1.bundle.js:1
    at d (VM1247 1.bundle.js:1)
    at VM1247 1.bundle.js:1

Thank you.

ctcncgr avatar Aug 15 '19 00:08 ctcncgr

@ctcncgr it's probably a problem in the trackList.json that we're not properly checking.

Can you paste the trackList.json and I'll either point out the error and/or fix the code?
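For anyone wanting to sanity-check a trackList.json before pasting it, here is a rough sketch. It is not Apollo's validation logic. It only encodes one observation from this thread: the first config, which had no track flagged `useAsRefSeqStore`, hit the 'storeClass' on null error, while the later config with that flag loaded. Treating that flag as required is a guess, and the function name `refseq_store_problems` is hypothetical.

```python
from typing import List


def refseq_store_problems(track_list: dict) -> List[str]:
    """Return human-readable problems found in a parsed trackList.json.

    An empty list means none of these (guessed) checks failed.
    """
    problems = []
    if not track_list.get("refSeqs"):
        problems.append("no 'refSeqs' entry")
    tracks = track_list.get("tracks") or []
    # Guess: a reference sequence track is one flagged useAsRefSeqStore
    ref_tracks = [
        t for t in tracks if isinstance(t, dict) and t.get("useAsRefSeqStore")
    ]
    if not ref_tracks:
        problems.append("no track is flagged 'useAsRefSeqStore'")
    for track in ref_tracks:
        for key in ("storeClass", "urlTemplate"):
            if not track.get(key):
                problems.append(
                    f"track {track.get('label')!r} is missing {key!r}"
                )
    return problems
```

Load the file with `json.load` and pass the resulting dict in; any returned strings point at entries worth a second look.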

nathandunn avatar Aug 15 '19 03:08 nathandunn

Hey @nathandunn Thanks for getting back to me,

The trackList.json is:

{
  "refSeqs": "https://legumeinfo.org/data/public/Medicago_truncatula/jemalong_A17.gnm5.MVZ2/medtr.jemalong_A17.gnm5.MVZ2.genome_main.fna.gz.fai",
  "tracks": [
    {
         "category" : "Reference sequence",
         "faiUrlTemplate" : "https://legumeinfo.org/data/public/Medicago_truncatula/jemalong_A17.gnm5.MVZ2/medtr.jemalong_A17.gnm5.MVZ2.genome_main.fna.gz.fai",
         "gziUrlTemplate" : "https://legumeinfo.org/data/public/Medicago_truncatula/jemalong_A17.gnm5.MVZ2/medtr.jemalong_A17.gnm5.MVZ2.genome_main.fna.gz.gzi",
         "key" : "Reference sequence remote",
         "label" : "DNA",
         "seqType" : "dna",
         "storeClass" : "JBrowse/Store/SeqFeature/BgzipIndexedFasta",
         "type" : "SequenceTrack",
         "urlTemplate" : "https://legumeinfo.org/data/public/Medicago_truncatula/jemalong_A17.gnm5.MVZ2/medtr.jemalong_A17.gnm5.MVZ2.genome_main.fna.gz",
         "useAsRefSeqStore" : 1
      }
  ]
}

I have this working in an instance of a JBrowse that I can pm you a URL to if that is helpful.

It works in that JBrowse, though I only pulled and built it this evening. Could it be that the JBrowse in my Apollo is too old? It is from about 4 days ago. I can get the exact commit id if that is useful.

thank you

ctcncgr avatar Aug 15 '19 03:08 ctcncgr

@ctcncgr If you could PM me the URL that might be helpful. I'm on the Apollo gitter. @cmdcolin should be there or on the JBrowse one, as well.

Also, could you provide me the code from the source panel where the error is occurring? (I realize it's minified.)

I'll take a look as soon as I can and let you know.

nathandunn avatar Aug 15 '19 04:08 nathandunn

@ctcncgr I looked into it further. I don't think we support gzipped indexed FASTA yet in Apollo. I'll have to look into it next week to see if I can get it working. In the interim, I would see whether you can get it working with regular indexed FASTA (as below), remotely or locally.

{
  "formatVersion": 1,
  "refSeqs": "seq/beerare1.fa.fai",
  "tracks": [
    {
      "category": "Reference sequence",
      "faiUrlTemplate": "seq/beerare1.fa.fai",
      "key": "Reference sequence",
      "label": "DNA",
      "seqType": "dna",
      "storeClass": "JBrowse/Store/SeqFeature/IndexedFasta",
      "type": "SequenceTrack",
      "urlTemplate": "seq/beerare1.fa",
      "useAsRefSeqStore": 1
    }
  ]
}
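In the configs in this thread, `refSeqs` points at a .fai index, which is where the reference names and lengths come from. A .fai file is tab-separated, with the sequence name in the first column and its length in the second. The sketch below reads those two columns so you can compare them against what Apollo lists; the function name `parse_fai` and the sample names in the usage note are made up for illustration.

```python
from typing import Dict


def parse_fai(text: str) -> Dict[str, int]:
    """Map each reference sequence name to its length from .fai file contents."""
    lengths = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError(f"not a .fai line: {line!r}")
        # Column 1 is the name, column 2 the sequence length in bases
        lengths[fields[0]] = int(fields[1])
    return lengths
```

For example, `parse_fai("chrA\t1000\t6\t60\t61\n")` yields a dict with one entry, `chrA` of length 1000.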

nathandunn avatar Aug 15 '19 16:08 nathandunn

So, @ctcncgr I'm not sure what all of the problems are, but we can leave this open in the interim. However, I did get it to work.

Steps:

  1. rename your .fna file to .fa (yes, I should fix this)
  2. upload the .fa into a new organism according to these instructions for automated loading.

It should just work, but let me know if not:

[Screenshot: Screen Shot 2019-08-15 at 12 20 58 PM]

You can upload tracks the same way for most standard tracks or manually configure them. You shouldn't see a track if you only have a genome loaded unless you're zoomed in all the way.

[Image]

nathandunn avatar Aug 15 '19 19:08 nathandunn

  • [ ] handle fna naming
  • [ ] support gzip more explicitly
  • [ ] support bgzip more explicitly

nathandunn avatar Aug 15 '19 19:08 nathandunn