Reimport Modomics data

Open AntonPetrov opened this issue 8 years ago • 1 comment

At least one species-specific ID shows several species at once (example). [Screenshot: 2017-02-22 at 09:30:14]

The problem happens when the same sequence has the same modifications in multiple species: the Accession model gets overwritten, so Modomics data needs to be reimported from scratch.
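To illustrate the failure mode, here is a minimal sketch. It assumes, hypothetically, that an accession key is derived only from the sequence and its modifications; the importer code is not shown in this issue, so `accession_id` and the records below are made up for illustration. Under that assumption, two species sharing a modified sequence collapse onto one key and the later record overwrites the earlier one.

```python
import hashlib


def accession_id(sequence: str, modifications: str) -> str:
    """Hypothetical key: built from sequence and modifications only, not species."""
    digest = hashlib.md5(f"{sequence}|{modifications}".encode()).hexdigest()
    return f"{digest}_modomics"


# Two made-up records: identical modified sequence, different species.
records = [
    {"sequence": "GCGGAUUUAGCUC", "modifications": "m2G@10", "species": "Zea mays"},
    {"sequence": "GCGGAUUUAGCUC", "modifications": "m2G@10", "species": "Triticum aestivum"},
]

accessions = {}
for record in records:
    key = accession_id(record["sequence"], record["modifications"])
    # Same key for both records, so the second write replaces the first.
    accessions[key] = record["species"]

print(accessions)
```

Only one entry survives, and it carries the species of whichever record was imported last.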

The problem was originally reported by Sean.

AntonPetrov avatar Feb 22 '17 09:02 AntonPetrov

There are at least 23 xrefs/accessions with this issue. We can find them by doing:

select
  *
from xref, rnc_accessions acc
where
  xref.ac = acc.accession
  and acc.database = 'MODOMICS'
  and (
    (species = 'Xenopus laevis' and taxid != 8355)
    or (species = 'Rattus norvegicus' and taxid != 10116)
    or (species = 'Zea mays' and taxid != 4577)
    or (species = 'Thermus thermophilus' and taxid != 274)
    or (species = 'Salmonella typhimurium' and taxid != 90371)
    or (species = 'Phaseolus vulgaris' and taxid != 3885)
    or (species = 'Oryctolagus cuniculus' and taxid != 9986)
    or (species = 'Triticum aestivum' and taxid != 4565)
  )
;
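The hand-written list of species/taxid pairs could also be kept in a lookup table, so the check grows by adding rows rather than editing the query. The sketch below uses an in-memory SQLite database with toy rows. The `expected_taxid` table and the `toy_*` accessions are hypothetical, and it assumes that `species` lives on `rnc_accessions` and `taxid` on `xref`, which the query above does not state explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table rnc_accessions (accession text primary key, "database" text, species text);
    create table xref (ac text, taxid integer);
    -- Hypothetical lookup table replacing the hand-written OR list.
    create table expected_taxid (species text primary key, taxid integer);
""")

# Species/taxid pairs taken from the query above.
conn.executemany("insert into expected_taxid values (?, ?)", [
    ("Xenopus laevis", 8355),
    ("Rattus norvegicus", 10116),
    ("Zea mays", 4577),
    ("Thermus thermophilus", 274),
    ("Salmonella typhimurium", 90371),
    ("Phaseolus vulgaris", 3885),
    ("Oryctolagus cuniculus", 9986),
    ("Triticum aestivum", 4565),
])

# Toy data: one mismatched xref (rat labelled with 10090) and one correct one.
conn.executemany("insert into rnc_accessions values (?, ?, ?)", [
    ("toy_rat_modomics", "MODOMICS", "Rattus norvegicus"),
    ("toy_maize_modomics", "MODOMICS", "Zea mays"),
])
conn.executemany("insert into xref values (?, ?)", [
    ("toy_rat_modomics", 10090),
    ("toy_maize_modomics", 4577),
])

MISMATCH_QUERY = """
    select xref.ac, acc.species, xref.taxid, expected.taxid
    from xref
    join rnc_accessions acc on xref.ac = acc.accession
    join expected_taxid expected on expected.species = acc.species
    where acc."database" = 'MODOMICS'
      and xref.taxid != expected.taxid
"""

mismatches = conn.execute(MISMATCH_QUERY).fetchall()
for row in mismatches:
    print(row)
```

Only the rat xref is reported, because its taxid differs from the value expected for its species.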

It appears to be limited to Modomics, since running the search without the database = 'MODOMICS' constraint gives the same results.

Fixing the accessions can be done with:

-- Update Xenopus
update xref
  set taxid = 8355
where
  ac in ('dd7318229bd33f71098d491b437b97dd_modomics',
         '5b638f7a6fb817e74ea1fc05eb7aca6a_modomics')
  and taxid != 8355
;

-- Update rat
update xref
  set taxid = 10116
where
  ac in ('b5f224875fe4b55c0f8c79ff9e1c4b96_modomics')
  and taxid = 10090
;

-- Update maize
update xref
  set taxid = 4577
where
  ac in ('331f68e0cd1ed4d69e6ce052f24d432c_modomics')
  and taxid = 3562
;

-- Update thermus
update xref
  set taxid = 274
where
  ac in ('367fff4928ff6e45035eccd25315ae9d_modomics',
         '3fe73d3ce3932d8042cec7866b809ac0_modomics')
  and taxid = 300852
;

-- Update Salmonella
update xref
  set taxid = 90371
where
  ac in ('3c9f4214774de3fd21eb099235728829_modomics')
  and taxid = 562
;

-- Update kidney bean
update xref
  set taxid = 3885
where
  ac in ('ae35295009e8a43132a40112333bbcc1_modomics')
  and taxid = 3847
;

-- Update rabbit
update xref
  set taxid = 9986
where
  ac in ('484a32153536ff19c456a10b99106d82_modomics',
         '2b7215b44be5fa48144480e645e1d4b1_modomics')
  and taxid != 9986
;

-- Update wheat
update xref
  set taxid = 4565
where
  ac in ('1cc28fdd9201cb2ab75c52dd846b649f_modomics',
         '851cde158600ec1bb7cb41827451d795_modomics',
         '512c6fee503aae504bbab970efa856a1_modomics',
         '9499216fa6efa3c4f2b9a7bbb8dc1548_modomics',
         '1dd5a0399007891f1ec154143a319d21_modomics')
  and taxid != 4565
;
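The same corrections could be applied from a single accession-to-taxid mapping inside one transaction, so that either all rows are fixed or none are. This is only a sketch against an in-memory SQLite table with toy rows, not the production database. The helper `fix_taxids` is hypothetical, and it uses `taxid != ?` uniformly, which is broader than the `taxid = <wrong id>` guards in some of the statements above.

```python
import sqlite3

# Accession -> correct taxid, copied from the update statements above.
CORRECT_TAXIDS = {
    "dd7318229bd33f71098d491b437b97dd_modomics": 8355,
    "5b638f7a6fb817e74ea1fc05eb7aca6a_modomics": 8355,
    "b5f224875fe4b55c0f8c79ff9e1c4b96_modomics": 10116,
    "331f68e0cd1ed4d69e6ce052f24d432c_modomics": 4577,
    "367fff4928ff6e45035eccd25315ae9d_modomics": 274,
    "3fe73d3ce3932d8042cec7866b809ac0_modomics": 274,
    "3c9f4214774de3fd21eb099235728829_modomics": 90371,
    "ae35295009e8a43132a40112333bbcc1_modomics": 3885,
    "484a32153536ff19c456a10b99106d82_modomics": 9986,
    "2b7215b44be5fa48144480e645e1d4b1_modomics": 9986,
    "1cc28fdd9201cb2ab75c52dd846b649f_modomics": 4565,
    "851cde158600ec1bb7cb41827451d795_modomics": 4565,
    "512c6fee503aae504bbab970efa856a1_modomics": 4565,
    "9499216fa6efa3c4f2b9a7bbb8dc1548_modomics": 4565,
    "1dd5a0399007891f1ec154143a319d21_modomics": 4565,
}


def fix_taxids(conn, corrections):
    """Apply all corrections in one transaction; return the number of rows changed."""
    before = conn.total_changes
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "update xref set taxid = ? where ac = ? and taxid != ?",
            [(taxid, ac, taxid) for ac, taxid in corrections.items()],
        )
    return conn.total_changes - before


# Toy table: one wrong rat xref (10090) and one maize xref that is already correct.
conn = sqlite3.connect(":memory:")
conn.execute("create table xref (ac text, taxid integer)")
conn.executemany("insert into xref values (?, ?)", [
    ("b5f224875fe4b55c0f8c79ff9e1c4b96_modomics", 10090),
    ("331f68e0cd1ed4d69e6ce052f24d432c_modomics", 4577),
])
conn.commit()

changed = fix_taxids(conn, CORRECT_TAXIDS)
print(changed)
```

Because each update is guarded by `taxid != ?`, running the helper a second time changes nothing, which makes it safe to re-run after a partial fix.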

blakesweeney avatar Feb 22 '17 13:02 blakesweeney