Specifying the organism when using IdMappingClient

Open marcorusc opened this issue 1 year ago • 3 comments

Hello, thanks a lot for developing this package!

I am using Unipressed as part of a package I am developing with my team.

I was wondering if it was possible to specify the organism when translating identifiers like in the following case:

import time

from unipressed import IdMappingClient

request = IdMappingClient.submit(
    source="Gene_Name", dest="UniProtKB-Swiss-Prot", ids={"BET1"}
)
time.sleep(1)  # Sleep for 1 second
list(request.each_result())

This outputs several different translations from Gene_Name to UniProt, including homologs:

''' {'from': 'BET1', 'to': 'A0A0C6E0I7'}, {'from': 'BET1', 'to': 'O13932'}, {'from': 'BET1', 'to': 'O15155'}, {'from': 'BET1', 'to': 'O35623'}, {'from': 'BET1', 'to': 'P22804'}, {'from': 'BET1', 'to': 'Q62896'} <-- not interested '''

Is it possible to add a flag to filter out all the translations that do not refer to a specific organism?

Thanks a lot for your help!

marcorusc avatar Aug 09 '24 15:08 marcorusc

Good question. Apparently if you are converting to UniProtKB only, there is a special taxonId field that can be used for this purpose:

curl --silent 'https://rest.uniprot.org/configure/idmapping/fields' | jq '.rules | .[] | select(.taxonId)'
{
  "ruleId": 6,
  "tos": [
    "UniProtKB",
    "UniProtKB-Swiss-Prot"
  ],
  "defaultTo": "UniProtKB",
  "taxonId": true
}

More info here. I'll try to implement this when I get time.
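
For anyone who prefers Python to jq, the same check can be sketched roughly as follows. This is only an illustration: `taxon_filterable_targets` is a hypothetical helper, and `SAMPLE_CONFIG` is a hand-written stand-in shaped like the excerpt above (rule 1 is made up so the filter has something to exclude). The live response will contain many more rules and fields.

```python
import json

# Hand-written sample shaped like the excerpt quoted above. Rule 1 is a
# made-up placeholder; only rule 6 mirrors the real output shown in the thread.
SAMPLE_CONFIG = json.loads("""
{
  "rules": [
    {"ruleId": 1, "tos": ["Gene_Name"], "taxonId": false},
    {"ruleId": 6,
     "tos": ["UniProtKB", "UniProtKB-Swiss-Prot"],
     "defaultTo": "UniProtKB",
     "taxonId": true}
  ]
}
""")


def taxon_filterable_targets(config: dict) -> set[str]:
    """Return every destination database whose mapping rule accepts a taxon ID."""
    targets: set[str] = set()
    for rule in config.get("rules", []):
        # Same condition as the jq filter: keep rules where taxonId is truthy.
        if rule.get("taxonId"):
            targets.update(rule.get("tos", []))
    return targets


print(sorted(taxon_filterable_targets(SAMPLE_CONFIG)))
# ['UniProtKB', 'UniProtKB-Swiss-Prot']
```

To run it against live data, the parsed JSON from https://rest.uniprot.org/configure/idmapping/fields could be passed in place of `SAMPLE_CONFIG`.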

multimeric avatar Aug 10 '24 07:08 multimeric

That would be great! Indeed, I would specify the taxon ID just for the conversion to UniProtKB.

Thanks a lot for answering so quickly!!

marcorusc avatar Aug 12 '24 09:08 marcorusc

Okay, this is now implemented in #36.

Can you please try installing it using:

pip install git+https://github.com/multimeric/Unipressed@id-mapping-typing

Then in Python, you can use taxon_id which is an integer corresponding to a UniProt taxon ID (https://www.uniprot.org/taxonomy):

from unipressed import IdMappingClient

request = IdMappingClient.submit(
    source="Gene_Name", dest="UniProtKB", ids={"STE2"}, taxon_id=4932
)

multimeric avatar Aug 15 '24 14:08 multimeric

Looks like it is working fine! I have tried translating a couple of identifiers from gene symbols to UniProtKB-Swiss-Prot and it works perfectly (using taxon ID 9606).

Example:

request = IdMappingClient.submit(
    source="Gene_Name", dest="UniProtKB-Swiss-Prot", ids={"BET1"}, taxon_id=9606
)
time.sleep(2)  # Sleep for 2 seconds
list(request.each_result())

[{'from': 'BET1', 'to': 'O15155'}]
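
A fixed `time.sleep` like the one above may be too short for larger jobs. One alternative is to poll until the job reports that it has finished. The sketch below is generic and not part of Unipressed: `wait_until` is a hypothetical helper that takes any zero-argument callable, so it makes no assumptions about the library's API.

```python
import time
from typing import Callable


def wait_until(
    is_ready: Callable[[], bool],
    timeout: float = 30.0,
    interval: float = 0.5,
) -> bool:
    """Call is_ready() repeatedly until it returns True or the timeout expires.

    Returns True if is_ready() succeeded, False if the deadline passed first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        # Wait before asking again so the server is not hammered.
        time.sleep(interval)
```

With Unipressed, the callable might look like `lambda: request.get_status() == "FINISHED"`. This assumes the job object exposes a `get_status()` method that returns that string, which is worth checking against the installed version before relying on it.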

Thanks a lot for your help!

marcorusc avatar Aug 16 '24 09:08 marcorusc

Thanks for checking! That feature should now be available in release 1.4.0+

multimeric avatar Aug 16 '24 12:08 multimeric