[EMBOSS] Batch retrieval of taxonomy/species names using entret.....
pmr at ebi.ac.uk
Tue Oct 31 18:53:00 UTC 2006
Richard Rothery wrote:
> I am interested in using entret to retrieve single field entries from
> swissprot or sptrembl. Specifically, I would like to feed entret a list
> of accessions and have it return a file with the species names and/or
> taxonomies. I intend to use this information to compare with my
> phylogeny analyses of clustalw alignments.
entret returns the full text of each entry; EMBOSS stores it without parsing the individual fields.
We could try to extract specific fields but it is not easy to define them for
You can do this with SRS. Try the EBI server for example:
Go to the library page
Select UniProtKB/SwissProt (or UniProtKB/TrEMBL)
Select "standard query form"
Enter your query in the top part (e.g. accession number)
In the "create a view" section click the "list" button to get the original
lines. Select anything taxonomic from the pull-down list (control-click to
select more than one)
refine your query. You will see the URL at the top that can be used to retrieve
data when you are happy.
Failing that, you could just parse out the ID and O* lines from entret using a
simple perl script.
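
The ID and O* line parsing suggested above could be sketched roughly as follows. This is written in Python rather than Perl, and it is only a minimal sketch: it assumes the entret output is in the usual Swiss-Prot flat-file layout (a two-letter line code, data starting at column 6, entries terminated by "//"). The names parse_taxonomy and print_table are made up for illustration and are not part of EMBOSS.

```python
def parse_taxonomy(lines):
    """Yield one dict per entry holding its ID and the joined OS/OC/OX lines.

    Assumes Swiss-Prot flat-file layout: two-letter line code,
    data from column 6, and "//" closing each entry.
    """
    entry = {}
    for line in lines:
        code = line[:2]
        value = line[5:].strip()
        if code == "//":
            # End of entry: hand back whatever was collected.
            if entry:
                yield entry
            entry = {}
        elif code == "ID" and value:
            # The entry name is the first token on the ID line.
            entry["ID"] = value.split()[0]
        elif code in ("OS", "OC", "OX"):
            # OS and OC may continue over several lines; join them.
            entry[code] = (entry.get(code, "") + " " + value).strip()
    if entry:
        # Tolerate a final entry with no closing "//".
        yield entry


def print_table(path):
    """Print ID, species and classification, tab-separated, for a file of entries."""
    with open(path) as handle:
        for entry in parse_taxonomy(handle):
            print("\t".join(entry.get(key, "") for key in ("ID", "OS", "OC")))
```

Calling print_table on a file of saved entret output (the file name is whatever you chose when running entret) would give one line per entry, which should be easy to line up against the sequence names in a clustalw alignment.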
Hope that helps,