[EMBOSS] Batch retrieval of taxonomy/species names using entret.....

David.Bauer at SCHERING.DE David.Bauer at SCHERING.DE
Wed Nov 1 07:21:07 UTC 2006

If you want to parse Swissprot files which you get from entret, you could
have a look at Swissknife, which is "An object-oriented Perl library to
handle Swiss-Prot entries":


It can be used to access all header information in Swiss-Prot files,
including the information in the specially formatted CC lines.


emboss-bounces at lists.open-bio.org wrote on 31/10/2006 19:53:00:

> Hi Richard,
> Richard Rothery wrote:
> > I am interested in using entret to retrieve single field entries from
> > swissprot or sptrembl. Specifically, I would like to feed entret a
> > list of accessions and have it return a file with the species names and/or
> > taxonomies. I intend to use this information to compare with my
> > phylogeny analyses of clustalw alignments.
> EMBOSS stores the full text in entret without parsing.
> We could try to extract specific fields but it is not easy to define them
> for all formats.
> You can do this with SRS. Try the EBI server for example:
> Go to the library page
> Select UniProtKB/SwissProt (or UniProtKB/TrEMBL)
> Select "standard query form"
> Enter your query in the top part (e.g. accession number)
> In the "create a view" section click the "list" button to get the
> lines. Select anything taxonomic from the pull down list (control-click
> to select more than one)
> Press "search".
> Refine your query. You will see the URL at the top that can be used
> to retrieve
> data when you are happy.
> Failing that, you could just parse out the ID and O* lines from
> entret using a
> simple perl script.
> Hope that helps,
> Peter
