[EMBOSS] Batch retrieval of taxonomy/species names using entret.....

David.Bauer at SCHERING.DE
Wed Nov 1 07:21:07 UTC 2006


If you want to parse the Swiss-Prot entries that entret returns, you could
have a look at Swissknife, which is "An object-oriented Perl library to
handle Swiss-Prot entries":

http://swissknife.sourceforge.net/docs/

It can be used to access all header information in Swiss-Prot files,
including the information in the specially formatted CC lines.
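
If installing a library is more than you need, the ID and O* lines mentioned
in the quoted message below can also be pulled out by hand. Here is a rough
sketch in Python rather than Perl; it is not Swissknife and not part of
EMBOSS. It assumes the entret output is in Swiss-Prot flat-file format
(two-letter line code, data starting in column 6, entries terminated by "//").
The function name parse_entries and the tab-separated output are only my
choices for illustration:

```python
import sys


def parse_entries(lines):
    """Yield (entry_id, species, taxonomy) for each flat-file entry.

    Assumes Swiss-Prot style lines: a two-letter code, then data from
    column 6 onwards, with "//" closing each entry.
    """
    entry_id, os_parts, oc_parts = None, [], []
    for line in lines:
        code, value = line[:2], line[5:].strip()
        if code == "ID" and value:
            # The entry name is the first token on the ID line.
            entry_id = value.split()[0]
        elif code == "OS":
            # Species name; may continue over several OS lines.
            os_parts.append(value)
        elif code == "OC":
            # Taxonomic lineage; usually spans several OC lines.
            oc_parts.append(value)
        elif code == "//":
            if entry_id is not None:
                species = " ".join(os_parts).rstrip(".")
                taxonomy = " ".join(oc_parts).rstrip(".")
                yield entry_id, species, taxonomy
            entry_id, os_parts, oc_parts = None, [], []


if __name__ == "__main__" and len(sys.argv) > 1:
    # First argument: a file holding the text written by entret.
    with open(sys.argv[1]) as handle:
        for entry_id, species, taxonomy in parse_entries(handle):
            print(entry_id, species, taxonomy, sep="\t")
```

Saved as, say, taxa.py (a made-up name), it would be run as
"python taxa.py entries.txt", where entries.txt holds the entret output.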

David.

emboss-bounces at lists.open-bio.org wrote on 31/10/2006 19:53:00:

> Hi Richard,
>
> Richard Rothery wrote:
> > I am interested in using entret to retrieve single field entries from
> > swissprot or sptrembl. Specifically, I would like to feed entret a list
> > of accessions and have it return a file with the species names and/or
> > taxonomies. I intend to use this information to compare with my
> > phylogeny analyses of clustalw alignments.
>
> EMBOSS stores the full text in entret without parsing.
>
> We could try to extract specific fields but it is not easy to define them
> for all formats.
>
> You can do this with SRS. Try the EBI server for example:
>
> Go to the library page
>
> Select UniProtKB/SwissProt (or UniProtKB/TrEMBL)
>
> Select "standard query form"
>
> Enter your query in the top part (e.g. accession number)
>
> In the "create a view" section click the "list" button to get the original
> lines. Select anything taxonomic from the pull down list (control-click to
> select more than one)
>
> Press "search".
>
> Refine your query. When you are happy with it, the URL shown at the top
> can be used to retrieve the data.
>
> Failing that, you could just parse out the ID and O* lines from the
> entret output using a simple Perl script.
>
> Hope that helps,
>
> Peter



