[Bioperl-l] parsing protein accession numbers and types from >fasta headers
Antonio Ramos Fernández
tniram at hotmail.com
Wed Sep 13 10:50:08 UTC 2006
I'd like to write a script that parses the headers of FASTA-formatted protein
databases and extracts protein accession numbers and identifiers (UniProt, IPI,
gi, RefSeq, Ensembl...). The idea is to build a simple local database that
relates each protein sequence's accession number to all of its valid identifiers
and to the FASTA files on my system from which they were obtained, or to
check, for instance, whether a UniProt accession exists for a given gi.
However, the structure of the FASTA header varies considerably depending on
the source. Any suggestions?
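
To make the task concrete, here is a rough sketch of the kind of parser and
lookup table described above. It is plain Python rather than BioPerl, and it
only illustrates the idea. The three header layouts it handles, the tag
tables, and the names parse_header and index_fasta are all assumptions made
for this example; real databases use more layouts than these.

```python
# Assumed header layouts (illustrative only; real files vary by source):
#   >gi|12345|ref|NP_000001.1| some description      NCBI style: tag|id pairs
#   >sp|P00001|NAME_HUMAN some description           UniProt style
#   >IPI:IPI00000001.1|SWISS-PROT:P00001 Tax_Id=...  IPI style: TAG:id fields
NCBI_TAGS = {"gi": "gi", "ref": "refseq", "sp": "uniprot", "tr": "uniprot"}
IPI_TAGS = {"IPI": "ipi", "SWISS-PROT": "uniprot", "TREMBL": "uniprot",
            "ENSEMBL": "ensembl", "REFSEQ": "refseq"}


def parse_header(header):
    """Return a list of (id_type, accession) pairs found in one header line."""
    fields = header.lstrip(">").split()
    if not fields:
        return []
    parts = fields[0].split("|")  # identifiers sit before the first space
    found = []
    if ":" in parts[0]:
        # IPI style: each '|' field is TAG:value; value may list ids with ';'
        for part in parts:
            tag, _, value = part.partition(":")
            if tag in IPI_TAGS:
                found.extend((IPI_TAGS[tag], acc)
                             for acc in value.split(";") if acc)
    else:
        # NCBI/UniProt style: a known tag is followed by its accession
        i = 0
        while i + 1 < len(parts):
            if parts[i] in NCBI_TAGS and parts[i + 1]:
                found.append((NCBI_TAGS[parts[i]], parts[i + 1]))
                i += 2
            else:
                i += 1
    return found


def index_fasta(path, index=None):
    """Map each identifier to the files it occurs in and its co-listed ids."""
    index = {} if index is None else index
    with open(path) as handle:
        for line in handle:
            if not line.startswith(">"):
                continue  # skip sequence lines
            ids = parse_header(line)
            for key in ids:
                entry = index.setdefault(key, {"files": set(), "aliases": set()})
                entry["files"].add(path)
                entry["aliases"].update(other for other in ids if other != key)
    return index
```

With such an index, checking whether a UniProt accession exists for a given
gi amounts to looking up index[("gi", some_gi)]["aliases"] and filtering for
entries whose type is "uniprot". Headers in a layout the parser does not
recognise simply yield an empty list, so they can be logged and handled
separately.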