[Bioperl-l] retrieving by acc from a local multifasta file
Barry Moore
barry.moore at genetics.utah.edu
Fri Aug 6 08:45:33 EDT 2004
Maria,
If this is a one-off script, and you are doing something simple with
your sequences once you extract them, then you may not need to use
BioPerl at all. You could read the complete uniprot_sprot.fasta file
into a hash keyed on the accession to create a simple in-memory
database, and then retrieve the sequences you need by accession.
Building that hash will take a while even on a fairly good computer, and
it holds the whole file in memory, so it's not an approach you would
want for a script that you will run a lot. Try the following code.
Barry
-----------------------------------------------------------------------------------
#!/usr/bin/perl
use strict;
use warnings;
#Your list of SwissProt accessions.
my @accs = ('Q43495', 'P13813', 'P15455');
#Open and read your uniprot file.
open(my $in, '<', 'uniprot_sprot.fasta')
    or die "Cannot open uniprot_sprot.fasta: $!";
my $uniprot_data = do { local $/; <$in> };    #Slurp the whole file.
close $in;
#Extract fasta sequences into a hash keyed on the accession.
my %seq_db;
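#Headers are expected to look like ">ID (ACCESSION) description".
while ($uniprot_data =~ /^(>.*?\((\w{6})\).*\n(?:(?!>).*\n)+)/gm) {
    $seq_db{$2} = $1;    #$1 is the whole record, $2 the accession.
}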
#Loop over your accessions, and do something with the sequence.
for my $acc (@accs) {
    if (exists $seq_db{$acc}) {
        print "$seq_db{$acc}\n\n";
    }
    else {
        warn "Accession $acc not found in uniprot_sprot.fasta\n";
    }
}
---------------------------------------------------------------------------------------
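If memory or build time becomes a problem, one possible variation is to
read the file a line at a time and keep only the accessions on your list,
rather than slurping everything and indexing every record. The sketch
below is only an illustration under some assumptions: the sub name
index_fasta and the two demo records are made up, and it expects the same
">ID (ACCESSION) description" header layout as the regex above. Current
UniProt releases write headers like ">sp|ACCESSION|ID ...", so the
accession pattern would need adjusting for those files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a hash of FASTA records keyed on accession, one line at a time.
# Assumes headers of the form ">ID (ACCESSION) description".
# If a hash ref of wanted accessions is passed, only those are stored.
sub index_fasta {
    my ($fh, $wanted) = @_;
    my (%seq_db, $acc);
    while (my $line = <$fh>) {
        if ($line =~ /^>/) {
            # New record: pick up its accession (undef if none is found).
            ($acc) = $line =~ /\((\w{6})\)/;
            # Skip the record if it is not on the wanted list.
            $acc = undef if defined $acc && $wanted && !$wanted->{$acc};
        }
        # Append header and sequence lines to the current record.
        $seq_db{$acc} .= $line if defined $acc;
    }
    return \%seq_db;
}

# Demo on two made-up records held in memory (hypothetical data).
my $fasta = <<'END';
>ID1_TEST (Q00001) hypothetical protein one
MKTAYIAK
QRQISFVK
>ID2_TEST (Q00002) hypothetical protein two
MSDNELLK
END

open(my $fh, '<', \$fasta) or die "Cannot open in-memory file: $!";
my $db = index_fasta($fh, { Q00002 => 1 });
close $fh;

print $db->{Q00002} if exists $db->{Q00002};
```

For the real file you would open uniprot_sprot.fasta instead of the
in-memory string and pass a hash built from your accession list, for
example { map { $_ => 1 } @accs }. It still scans the whole file on each
run, so for a script you run often an on-disk index is probably the
better route.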
Maria Persico wrote:
>Hi All,
>
>This may be a stupid problem but for me it's something difficult:
>I have a list of SwissProt accessions (my_acc) and I want to extract from
>uniprot_sprot.fasta only the sequences in my list.
>How can I do this with BioPerl?
>
>thanks,
>
>Maria
>
>
>
>Maria Persico
>MINT database, Cesareni Group
>Universita' di Tor Vergata, via della Ricerca Scientifica
>00133 Roma, Italy
>Tel: +39 0672594315
>FAX: +39 0672594766
>e-mail: maria at cbm.bio.uniroma2.it
--
Barry Moore
Dept. of Human Genetics
University of Utah
Salt Lake City, UT