[Bioperl-l] bioperl-AlignIO problems parsing fasta files

Jason Stajich jason.stajich at duke.edu
Fri May 5 13:21:38 UTC 2006


The space after the '>' is causing the problem: we infer the ID as
everything after the '>' and BEFORE the first whitespace, so with a
leading space the ID comes out empty.  Get rid of the space.
   $ perl -i.backup -p -e 's/^>\s+/>/' YOURFASALNFILE
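
To see why the leading space matters, here is a small standalone sketch
of the rule described above. It does not use BioPerl: id_from_header is
a hypothetical helper written only for illustration, and its regex is an
assumption about the behaviour, not the parser's actual code.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT BioPerl's code): take the ID to be everything
# after '>' up to the first whitespace character.
sub id_from_header {
    my ($line) = @_;
    my ($id) = $line =~ /^>(\S*)/;
    return $id;
}

my $header = '> gi|90108701|pdb|2AHZ|B Chain B, K+ Complex Of The Nak Channel';

# With the space right after '>', nothing precedes the first whitespace,
# so the ID is the empty string.
printf "before: [%s]\n", id_from_header($header);

# Same substitution as the one-liner above, applied to a copy of the header.
(my $fixed = $header) =~ s/^>\s+/>/;
printf "after:  [%s]\n", id_from_header($fixed);
```

This should print "before: []" and then "after:  [gi|90108701|pdb|2AHZ|B]";
the empty ID lines up with the "No sequence with name []" message in the
report below.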

On May 4, 2006, at 7:00 PM, Gloria Rendon wrote:

> contents of the input file has a single sequence:
>
>> gi|90108701|pdb|2AHZ|B Chain B, K+ Complex Of The Nak Channel
> MLSFLLTLKRMLRACLRAWKDKEFQVLFVLTILTLISGTIFYSTVEGLRPIDALYFSVVTLTTVGDGNFS
> PQTDFGKIFTILYIFIGIGLVFGFIHKLAVNVQLPSILSN
> ------------------------------------------
> this is the script that tries to parse it:
>
> use Bio::AlignIO;
> my $inseq = Bio::AlignIO->new(-format => 'fasta',
>                            -file   => 'test.fasta');
> while( my $aln = $inseq->next_aln ) {
>      print "name: ", $aln->displayname;
>      print "length: ", $aln->length;
>      print "\n";
> }
>
> ------------------------------------------
> and this is the result of running that script on winxp
>
> D:\msa\NAK MUTANTS>perl parseFasta.pl
>
>
> ------------- EXCEPTION  -------------
> MSG: No sequence with name []
> STACK Bio::SimpleAlign::displayname
> C:/Perl/site/lib/Bio/SimpleAlign.pm:2047
> STACK toplevel parseFasta.pl:11
>
> --------------------------------------
> D:\msa\NAK MUTANTS>

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12/
