[Bioperl-l] Re: Bioperl and matcher

Jason Stajich jason@cgt.mc.duke.edu
Tue, 26 Nov 2002 11:05:22 -0500 (EST)


Our msf parser is seeing something it isn't expecting; I'm not sure why.
What happens when you use the plain 'emboss' parser with standard EMBOSS
alignment output? That is the route that has been most heavily tested.

-jason

Jason Stajich
Duke University
jason at cgt.mc.duke.edu

On Tue, 26 Nov 2002, Vilanova,David,LAUSANNE,NRC/BS wrote:

>
> Hello,
> I have problems retrieving the alignments from EMBOSS output.
> The program below reads 2 files and runs matcher on all against all.
> Matcher gives me MSF output, and then I try to parse this alignment with
> Bio::AlignIO.
> However, I get an exception...
>
> Processing sequence 1..vs..3...done
>
> ------------- EXCEPTION  -------------
> MSG: 1 exists as an alignment line but not in the header. Not confident of
> what is going on!
> STACK Bio::AlignIO::msf::next_aln
> /usr/local/lib/perl5/site_perl/5.8.0/Bio/AlignIO/msf.pm:106
> STACK toplevel Run_Emboss.pl:50
>
> --------------------------------------
>
> Here is the output from matcher:
> !!NA_MULTIPLE_ALIGNMENT 1.0
>
>   out MSF: 5 Type: N 26/11/02 CompCheck: 2090 ..
>
>   Name: EMBOSS_001 Len: 5  Check: 1045 Weight: 1.00
>   Name: EMBOSS_002 Len: 5  Check: 1045 Weight: 1.00
>
> //
>
>            1   5
> EMBOSS_001 CGGCG
> EMBOSS_002 CGGCG
>
>
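A guess at what the parser may be tripping over, not checked against msf.pm itself: the exception names "1", and the only "1" in the output above is the first token of the coordinate ruler line ("1   5"). If the parser takes the first whitespace-delimited token of every body line as a sequence name (an assumption), the ruler would be reported as an undeclared sequence. The sketch below is a stand-alone illustration of that idea in plain Perl; undeclared_names is a hypothetical helper, not part of BioPerl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the first token of every alignment-body line that is not declared
# in the MSF header. Assumption (not taken from msf.pm): each body line is
# split naively as "<name> <residues>".
sub undeclared_names {
    my ($msf_text) = @_;
    my (%declared, @unknown);
    my $in_body = 0;

    for my $line (split /\n/, $msf_text) {
        if (!$in_body) {
            # Header entries look like "Name: EMBOSS_001 Len: 5 ..."
            $declared{$1} = 1 if $line =~ /^\s*Name:\s+(\S+)/;
            # A line starting with "//" ends the header
            $in_body = 1 if $line =~ m{^\s*//};
            next;
        }
        # Naive split: first token is the name, the rest is sequence data
        if ($line =~ /^\s*(\S+)\s+(.*)$/) {
            push @unknown, $1 unless exists $declared{$1};
        }
    }
    return @unknown;
}

# The matcher output quoted above, copied verbatim
my $msf = <<'END_MSF';
!!NA_MULTIPLE_ALIGNMENT 1.0

  out MSF: 5 Type: N 26/11/02 CompCheck: 2090 ..

  Name: EMBOSS_001 Len: 5  Check: 1045 Weight: 1.00
  Name: EMBOSS_002 Len: 5  Check: 1045 Weight: 1.00

//

           1   5
EMBOSS_001 CGGCG
EMBOSS_002 CGGCG
END_MSF

print "not in header: $_\n" for undeclared_names($msf);
# prints "not in header: 1"
```

If that is what is happening, skipping body lines that contain only numbers before the name lookup would sidestep it, but that is untested here.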
> ###########################################################
> It doesn't work for fasta format either in my script (see output below):
> Processing sequence 1..vs..3...done
> Use of uninitialized value in sprintf at
> /usr/local/lib/perl5/site_perl/5.8.0/Bio/SimpleAlign.pm line 257, <GEN2>
> line 4.
> Use of uninitialized value in hash element at
> /usr/local/lib/perl5/site_perl/5.8.0/Bio/SimpleAlign.pm line 268, <GEN2>
> line 4.
> Use of uninitialized value in hash element at
> /usr/local/lib/perl5/site_perl/5.8.0/Bio/SimpleAlign.pm line 268, <GEN2>
> line 4.
> Use of uninitialized value in hash element at
> /usr/local/lib/perl5/site_perl/5.8.0/Bio/SimpleAlign.pm line 270, <GEN2>
> line 4.
>
> #########################
>
>
> #Script
> #! /usr/bin/perl -w
>
> use Bio::Factory::EMBOSS;
> use Bio::SeqIO;
> use Bio::AlignIO;
>
> die "Usage: perl script.pl [seqfileA] [seqfileB] [outfile]\n"
>     unless @ARGV == 3;
>
> #Read input files
> ($seqfileA,$seqfileB,$outfile) = @ARGV;
>
> #Initialize Object
> $EMBOSS = new Bio::Factory::EMBOSS;
>
> #Define emboss program to run
> $application = $EMBOSS->program('matcher');
>
> #Manipulate SeqfileA file
> $seqA = new Bio::SeqIO (-file => $seqfileA,
>    -format => 'fasta');
>
>
> while ($seqinA = $seqA->next_seq){
>     $inseqA = "asis::".$seqinA->seq;
>     $seqidA = $seqinA->id;
>
>
>     print "####$seqidA\n";
>     #Initialize seqB at every iteration of SeqA
>     $seqB = new Bio::SeqIO (-file => $seqfileB,
>        -format => 'fasta');
>
>     while ($seqinB = $seqB->next_seq){
>  #Format like asis::ATGCGA (required for emboss)
>  $inseqB = "asis::".$seqinB->seq;
>  $seqidB = $seqinB->id;
>
>  print "Processing sequence $seqidA..vs..$seqidB...";
>
>  #Define program parameters and run...
>  $application->run({
>      -sequencea => $inseqA,
>      -sequenceb => $inseqB,
>      -aformat => 'msf',
>      -outfile => $outfile });
>  print "done\n";
>
>  $alnin = new Bio::AlignIO(-format => 'msf',
>       -file  => $outfile    );
>
>  while ($aln = $alnin->next_aln){
>      print $aln->no_residues,"\n";
>      #print $aln->consensus_string,"\n";
>
>  }
>     }
> }