[Bioperl-l] Re: Bioperl and matcher
Jason Stajich
jason@cgt.mc.duke.edu
Tue, 26 Nov 2002 11:05:22 -0500 (EST)
Our msf parser is seeing something it isn't expecting, and I'm not sure
why. What happens when you use the plain 'emboss' parser with standard
EMBOSS alignment output? That is the route that has been most heavily
tested.
-jason
Jason Stajich
Duke University
jason at cgt.mc.duke.edu
On Tue, 26 Nov 2002, Vilanova,David,LAUSANNE,NRC/BS wrote:
>
> Hello,
> I have problems retrieving the alignments from EMBOSS output.
> The program below reads 2 files and runs matcher on all against all.
> Matcher gives me msf output, and then I try to parse this alignment with
> Bio::AlignIO.
> However, I get an exception...
>
> Processing sequence 1..vs..3...done
>
> ------------- EXCEPTION -------------
> MSG: 1 exists as an alignment line but not in the header. Not confident of
> what is going on!
> STACK Bio::AlignIO::msf::next_aln
> /usr/local/lib/perl5/site_perl/5.8.0/Bio/AlignIO/msf.pm:106
> STACK toplevel Run_Emboss.pl:50
>
> --------------------------------------
>
> Here is the output from matcher:
> !!NA_MULTIPLE_ALIGNMENT 1.0
>
> out MSF: 5 Type: N 26/11/02 CompCheck: 2090 ..
>
> Name: EMBOSS_001 Len: 5 Check: 1045 Weight: 1.00
> Name: EMBOSS_002 Len: 5 Check: 1045 Weight: 1.00
>
> //
>
> 1 5
> EMBOSS_001 CGGCG
> EMBOSS_002 CGGCG
>
>
> ###########################################################
> It doesn't work for fasta format either in my script (see output below):
> Processing sequence 1..vs..3...done
> Use of uninitialized value in sprintf at
> /usr/local/lib/perl5/site_perl/5.8.0/Bio/SimpleAlign.pm line 257, <GEN2>
> line 4.
> Use of uninitialized value in hash element at
> /usr/local/lib/perl5/site_perl/5.8.0/Bio/SimpleAlign.pm line 268, <GEN2>
> line 4.
> Use of uninitialized value in hash element at
> /usr/local/lib/perl5/site_perl/5.8.0/Bio/SimpleAlign.pm line 268, <GEN2>
> line 4.
> Use of uninitialized value in hash element at
> /usr/local/lib/perl5/site_perl/5.8.0/Bio/SimpleAlign.pm line 270, <GEN2>
> line 4.
>
> #########################
>
>
> #Script
> #! /usr/bin/perl -w
>
> use Bio::Factory::EMBOSS;
> use Bio::SeqIO;
> use Bio::AlignIO;
>
> die "Usage: perl script.pl [seqfileA] [seqfileB] [outfile]\n" unless @ARGV == 3;
>
> #Read input files
> ($seqfileA,$seqfileB,$outfile) = @ARGV;
>
> #Initialize Object
> $EMBOSS = new Bio::Factory::EMBOSS;
>
> #Define emboss program to run
> $application = $EMBOSS->program('matcher');
>
> #Manipulate SeqfileA file
> $seqA = new Bio::SeqIO (-file   => $seqfileA,
>                         -format => 'fasta');
>
>
> while ($seqinA = $seqA->next_seq){
>     $inseqA = "asis::".$seqinA->seq;
>     $seqidA = $seqinA->id;
>
>
>     print "####$seqidA\n";
>     #Initialize seqB at every iteration of SeqA
>     $seqB = new Bio::SeqIO (-file   => $seqfileB,
>                             -format => 'fasta');
>
>     while ($seqinB = $seqB->next_seq){
>         $inseqB = "asis::".$seqinB->seq; #Format like asis::ATGCGA (required for emboss)
>         $seqidB = $seqinB->id;
>
>         print "Processing sequence $seqidA..vs..$seqidB...";
>
>         #Define program parameters and run...
>         $application->run({
>             -sequencea => $inseqA,
>             -sequenceb => $inseqB,
>             -aformat   => 'msf',
>             -outfile   => $outfile });
>         print "done\n";
>
>         $alnin = new Bio::AlignIO(-format => 'msf',
>                                   -file   => $outfile );
>
>         while ($aln = $alnin->next_aln){
>             print $aln->no_residues,"\n";
>             #print $aln->consensus_string,"\n";
>
>         }
>     }
> }