[Bioperl-l] bl2seq, Bio::AlignIO and looping through matches

rich rich at thevillas.eclipse.co.uk
Wed Aug 4 07:12:07 EDT 2004


Hi,

I'm using bl2seq to align two sequences and parsing the results with
Bio::AlignIO.

As I understand it, when the two sequences match in more than one
position, I should be able to loop through each alignment using

$aln = $report_align->next_aln()

The docs seem to back this up:
"The only current exception is format "bl2seq" which parses results of
the Blast bl2seq program and which may produce several alignment pairs.
This set of alignment pairs can be read using multiple calls to next_aln."

I can't get it to loop; it only pulls out one alignment, i.e. the first.
Code and example bl2seq file below.


Any help appreciated,
thanks
Rich

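use strict;
use warnings;
use Bio::AlignIO;
use Bio::Align::AlignI;

# Open the bl2seq report; this format may hold several alignment pairs.
my $report_align = Bio::AlignIO->new(-file   => '/tmp/align',
                                     -format => 'bl2seq');

my $loop_num = 1;

# I expect one iteration per alignment, but the body runs only once.
while (my $aln = $report_align->next_aln()) {
    print $loop_num++, "\n";
}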







Query=
         (523 letters)

 >, 1498 aa.
          Length = 1498

 Score =  117 bits (292), Expect = 4e-30
 Identities = 68/217 (31%), Positives = 120/217 (55%), Gaps = 3/217 (1%)

Query: 299  GEVDVKDVTFTYQGKEKPALSHVSFSIPQGKTVALVGRSGSGKSTIANLFTRFYDVDSGS 358
            G++ VKD+T  Y       L ++SFSI  G+ V L+GR+GSGKST+ + F R  + + G
Sbjct: 1226 GQMTVKDLTAKYTEGGNAILENISFSISPGQRVGLLGRTGSGKSTLLSAFLRLLNTE-GE 1284

Query: 359  ICLDGHDVRDYKLTNLRRHFALVSQNVHLFNDTIANNIAYAAEGEYTREQIEQAARQAHA 418
            I +DG       L   R+ F ++ Q V +F+ T   N+      +++ ++I + A +
Sbjct: 1285 IQIDGVSWDSITLQQWRKAFGVIPQKVFIFSGTFRKNLD--PYEQWSDQEIWKVADEVGL 1342

Query: 419  MEFIENMPQGLDTVIGENGTSLSGGQRQRVAIARALLRDAPVLILDEATSALDTESERAI 478
               IE  P  LD V+ + G  LS G +Q + +AR++L  A +L+LDE ++ LD  + + I
Sbjct: 1343 RSVIEQFPGKLDFVLVDGGCVLSHGHKQLMCLARSVLSKAKILLLDEPSAHLDPVTYQII 1402

Query: 479  QAALDELQKNKTVLVIAHRLSTIEQADEILVVDEGEI 515
            +  L +   + TV++  HR+  + +  + LV++E ++
Sbjct: 1403 RRTLKQAFADCTVILCEHRIEAMLECQQFLVIEENKV 1439



 Score = 87.0 bits (214), Expect = 4e-21
 Identities = 61/237 (25%), Positives = 116/237 (48%), Gaps = 20/237 (8%)

Query: 278 FGLMDLETERDNGKYEAERVNGEVDVKDVTFTYQGKEKPALSHVSFSIPQGKTVALVGRS 337
           FG +  + +++N   +    NG+  +    F+  G   P L  ++F I +G+ +A+ G +
Sbjct: 423 FGELFEKAKQNNNNRKTS--NGDDSLFFSNFSLLGT--PVLKDINFKIERGQLLAVAGST 478

Query: 338 GSGKSTIANLFTRFYDVDSGSICLDGHDVRDYKLTNLRRHFALVSQNVHLFNDTIANNIA 397
           G+GK+++  +     +   G I   G               +  SQ   +   TI  NI
Sbjct: 479 GAGKTSLLMVIMGELEPSEGKIKHSGR-------------ISFCSQFSWIMPGTIKENII 525

Query: 398 YAAEGEYTREQIEQAARQAHAMEFIENMPQGLDTVIGENGTSLSGGQRQRVAIARALLRD 457
           +     Y   +     +     E I    +  + V+GE G +LSGGQR R+++ARA+ +D
Sbjct: 526 FGVS--YDEYRYRSVIKACQLEEDISKFAEKDNIVLGEGGITLSGGQRARISLARAVYKD 583

Query: 458 APVLILDEATSALDTESERAI-QAALDELQKNKTVLVIAHRLSTIEQADEILVVDEG 513
           A + +LD     LD  +E+ I ++ + +L  NKT +++  ++  +++AD+IL++ EG
Sbjct: 584 ADLYLLDSPFGYLDVLTEKEIFESCVCKLMANKTRILVTSKMEHLKKADKILILHEG 640


Lambda     K      H
   0.322    0.136    0.384

Gapped
Lambda     K      H
   0.267   0.0410    0.140


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 1908
Number of Sequences: 0
Number of extensions: 52
Number of successful extensions: 6
Number of sequences better than 10.0: 1
Number of HSP's better than 10.0 without gapping: 1
Number of HSP's successfully gapped in prelim test: 0
Number of HSP's that attempted gapping in prelim test: 0
Number of HSP's gapped (non-prelim): 2
length of query: 523
length of database: 1498
effective HSP length: 43
effective length of query: 480
effective length of database: 1455
effective search space:   698400
effective search space used:   698400
T: 11
A: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 30 (16.8 bits)
S2: 30 (16.2 bits)



