[Bioperl-l] bl2seq, Bio::AlignIO and looping through matches
rich
rich at thevillas.eclipse.co.uk
Wed Aug 4 07:12:07 EDT 2004
Hi,
I'm using bl2seq to align two sequences and parsing the results with
Bio::AlignIO.
As I understand it, when the two sequences match in more than one
position I should be able to loop through each alignment using
$aln = $report_align->next_aln()
The docs seem to back this up:
"The only current exception is format "bl2seq" which parses results of
the Blast bl2seq program and which may produce several alignment pairs.
This set of alignment pairs can be read using multiple calls to next_aln."
I can't get it to loop; it just pulls out one alignment, i.e. the first.
Code and example bl2seq file below.
Any help appreciated,
thanks
Rich
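use strict;
use warnings;
use Bio::AlignIO;
use Bio::Align::AlignI;

# Open the bl2seq report; per the docs, this format may yield several alignments.
my $report_align = Bio::AlignIO->new(-file   => '/tmp/align',
                                     -format => 'bl2seq');

# Print a running count for each alignment returned by next_aln().
my $loop_num = 1;
while (my $aln = $report_align->next_aln()) {
    print $loop_num++, "\n";
}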
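As a sanity check that does not depend on Bio::AlignIO, a minimal sketch like the one here counts the "Score =" lines in the report text, to confirm how many alignment pairs the parser ought to return. It assumes every alignment block starts with such a line, as in the example file; count_hsps and $sample are hypothetical names used only for illustration, and the sample text is a shortened copy of the two score headers from the report.

```perl
use strict;
use warnings;

# Count alignment blocks in BLAST-style text by counting "Score =" lines.
# Assumption: each block begins with a line starting with "Score =".
sub count_hsps {
    my ($text) = @_;
    # The "= () =" idiom forces list context, so $count gets the match count.
    my $count = () = $text =~ /^\s*Score\s*=/mg;
    return $count;
}

# Hypothetical sample: the two score headers from the example report.
my $sample = <<'END';
Score = 117 bits (292), Expect = 4e-30
Identities = 68/217 (31%), Positives = 120/217 (55%), Gaps = 3/217 (1%)
Score = 87.0 bits (214), Expect = 4e-21
Identities = 61/237 (25%), Positives = 116/237 (48%), Gaps = 20/237 (8%)
END

print count_hsps($sample), "\n";
```

If the count for the real file is 2 while the next_aln() loop prints only 1, the report itself is fine and the problem lies in how it is being parsed.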
Query=
(523 letters)
>, 1498 aa.
Length = 1498
Score = 117 bits (292), Expect = 4e-30
Identities = 68/217 (31%), Positives = 120/217 (55%), Gaps = 3/217 (1%)
Query: 299 GEVDVKDVTFTYQGKEKPALSHVSFSIPQGKTVALVGRSGSGKSTIANLFTRFYDVDSGS 358
G++ VKD+T Y L ++SFSI G+ V L+GR+GSGKST+ + F R + + G
Sbjct: 1226 GQMTVKDLTAKYTEGGNAILENISFSISPGQRVGLLGRTGSGKSTLLSAFLRLLNTE-GE 1284
Query: 359 ICLDGHDVRDYKLTNLRRHFALVSQNVHLFNDTIANNIAYAAEGEYTREQIEQAARQAHA 418
I +DG L R+ F ++ Q V +F+ T N+ +++ ++I + A +
Sbjct: 1285 IQIDGVSWDSITLQQWRKAFGVIPQKVFIFSGTFRKNLD--PYEQWSDQEIWKVADEVGL 1342
Query: 419 MEFIENMPQGLDTVIGENGTSLSGGQRQRVAIARALLRDAPVLILDEATSALDTESERAI 478
IE P LD V+ + G LS G +Q + +AR++L A +L+LDE ++ LD + + I
Sbjct: 1343 RSVIEQFPGKLDFVLVDGGCVLSHGHKQLMCLARSVLSKAKILLLDEPSAHLDPVTYQII 1402
Query: 479 QAALDELQKNKTVLVIAHRLSTIEQADEILVVDEGEI 515
+ L + + TV++ HR+ + + + LV++E ++
Sbjct: 1403 RRTLKQAFADCTVILCEHRIEAMLECQQFLVIEENKV 1439
Score = 87.0 bits (214), Expect = 4e-21
Identities = 61/237 (25%), Positives = 116/237 (48%), Gaps = 20/237 (8%)
Query: 278 FGLMDLETERDNGKYEAERVNGEVDVKDVTFTYQGKEKPALSHVSFSIPQGKTVALVGRS 337
FG + + +++N + NG+ + F+ G P L ++F I +G+ +A+ G +
Sbjct: 423 FGELFEKAKQNNNNRKTS--NGDDSLFFSNFSLLGT--PVLKDINFKIERGQLLAVAGST 478
Query: 338 GSGKSTIANLFTRFYDVDSGSICLDGHDVRDYKLTNLRRHFALVSQNVHLFNDTIANNIA 397
G+GK+++ + + G I G + SQ + TI NI
Sbjct: 479 GAGKTSLLMVIMGELEPSEGKIKHSGR-------------ISFCSQFSWIMPGTIKENII 525
Query: 398 YAAEGEYTREQIEQAARQAHAMEFIENMPQGLDTVIGENGTSLSGGQRQRVAIARALLRD 457
+ Y + + E I + + V+GE G +LSGGQR R+++ARA+ +D
Sbjct: 526 FGVS--YDEYRYRSVIKACQLEEDISKFAEKDNIVLGEGGITLSGGQRARISLARAVYKD 583
Query: 458 APVLILDEATSALDTESERAI-QAALDELQKNKTVLVIAHRLSTIEQADEILVVDEG 513
A + +LD LD +E+ I ++ + +L NKT +++ ++ +++AD+IL++ EG
Sbjct: 584 ADLYLLDSPFGYLDVLTEKEIFESCVCKLMANKTRILVTSKMEHLKKADKILILHEG 640
Lambda K H
0.322 0.136 0.384
Gapped
Lambda K H
0.267 0.0410 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 1908
Number of Sequences: 0
Number of extensions: 52
Number of successful extensions: 6
Number of sequences better than 10.0: 1
Number of HSP's better than 10.0 without gapping: 1
Number of HSP's successfully gapped in prelim test: 0
Number of HSP's that attempted gapping in prelim test: 0
Number of HSP's gapped (non-prelim): 2
length of query: 523
length of database: 1498
effective HSP length: 43
effective length of query: 480
effective length of database: 1455
effective search space: 698400
effective search space used: 698400
T: 11
A: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 30 (16.8 bits)
S2: 30 (16.2 bits)