[Bioperl-l] Bio::SearchIO blast_parsing dot_at_the_end_of query_accession

James Wasmuth james.wasmuth at ed.ac.uk
Wed Mar 17 16:10:53 EST 2004


Hi Jason, I work with Ralf, and sensibly he's gone home, but here's the 
report file:

BLASTP 2.2.6 [Apr-09-2003]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Query= ZPP00163	
         (199 letters)

Database: swissall.fsa 
           1,218,016 sequences; 390,760,073 total letters

Searching..................................................done

                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

NUKM_CAEEL (Q94360) Probable NADH-ubiquinone oxidoreductase 20 k...   177   6e-44
Q9VAK5 (Q9VAK5) CG2014 protein                                        166   1e-40
NUKM_BOVIN (P42026) NADH-ubiquinone oxidoreductase 20 kDa subuni...   166   2e-40
Q9VXK7 (Q9VXK7) CG9172 protein (LD31474P)                             165   4e-40
Q9BV17 (Q9BV17) Similar to CG9172 gene product (NADH:ubiquinone ...   163   1e-39
NUKM_HUMAN (O75251) NADH-ubiquinone oxidoreductase 20 kDa subuni...   163   1e-39
Q7PRW3 (Q7PRW3) ENSANGP00000019428 (Fragment)                         162   2e-39
Q9DC70 (Q9DC70) 1010001M04Rik protein (RIKEN cDNA 1010001M04 gene)    159   2e-38
Q86EG3 (Q86EG3) Clone ZZD432 mRNA sequence                            155   4e-37
Q7PQT1 (Q7PQT1) ENSANGP00000014787 (Fragment)                         150   7e-36
Q7PQU6 (Q7PQU6) ENSANGP00000017108 (Fragment)                         150   7e-36
NUKM_RECAM (O21272) NADH-ubiquinone oxidoreductase 20 kDa subuni...   146   2e-34
Q9G8U4 (Q9G8U4) NADH dehydrogenase subunit 10 (EC 1.6.5.3)            144   5e-34
Q8NAS7 (Q8NAS7) Hypothetical protein FLJ34850                         144   9e-34
Q7RZB4 (Q7RZB4) Hypothetical protein                                  143   2e-33
NUKM_NEUCR (O47950) NADH-ubiquinone oxidoreductase 19.3 kDa subu...   143   2e-33
Q8W0E8 (Q8W0E8) Putative NADH dehydrogenase (Ubiquinone) chain PSST   141   6e-33
Q9TCA4 (Q9TCA4) NADH dehydrogenase subunit 10 (EC 1.6.5.3)            140   1e-32
Q9SP38 (Q9SP38) NADH ubiquinone oxidoreductase PSST subunit           139   2e-32
Q9LKG9 (Q9LKG9) NADH-ubiquinone oxidoreductase subunit PSST (Fra...   139   2e-32
NUKM_SOLTU (Q43844) NADH-ubiquinone oxidoreductase 20 kDa subuni...   139   2e-32
NUKM_BRAOL (P42027) NADH-ubiquinone oxidoreductase 20 kDa subuni...   139   2e-32
NUKM_ARATH (Q42577) NADH-ubiquinone oxidoreductase 20 kDa subuni...   138   5e-32
Q9LKH4 (Q9LKH4) NADH-ubiquinone oxidoreductase subunit PSST           137   1e-31
Q9UUT7 (Q9UUT7) Subunit NUKM of protein NADH:ubiquinone oxidored...   134   5e-31
Q7PBK3 (Q7PBK3) NADH dehydrogenase I chain B                          134   9e-31
NUOB_RICCN (Q92ID6) NADH-quinone oxidoreductase chain B (EC 1.6....   134   9e-31


>>NUKM_CAEEL (Q94360) Probable NADH-ubiquinone oxidoreductase 20 kDa
>  
>
           subunit, mitochondrial precursor (EC 1.6.5.3) (EC
           1.6.99.3) (Complex I-20KD) (CI-20KD)
          Length = 199

 Score =  177 bits (450), Expect = 6e-44
 Identities = 88/161 (54%), Positives = 106/161 (65%)

Query: 39  GKGVFGSPFVQSESKGEWALASLDDVINLCGKTSLWPLTFGLXXXXXXXXHFAAPRYDMD 98
           G    G+PF+   SK E+ALA LDDV+NL  + S+WPLTFGL        HFAAPRYDMD
Sbjct: 31  GIATTGTPFLNPSSKAEYALARLDDVLNLAQRGSIWPLTFGLACCAVEMMHFAAPRYDMD 90

Query: 99  HYGVVFHATPXQVNLILFTGTITNKMAPALHHIYNQMPKPKYVISMESYTNGSGYYHYTY 158
            YGVVF A+P Q +LI   GT+TNKMAPAL  IY+QMP+ K+VISM S  NG GYYHY Y
Sbjct: 91  RYGVVFRASPRQADLIFVAGTVTNKMAPALRRIYDQMPEAKWVISMGSCANGGGYYHYAY 150

Query: 159 SIVHNYNHMXXXXXXXXXXXXTAKXLLYNILQLQKKIKHSR 199
           S++   + +            TA+ LLY +LQLQKKIK  R
Sbjct: 151 SVLRGCDRVIPVDIYVPGCPPTAEALLYGVLQLQKKIKRKR 191



>>Q9VAK5 (Q9VAK5) CG2014 protein
>  
>
          Length = 212

 Score =  166 bits (421), Expect = 1e-40
 Identities = 82/162 (50%), Positives = 103/162 (63%), Gaps = 3/162 (1%)

Query: 38  WGKGVFGSPFVQSESKGEWALASLDDVINLCGKTSLWPLTFGLXXXXXXXXHFAAPRYDM 97
           WG   FG      ++ GEW  A LDD++N   K SLWPLTFGL        H AAPRYDM
Sbjct: 46  WGYSAFGR---NQKTWGEWTCARLDDLLNWGRKGSLWPLTFGLACCAVEMMHIAAPRYDM 102

Query: 98  DHYGVVFHATPXQVNLILFTGTITNKMAPALHHIYNQMPKPKYVISMESYTNGSGYYHYT 157
           D YGVVF A+P Q ++++  GT+TNKMAPA   IY+QMP+P++VISM S  NG GYYHY+
Sbjct: 103 DRYGVVFRASPRQADVLIVAGTLTNKMAPAFRKIYDQMPEPRWVISMGSCANGGGYYHYS 162

Query: 158 YSIVHNYNHMXXXXXXXXXXXXTAKXLLYNILQLQKKIKHSR 199
           YS+V   + +            TA+ L+Y ILQLQKK+K  R
Sbjct: 163 YSVVRGCDRIVPVDIYVPGCPPTAEALMYGILQLQKKVKRMR 204


  Database: swissall.fsa
    Posted date:  Dec 5, 2003  3:22 PM
  Number of letters in database: 390,760,073
  Number of sequences in database:  1,218,016
  
Lambda     K      H
   0.318    0.134    0.417 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 150,792,462
Number of Sequences: 1218016
Number of extensions: 5616912
Number of successful extensions: 15131
Number of sequences better than 1.0e-30: 27
Number of HSP's better than  0.0 without gapping: 10
Number of HSP's successfully gapped in prelim test: 17
Number of HSP's that attempted gapping in prelim test: 15095
Number of HSP's gapped (non-prelim): 27
length of query: 199
length of database: 390,760,073
effective HSP length: 116
effective length of query: 83
effective length of database: 249,470,217
effective search space: 20706028011
effective search space used: 20706028011
T: 11
A: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 336 (134.0 bits)
BLASTP 2.2.6 [Apr-09-2003]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Query= ZPP00036_1	
         (229 letters)

Database: swissall.fsa 
           1,218,016 sequences; 390,760,073 total letters

Searching..................................................done

                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

Q95XJ0 (Q95XJ0) Hypothetical protein                                  256   2e-67
Q95XI9 (Q95XI9) Hypothetical protein                                  256   2e-67
ATPG_DROME (O01666) ATP synthase gamma chain, mitochondrial prec...   202   2e-51
Q7Q3N8 (Q7Q3N8) AgCP11416 (Fragment)                                  197   7e-50
ATPG_BOVIN (P05631) ATP synthase gamma chain, mitochondrial prec...   184   1e-45
Q9ERA8 (Q9ERA8) ATP synthase gamma-subunit precursor                  178   4e-44
Q8TAS0 (Q8TAS0) Hypothetical protein (Fragment)                       178   6e-44
ATPG_MOUSE (Q91VR2) ATP synthase gamma chain, mitochondrial prec...   178   6e-44
ATPG_HUMAN (P36542) ATP synthase gamma chain, mitochondrial prec...   178   6e-44
Q910C3 (Q910C3) Mitochondrial ATP synthase gamma-subunit              177   1e-43
Q9D9D7 (Q9D9D7) 1700094F02Rik protein                                 175   4e-43
Q86DM7 (Q86DM7) Hypothetical protein Y69A2AR.18                       174   1e-42
Q8C2Q8 (Q8C2Q8) ATP synthase                                          172   2e-42
Q7ZXN3 (Q7ZXN3) Similar to ATP synthase, H+ transporting, mitoch...   172   4e-42
ATPG_RAT (P35435) ATP synthase gamma chain, mitochondrial (EC 3....   169   2e-41


>>Q95XJ0 (Q95XJ0) Hypothetical protein
>  
>
          Length = 299

 Score =  256 bits (653), Expect = 2e-67
 Identities = 131/225 (58%), Positives = 164/225 (72%), Gaps = 4/225 (1%)

Query: 5   YGNVEQQRNFATLKDISIRLKSVKNIQKMTXXXXXXXXXXXXXXDRELKGARVYGEGAQA 64
           + N EQ R FATLKDISIRLKSVKNIQK+T              +RELKGAR YG GA+ 
Sbjct: 16  FANAEQARGFATLKDISIRLKSVKNIQKITKSMKMVAAAKYAKAERELKGARAYGVGAKT 75

Query: 65  FYNNLAGDDAAVPKADASATTKKLLILMTSDSGLCGAVHTSIIKEAKNLIKNKPDNMEYK 124
           F++N+   D  V   +   + K++L+L+TSD GLCG VH+SI+KEAKN++ N  D  E +
Sbjct: 76  FFDNI---DPVVEGVEKQESKKQVLVLITSDRGLCGGVHSSIVKEAKNILNNAGDK-EIR 131

Query: 125 LVCIGDKSKAGMSRIYGNHILYTANEIGRLPPTFEDASIAALEILNSGYEFDEAEILYNR 184
           +V IGDKS+AG+ R+Y N IL + NEIGR PP+F DASIAA  IL+SGY+F+   IL+NR
Sbjct: 132 VVAIGDKSRAGLQRLYANSILLSGNEIGRAPPSFADASIAAKAILDSGYDFETGTILFNR 191

Query: 185 FKTVVSYQTSKLQVPPLATIKSNTKLYTYDSVDDDLLQSYAEYSL 229
           FKTVVSY+TSKLQ+ PL  IK+   L TYDSVDDD+LQSY+EYSL
Sbjct: 192 FKTVVSYETSKLQILPLEAIKAKEALSTYDSVDDDVLQSYSEYSL 236



>>Q95XI9 (Q95XI9) Hypothetical protein
>  
>
          Length = 313

 Score =  256 bits (653), Expect = 2e-67
 Identities = 131/225 (58%), Positives = 164/225 (72%), Gaps = 4/225 (1%)

Query: 5   YGNVEQQRNFATLKDISIRLKSVKNIQKMTXXXXXXXXXXXXXXDRELKGARVYGEGAQA 64
           + N EQ R FATLKDISIRLKSVKNIQK+T              +RELKGAR YG GA+ 
Sbjct: 16  FANAEQARGFATLKDISIRLKSVKNIQKITKSMKMVAAAKYAKAERELKGARAYGVGAKT 75

Query: 65  FYNNLAGDDAAVPKADASATTKKLLILMTSDSGLCGAVHTSIIKEAKNLIKNKPDNMEYK 124
           F++N+   D  V   +   + K++L+L+TSD GLCG VH+SI+KEAKN++ N  D  E +
Sbjct: 76  FFDNI---DPVVEGVEKQESKKQVLVLITSDRGLCGGVHSSIVKEAKNILNNAGDK-EIR 131

Query: 125 LVCIGDKSKAGMSRIYGNHILYTANEIGRLPPTFEDASIAALEILNSGYEFDEAEILYNR 184
           +V IGDKS+AG+ R+Y N IL + NEIGR PP+F DASIAA  IL+SGY+F+   IL+NR
Sbjct: 132 VVAIGDKSRAGLQRLYANSILLSGNEIGRAPPSFADASIAAKAILDSGYDFETGTILFNR 191

Query: 185 FKTVVSYQTSKLQVPPLATIKSNTKLYTYDSVDDDLLQSYAEYSL 229
           FKTVVSY+TSKLQ+ PL  IK+   L TYDSVDDD+LQSY+EYSL
Sbjct: 192 FKTVVSYETSKLQILPLEAIKAKEALSTYDSVDDDVLQSYSEYSL 236


  Database: swissall.fsa
    Posted date:  Dec 5, 2003  3:22 PM
  Number of letters in database: 390,760,073
  Number of sequences in database:  1,218,016
  
Lambda     K      H
   0.314    0.132    0.358 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 148,369,851
Number of Sequences: 1218016
Number of extensions: 5173956
Number of successful extensions: 9119
Number of sequences better than 1.0e-30: 15
Number of HSP's better than  0.0 without gapping: 2
Number of HSP's successfully gapped in prelim test: 13
Number of HSP's that attempted gapping in prelim test: 9087
Number of HSP's gapped (non-prelim): 15
length of query: 229
length of database: 390,760,073
effective HSP length: 118
effective length of query: 111
effective length of database: 247,034,185
effective search space: 27420794535
effective search space used: 27420794535
T: 11
A: 40
X1: 16 ( 7.2 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 42 (21.9 bits)
S2: 337 (134.4 bits)
BLASTP 2.2.6 [Apr-09-2003]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Query= ZPP00157	
         (124 letters)

Database: swissall.fsa 
           1,218,016 sequences; 390,760,073 total letters

Searching..................................................done

                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

RL23_CAEEL (P48158) 60S ribosomal protein L23                         207   3e-53
Q7PMD8 (Q7PMD8) ENSANGP00000014430 (Fragment)                         199   6e-51
RL23_AEDAE (Q9GNE2) 60S ribosomal protein L23 (L17A)                  199   6e-51
Q9W1Y7 (Q9W1Y7) RPL17A protein                                        199   7e-51
RL23_BRUMA (Q93140) 60S ribosomal protein L23                         196   6e-50
Q962Y9 (Q962Y9) Ribosomal protein L17/23                              193   4e-49
Q7ZWJ5 (Q7ZWJ5) Hypothetical protein (Fragment)                       191   2e-48
Q90YU5 (Q90YU5) Ribosomal protein L23                                 189   4e-48
Q8IT99 (Q8IT99) Ribosomal protein L17A (Fragment)                     189   4e-48
Q9BTQ7 (Q9BTQ7) Similar to ribosomal protein L23 (Fragment)           189   4e-48
RL23_HUMAN (P23131) 60S ribosomal protein L23 (L17)                   189   4e-48
Q9DFL3 (Q9DFL3) Ribosomal protein L23                                 189   6e-48
Q86D71 (Q86D71) Ribosomal protein L23                                 189   6e-48
RL23_DROME (P48159) 60S ribosomal protein L23 (L17A)                  189   8e-48
Q9W6M3 (Q9W6M3) Ribosomal protein L17                                 188   1e-47
Q9CZE6 (Q9CZE6) 2810009A01Rik protein                                 188   1e-47
Q9DCQ4 (Q9DCQ4) 2810009A01Rik protein                                 186   4e-47
Q9XSU3 (Q9XSU3) Ribosomal protein                                     186   6e-47
Q9YHX0 (Q9YHX0) Ribosomal protein L17 homolog (Fragment)              182   5e-46
O22686 (O22686) F19P19.5 protein                                      179   6e-45
Q9ATF6 (Q9ATF6) Ribosomal protein L17                                 179   6e-45
RL23_TOBAC (Q07760) 60S ribosomal protein L23                         179   6e-45
RL23_ARATH (P49690) 60S ribosomal protein L23                         179   6e-45
Q7X9K1 (Q7X9K1) Ribosomal Pr 117 (Fragment)                           178   1e-44
Q8L8P0 (Q8L8P0) Putative 60S ribosomal protein L17                    178   1e-44
Q7XDL5 (Q7XDL5) 60S ribosomal protein L17                             177   2e-44
Q9AV77 (Q9AV77) 60S ribosomal protein L17                             177   2e-44
O65068 (O65068) 60S ribosomal protein L17 (Fragment)                  176   5e-44
Q9SMI7 (Q9SMI7) 60S ribosomal protein L17 (Fragment)                  174   2e-43
O96636 (O96636) Ribosomal protein L17                                 172   6e-43
RL23_SCHPO (O42867) 60S ribosomal protein L23                         171   2e-42
RL23_YEAST (P04451) 60S ribosomal protein L23 (L17)                   171   2e-42
RL23_TORRU (Q9XEK8) 60S ribosomal protein L23 (L17)                   171   2e-42
Q7R814 (Q7R814) 60S ribosomal protein L23                             164   3e-40
Q7SHJ9 (Q7SHJ9) Hypothetical protein                                  163   3e-40
Q873R3 (Q873R3) Alkaline serine protease (Fragment)                   163   3e-40
Q8IE09 (Q8IE09) 60S ribosomal protein L23, putative                   162   6e-40
Q98RY7 (Q98RY7) 60S ribosomal protein L23                             154   2e-37
Q7QTA5 (Q7QTA5) GLP_15_22119_21691                                    149   9e-36
RL23_TRYCR (Q94776) 60S ribosomal protein L23 (L17) (TCEST082)        149   9e-36
Q9HIR9 (Q9HIR9) Probable 50S ribosomal protein L14                    134   2e-31
Q97BW6 (Q97BW6) Ribosomal protein large subunit L23                   133   4e-31
Q8SRA7 (Q8SRA7) Ribosomal protein L23                                 132   8e-31


>>RL23_CAEEL (P48158) 60S ribosomal protein L23
>  
>
          Length = 140

 Score =  207 bits (526), Expect = 3e-53
 Identities = 102/113 (90%), Positives = 105/113 (92%)

Query: 1   LSLPVGFXINXADNTGANNLFVIAVYGMRGRLNRLPSAGVGDMFVASVKKGKPELRKKVL 60
           L LPVG  +N ADNTGA NLFVI+VYG+RGRLNRLPSAGVGDMFV SVKKGKPELRKKVL
Sbjct: 18  LGLPVGAVMNCADNTGAKNLFVISVYGIRGRLNRLPSAGVGDMFVCSVKKGKPELRKKVL 77

Query: 61  QAVVIRQRKMFRRKDGTFIYFEDNAGVIVNNKGEMKGSAITGPVAKECADLWP 113
           Q VVIRQRK FRRKDGTFIYFEDNAGVIVNNKGEMKGSAITGPVAKECADLWP
Sbjct: 78  QGVVIRQRKQFRRKDGTFIYFEDNAGVIVNNKGEMKGSAITGPVAKECADLWP 130



>>Q7PMD8 (Q7PMD8) ENSANGP00000014430 (Fragment)
>  
>
          Length = 135

 Score =  199 bits (506), Expect = 6e-51
 Identities = 97/113 (85%), Positives = 105/113 (92%)

Query: 1   LSLPVGFXINXADNTGANNLFVIAVYGMRGRLNRLPSAGVGDMFVASVKKGKPELRKKVL 60
           L LPVG  IN ADNTGA NL+VIAV+G+RGRLNRLP+AGVGDMFVA+VKKGKPELRKKV+
Sbjct: 13  LGLPVGAVINCADNTGAKNLYVIAVHGIRGRLNRLPAAGVGDMFVATVKKGKPELRKKVM 72

Query: 61  QAVVIRQRKMFRRKDGTFIYFEDNAGVIVNNKGEMKGSAITGPVAKECADLWP 113
            AVVIRQRK FRR+DG F+YFEDNAGVIVNNKGEMKGSAITGPVAKECADLWP
Sbjct: 73  PAVVIRQRKPFRRRDGVFLYFEDNAGVIVNNKGEMKGSAITGPVAKECADLWP 125


  Database: swissall.fsa
    Posted date:  Dec 5, 2003  3:22 PM
  Number of letters in database: 390,760,073
  Number of sequences in database:  1,218,016
  
Lambda     K      H
   0.321    0.140    0.409 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Hits to DB: 83,601,214
Number of Sequences: 1218016
Number of extensions: 3001107
Number of successful extensions: 5578
Number of sequences better than 1.0e-30: 43
Number of HSP's better than  0.0 without gapping: 42
Number of HSP's successfully gapped in prelim test: 1
Number of HSP's that attempted gapping in prelim test: 5533
Number of HSP's gapped (non-prelim): 43
length of query: 124
length of database: 390,760,073
effective HSP length: 100
effective length of query: 24
effective length of database: 268,958,473
effective search space: 6455003352
effective search space used: 6455003352
T: 11
A: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 332 (132.5 bits)


>
>
> Jason Stajich wrote:
>
>>ralf - can you send an example report - if I use your code and the
>>t/data/ecolitst.bls report in the bioperl distro I get this
>>jason at jason $ perl ralf_bug.pl
>>AAC73113.1
>>AAC73113.1
>>AAC73113.1
>>AAC73113.1
>>
>>also see below.
>>
>>-j
>>On Wed, 17 Mar 2004, Ralf Schmid wrote:
>>
>>  
>>
>>>Hi,
>>>
>>>I have recently updated bioperl from 1.21 to 1.4 and this has broken one of my
>>>blast parsing scripts. Using the following snippet of code on a blast output (3
>>>input sequences, -b 2 option for retrieving only two alignments, otherwise
>>>standard) gives different results:
>>>
>>>=> code
>>>
>>>#!/usr/bin/perl -w
>>>use strict;
>>>use Bio::SearchIO;
>>>my $in = new Bio::SearchIO( -format => 'blast',
>>>                            -file   => "test.out");
>>>my $prot = '';
>>>while( my $result = $in->next_result )  {
>>>  while (my $hit = $result->next_hit) {
>>>    $prot=$result->query_accession;
>>>    print"$prot\n";
>>>  }
>>>}
>>>
>>>
>>>=> output bioperl 1.21:
>>>
>>>ZPP00163
>>>ZPP00163
>>>ZPP00036_1
>>>ZPP00036_1
>>>ZPP00157
>>>ZPP00157
>>>
>>>- query accession is retrieved for each hit where there is an alignment
>>>
>>>=> output bioperl 1.4:
>>>
>>>ZPP00163.
>>>ZPP00163.
>>>ZPP00163.
>>>ZPP00163.
>>>ZPP00163.
>>>ZPP00163.
>>>ZPP00163.
>>>...
>>>
>>>- query_accession is retrieved for each hit regardless of whether there is an
>>>alignment or not
>>>- each query_accession ends with a "."
>>>
>>>
>>>So far I have taken advantage of the blast -b option to set the number of hits
>>>to be parsed by bioperl, but I can see the rationale for changing bioperl from
>>>parsing every hit that has an alignment to parsing every hit.
>>>
>>>    
>>>
>>This was a requested feature.  You can add a little code which exits the
>>hit loop if the hit doesn't have any hsps
>>last if $hit->num_hsps == 0;
>>
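Jason's check could be wrapped around Ralf's hit loop roughly as follows. This is only a sketch: hits_with_alignments is a made-up helper name, not part of BioPerl, and it assumes only that the result object answers next_hit and each hit answers num_hsps, as in the code quoted in this thread. It uses "last" rather than "next" on the assumption that hits listed without an alignment come after the ones that have one, which is how the -b report above is laid out.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): collect hits from a result
# object until the first hit that carries no HSPs.
sub hits_with_alignments {
    my ($result) = @_;
    my @hits;
    while ( my $hit = $result->next_hit ) {
        # Jason's check: a hit listed without an alignment has no HSPs.
        last if $hit->num_hsps == 0;
        push @hits, $hit;
    }
    return @hits;
}

# Possible use inside Ralf's outer loop (assumes $result from Bio::SearchIO):
#   for my $hit ( hits_with_alignments($result) ) {
#       print $result->query_accession, "\n";
#   }
```

With the three-query report above and -b 2, this should bring the output back to two lines per query, as under bioperl 1.21, though I have not run it against 1.4 here.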
>>  
>>
>>>Looking at the diff between blast.pm 1.42.2.8 and blast.pm 1.76 and finding the
>>>helpful comment in line 769 makes me believe that this is where the change in
>>>parsing is coded, but I couldn't spot any reason for the "." at the end of each
>>>query_accession. Not sure whether the two are related anyway.
>>>
>>><SNIP>
>>># This is for the case when we specify -b 0 (or B=0 for WU-BLAST)
>>># and still want to construct minimal Hit objects
>>>while(my $v = shift @hit_signifs) {
>>>next unless defined $v;
>>>$self->start_element({ 'Name' => 'Hit'});
>>>...
>>><SNIP>
>>>
>>>So far I'm fixing the "dot" issue by an s/\.$// , but ...
>>>
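For anyone wanting the same stopgap, Ralf's s/\.$// can be kept in one place so it is easy to remove once the parser is fixed. A minimal sketch; strip_trailing_dot is a hypothetical name, and the sample strings are the accessions from this thread. Only a dot at the very end is removed, so a versioned accession such as AAC73113.1 passes through unchanged.

```perl
use strict;
use warnings;

# Hypothetical helper: remove a single trailing "." from an accession string.
sub strip_trailing_dot {
    my ($acc) = @_;
    return $acc unless defined $acc;
    # Copy first so the caller's string is left untouched.
    ( my $clean = $acc ) =~ s/\.$//;
    return $clean;
}

print strip_trailing_dot('ZPP00163.'),  "\n";   # prints ZPP00163
print strip_trailing_dot('ZPP00036_1'), "\n";   # prints ZPP00036_1
```

In the script this would be called as strip_trailing_dot($result->query_accession).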
>>>
>>>Cheers,
>>>
>>>Ralf
>>>
>>>
>>>
>>>
>>>
>>>
>>>------------------------------------------------------------------------------
>>>Dr. Ralf Schmid
>>>Nematode Bioinformatics
>>>Blaxter Nematode Genomics Group
>>>Institute of Cell, Animal and Population Biology
>>>Ashworth Labs
>>>University of Edinburgh
>>>King's Buildings
>>>Edinburgh
>>>EH9 3JT
>>>UK
>>>
>>>(+44)(0)131 650 7403
>>>
>>>
>>>
>>>
>>>
>>>_______________________________________________
>>>Bioperl-l mailing list
>>>Bioperl-l at portal.open-bio.org
>>>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>
>>>    
>>>
>>
>>--
>>Jason Stajich
>>Duke University
>>jason at cgt.mc.duke.edu
>>  
>>


