[Bioperl-l] Bio::Tools::Run::StandAloneBlast.pm - bl2seq question

Caleb Davis cdavis at bcm.tmc.edu
Wed Aug 18 19:19:53 UTC 2010


Hello, thank you for bioperl!

I am seeing a discrepancy between the online bl2seq
(www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi) and the local bl2seq run
through BioPerl, and I'm not sure why. The web interface gives the
behavior I want, but I can't replicate it locally. Specifically, the
online bl2seq aligns across a 1 bp insertion in the subject, whereas the
local bl2seq reports only a shorter alignment.

Any ideas? Thanks again,
--Caleb

The only intended differences from the defaults are -F F -W 7 (complexity
filter off, word size 7). Below are the online and local results for the
following input sequences:

>consensus
GAGGATCCAGAATTCTC
>FVFTF6N01A86BR
AACCCAATGTAAGGAAGCTAAGAACCTTGAAAAGAGGATACCAGAATTCTC

Here are the parameters and result I'm getting online:
Blast4-request ::= {
  body queue-search {
    program "blastn",
    service "plain",
    queries bioseq-set {
      seq-set {
        seq {
          id {
            local id 26297
          },
          descr {
            title "consensus",
            user {
              type str "CFastaReader",
              data {
                {
                  label str "DefLine",
                  data str ">consensus"
                }
              }
            }
          },
          inst {
            repr raw,
            mol na,
            length 17,
            seq-data ncbi2na '8A3520F740'H
          }
        }
      }
    },
    subject sequences {
      {
        id {
          local id 26299
        },
        descr {
          title "FVFTF6N01A86BR",
          user {
            type str "CFastaReader",
            data {
              {
                label str "DefLine",
                data str ">FVFTF6N01A86BR"
              }
            }
          }
        },
        inst {
          repr raw,
          mol na,
          length 51,
          seq-data ncbi2na '0543B0A09C205F80228C520F74'H
        }
      }
    },
    algorithm-options {
      {
        name "EvalueThreshold",
        value cutoff e-value { 1, 10, 1 }
      },
      {
        name "UngappedMode",
        value boolean FALSE
      },
      {
        name "PercentIdentity",
        value real { 0, 10, 0 }
      },
      {
        name "HitlistSize",
        value integer 100
      },
      {
        name "EffectiveSearchSpace",
        value big-integer 0
      },
      {
        name "DbLength",
        value big-integer 0
      },
      {
        name "WindowSize",
        value integer 0
      },
      {
        name "DustFiltering",
        value boolean FALSE
      },
      {
        name "RepeatFiltering",
        value boolean FALSE
      },
      {
        name "MaskAtHash",
        value boolean TRUE
      },
      {
        name "MismatchPenalty",
        value integer -3
      },
      {
        name "MatchReward",
        value integer 2
      },
      {
        name "GapOpeningCost",
        value integer 5
      },
      {
        name "GapExtensionCost",
        value integer 2
      },
      {
        name "StrandOption",
        value strand-type both-strands
      },
      {
        name "WordSize",
        value integer 7
      }
    },
    format-options {
      {
        name "Web_JobTitle",
        value string "consensus"
      },
      {
        name "Web_BlastSpecialPage",
        value string "blast2seq"
      }
    }
  }
}
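
As a sanity check that the web form received the same sequences as the FASTA input, the two seq-data fields in the request can be decoded by hand. The sketch below assumes the usual NCBI2na packing (2 bits per base, A=0 C=1 G=2 T=3, first base in the high-order bits); the helper name decode_ncbi2na is made up for this example.

```perl
use strict;
use warnings;

# Decode an NCBI2na hex string into a nucleotide string of the given length.
# Assumed packing: 2 bits per base, A=0 C=1 G=2 T=3, four bases per byte.
sub decode_ncbi2na {
    my ($hex, $length) = @_;
    my @bases = qw(A C G T);
    my $seq   = '';
    for my $byte (unpack 'C*', pack 'H*', $hex) {
        # Each byte holds four bases, most significant bit pair first.
        for my $shift (6, 4, 2, 0) {
            $seq .= $bases[ ($byte >> $shift) & 3 ];
        }
    }
    # The last byte may be padded, so keep only the declared length.
    return substr $seq, 0, $length;
}

my $query   = decode_ncbi2na('8A3520F740', 17);
my $subject = decode_ncbi2na('0543B0A09C205F80228C520F74', 51);
print "$query\n$subject\n";
```

Both decoded strings come out identical to the consensus and FVFTF6N01A86BR sequences listed earlier, so the input itself does not seem to be the source of the difference.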

 >lcl|30439 FVFTF6N01A86BR
Length=51


 Score = 24.7 bits (26),  Expect = 2e-05
 Identities = 17/18 (94%), Gaps = 1/18 (5%)
 Strand=Plus/Plus

Query  1   GAGGAT-CCAGAATTCTC  17
           |||||| |||||||||||
Sbjct  34  GAGGATACCAGAATTCTC  51
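
The identity and gap counts in the web report can be reproduced directly from the two aligned strings. This is only a minimal sketch of the bookkeeping, not how BLAST computes it internally; the variable names are made up for this example.

```perl
use strict;
use warnings;

# Aligned strings copied from the web report above.
my $query_aln = 'GAGGAT-CCAGAATTCTC';
my $sbjct_aln = 'GAGGATACCAGAATTCTC';

my ($identities, $gaps) = (0, 0);
my $columns = length $query_aln;
for my $i (0 .. $columns - 1) {
    my $q = substr $query_aln, $i, 1;
    my $s = substr $sbjct_aln, $i, 1;
    if    ($q eq '-' || $s eq '-') { $gaps++ }        # gap column
    elsif ($q eq $s)               { $identities++ }  # identical column
}

# Percentages are truncated here, which is consistent with the report.
printf "Identities = %d/%d (%d%%), Gaps = %d/%d (%d%%)\n",
    $identities, $columns, int(100 * $identities / $columns),
    $gaps,       $columns, int(100 * $gaps / $columns);
```

This prints "Identities = 17/18 (94%), Gaps = 1/18 (5%)", matching the report: all 17 query bases are aligned, with a single gap opposite the extra A in the subject.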

Here is the code and output from a local search (I changed the expect to
5.0 just to prove to myself that some parameters are getting through OK):
use Data::Dumper;
use Bio::Tools::Run::StandAloneBlast;

# $cons_seqobj (consensus) and $single_seqobj (FVFTF6N01A86BR) are Bio::Seq objects
my @params = (-program => 'blastn', -outfile => 'bl2seq.out',
              -FILTER => 'F', -WORDSIZE => 7, -expect => 5.0);
my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);
# consensus vs. FVFTF6N01A86BR
my $bl2seq_report = $factory->bl2seq($cons_seqobj, $single_seqobj);
print Dumper $bl2seq_report->next_result;

$VAR1 = bless( {
  '_inclusion_threshold' => undef,
  '_queryacc' => 'adapter_consensus',
  '_iteration_index' => 0,
  '_iteration_count' => 1,
  '_hits' => [],
  '_hitindex' => 0,
  '_querylength' => '17',
  '_querydesc' => '',
  '_iterations' => [
    bless( {
      '_oldhits_not_below_threshold' => [],
      '_newhits_unclassified' => [],
      '_number' => 1,
      '_oldhits_newly_below_threshold' => [],
      '_hit_factory' => bless( {
        'interface' => 'Bio::Search::Hit::HitI',
        'type' => 'Bio::Search::Hit::BlastHit',
        '_loaded_types' => {
          'Bio::Search::Hit::BlastHit' => 1
        },
        '_root_verbose' => 0
      }, 'Bio::Factory::ObjectFactory' ),
      '_newhits_below_threshold' => [
        {
          '-algorithm' => 'BLASTN',
          '-description' => '',
          '-length' => '51',
          '-query_len' => '17',
          '-hsp_factory' => bless( {
            'interface' => 'Bio::Search::HSP::HSPI',
            'type' => 'Bio::Search::HSP::GenericHSP',
            '_loaded_types' => {
              'Bio::Search::HSP::GenericHSP' => 1
            },
            '_root_verbose' => 0
          }, 'Bio::Factory::ObjectFactory' ),
          '-name' => 'FVFTF6N01A86BR',
          '-rank' => 1,
          '-hsps' => [
            {
              '-query_start' => '7',
              '-algorithm' => 'BLASTN',
              '-hit_seq' => 'ccagaattctc',
              '-hit_length' => '51',
              '-query_length' => '17',
              '-query_desc' => '',
              '-query_frame' => 0,
              '-rank' => 1,
              '-hit_desc' => '',
              '-query_end' => '17',
              '-hit_name' => 'FVFTF6N01A86BR',
              '-identical' => '11',
              '-query_name' => 'adapter_consensus',
              '-evalue' => '1e-04',
              '-score' => '11',
              '-conserved' => '11',
              '-hit_frame' => 0,
              '-hsp_length' => '11',
              '-query_seq' => 'ccagaattctc',
              '-hit_start' => '41',
              '-homology_seq' => '|||||||||||',
              '-hit_end' => '51',
              '-bits' => '22.3'
            },
            {
              '-query_start' => '9',
              '-algorithm' => 'BLASTN',
              '-hit_seq' => 'agaattct',
              '-hit_length' => '51',
              '-query_length' => '17',
              '-query_desc' => '',
              '-query_frame' => 0,
              '-rank' => 2,
              '-hit_desc' => '',
              '-query_end' => '16',
              '-hit_name' => 'FVFTF6N01A86BR',
              '-identical' => '8',
              '-query_name' => 'adapter_consensus',
              '-evalue' => '0.007',
              '-score' => '8',
              '-conserved' => '8',
              '-hit_frame' => 0,
              '-hsp_length' => '8',
              '-query_seq' => 'agaattct',
              '-hit_start' => '50',
              '-homology_seq' => '||||||||',
              '-hit_end' => '43',
              '-bits' => '16.4'
            }
          ],
          '-accession' => 'FVFTF6N01A86BR',
          '-significance' => '1e-04'
        }
      ],
      '_root_verbose' => 0,
      '_newhits_not_below_threshold' => [],
      '_oldhits_below_threshold' => []
    }, 'Bio::Search::Iteration::GenericIteration' )
  ],
  '_hit_factory' => $VAR1->{'_iterations'}[0]{'_hit_factory'},
  '_statistics' => bless( {
    'stats' => {
      'S1' => '4',
      'S1_bits' => '8.4',
      'kappa_gapped' => '0.711',
      'X3_bits' => '99.1',
      'X1' => '4',
      'lambda_gapped' => '1.37',
      'X2' => '15',
      'S2' => '4',
      'seqs_better_than_cutoff' => '1',
      'Hits_to_DB' => '5',
      'num_extensions' => '2',
      'num_successful_extensions' => '2',
      'X1_bits' => '7.9',
      'X3' => '50',
      'dbentries' => '1',
      'entropy_gapped' => '1.31',
      'X2_bits' => '29.7',
      'S2_bits' => '8.4'
    }
  }, 'Bio::Search::GenericStatistics' ),
  '_algorithm' => 'BLASTN',
  '_parameters' => bless( {
    'params' => {
      'gapext' => '2',
      'matrix' => 'blastn matrix:1 -3',
      'expect' => '5.0',
      'allowgaps' => 'yes',
      'gapopen' => '5'
    }
  }, 'Bio::Tools::Run::GenericParameters' ),
  '_root_verbose' => 0,
  '_queryname' => 'adapter_consensus'
}, 'Bio::Search::Result::BlastResult' );
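
One difference visible in the two outputs is the scoring scheme: the web request lists MatchReward 2 and MismatchPenalty -3, while the local run reports 'blastn matrix:1 -3'. Gap costs are 5/2 in both. The sketch below compares the two candidate alignments under each scheme. It assumes a simple affine model in which a gap of length k costs open + k * extend, and it ignores any rounding or adjustment BLAST may apply, so the figures are illustrative (the web report gives a raw score of 26, not the 27 computed here). The helper raw_score is hypothetical.

```perl
use strict;
use warnings;

# Raw alignment score under an assumed affine-gap model:
# matches * reward + mismatches * penalty - sum over gaps of (open + length * extend).
sub raw_score {
    my (%arg) = @_;
    my $score = $arg{matches} * $arg{reward} + $arg{mismatches} * $arg{penalty};
    $score -= $arg{gap_open} + $_ * $arg{gap_extend} for @{ $arg{gap_lengths} };
    return $score;
}

# Full-length alignment across the 1 bp insertion (web result).
my %gapped   = (matches => 17, mismatches => 0, gap_lengths => [1]);
# Shorter ungapped alignment (first local HSP).
my %ungapped = (matches => 11, mismatches => 0, gap_lengths => []);

for my $reward (2, 1) {
    my %scheme = (reward => $reward, penalty => -3, gap_open => 5, gap_extend => 2);
    printf "reward %d: gapped = %d, ungapped = %d\n",
        $reward, raw_score(%gapped, %scheme), raw_score(%ungapped, %scheme);
}
```

Under these assumptions the gapped alignment wins with reward 2 (27 vs. 22) but loses with reward 1 (10 vs. 11), which would be consistent with the local run reporting only the shorter HSP. This is arithmetic only, not a confirmed diagnosis. If it is the cause, setting the match reward to 2 locally (the -r option of legacy bl2seq, as far as I recall) may be worth trying.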



