[Bioperl-l] Bio::Tools::Run::StandAloneBlast.pm - bl2seq question
Caleb Davis
cdavis at bcm.tmc.edu
Wed Aug 18 19:19:53 UTC 2010
Hello, thank you for bioperl!
I am getting discrepancies between the online bl2seq
(www.ncbi.nlm.nih.gov/blast/*bl2seq*/wblast2.cgi) and bioperl's
implementation, and I'm not sure why. I'm seeing a desired behavior
through the web interface but can't replicate it locally. Specifically,
online bl2seq aligns across a 1 bp insertion in the subject whereas the
local bl2seq just reports a shorter alignment.
Any ideas? Thanks again,
--Caleb
The desired parameter differences from default are -F F -W 7 (turn
complexity filter off, word size = 7). Below I present the online and
local results given the following input sequences:
>consensus
GAGGATCCAGAATTCTC
>FVFTF6N01A86BR
AACCCAATGTAAGGAAGCTAAGAACCTTGAAAAGAGGATACCAGAATTCTC
Here are the parameters and result I'm getting online:
Blast4-request ::= {
body queue-search {
program "blastn",
service "plain",
queries bioseq-set {
seq-set {
seq {
id {
local id 26297
},
descr {
title "consensus",
user {
type str "CFastaReader",
data {
{
label str "DefLine",
data str ">consensus"
}
}
}
},
inst {
repr raw,
mol na,
length 17,
seq-data ncbi2na '8A3520F740'H
}
}
}
},
subject sequences {
{
id {
local id 26299
},
descr {
title "FVFTF6N01A86BR",
user {
type str "CFastaReader",
data {
{
label str "DefLine",
data str ">FVFTF6N01A86BR"
}
}
}
},
inst {
repr raw,
mol na,
length 51,
seq-data ncbi2na '0543B0A09C205F80228C520F74'H
}
}
},
algorithm-options {
{
name "EvalueThreshold",
value cutoff e-value { 1, 10, 1 }
},
{
name "UngappedMode",
value boolean FALSE
},
{
name "PercentIdentity",
value real { 0, 10, 0 }
},
{
name "HitlistSize",
value integer 100
},
{
name "EffectiveSearchSpace",
value big-integer 0
},
{
name "DbLength",
value big-integer 0
},
{
name "WindowSize",
value integer 0
},
{
name "DustFiltering",
value boolean FALSE
},
{
name "RepeatFiltering",
value boolean FALSE
},
{
name "MaskAtHash",
value boolean TRUE
},
{
name "MismatchPenalty",
value integer -3
},
{
name "MatchReward",
value integer 2
},
{
name "GapOpeningCost",
value integer 5
},
{
name "GapExtensionCost",
value integer 2
},
{
name "StrandOption",
value strand-type both-strands
},
{
name "WordSize",
value integer 7
}
},
format-options {
{
name "Web_JobTitle",
value string "consensus"
},
{
name "Web_BlastSpecialPage",
value string "blast2seq"
}
}
}
}
>lcl|30439 FVFTF6N01A86BR
Length=51
Sort alignments
for this subject sequence by:
E value
Score Percent identity
Query start
position Subject start position
Score = 24.7 bits (26), Expect = 2e-05
Identities = 17/18 (94%), Gaps = 1/18 (5%)
Strand=Plus/Plus
Query 1 GAGGAT-CCAGAATTCTC 17
|||||| |||||||||||
Sbjct 34 GAGGATACCAGAATTCTC 51
Here's the output from a local search (I changed the expect to 5.0 just
to prove to myself that some parameters are getting through OK):
my @params = (-program => 'blastn', -outfile => 'bl2seq.out', -FILTER =>
'F', -WORDSIZE => 7, -expect => 5.0);
my $factory = Bio::Tools::Run::StandAloneBlast->new(@params);
my $bl2seq_report = $factory->bl2seq($cons_seqobj, $single_seqobj);
#consensus vs. FVFTF6N01A86BR
print Dumper $bl2seq_report->next_result;
$VAR1 = bless( {
'_inclusion_threshold' => undef,
'_queryacc' => 'adapter_consensus',
'_iteration_index' => 0,
'_iteration_count' => 1,
'_hits' => [],
'_hitindex' => 0,
'_querylength' => '17',
'_querydesc' => '',
'_iterations' => [
bless( {
'_oldhits_not_below_threshold' => [],
'_newhits_unclassified' => [],
'_number' => 1,
'_oldhits_newly_below_threshold' => [],
'_hit_factory' => bless( {
'interface' => 'Bio::Search::Hit::HitI',
'type' => 'Bio::Search::Hit::BlastHit',
'_loaded_types' => {
'Bio::Search::Hit::BlastHit' => 1
},
'_root_verbose' => 0
},
'Bio::Factory::ObjectFactory' ),
'_newhits_below_threshold' => [
{
'-algorithm' => 'BLASTN',
'-description' => '',
'-length' => '51',
'-query_len' => '17',
'-hsp_factory' => bless( {
'interface' => 'Bio::Search::HSP::HSPI',
'type' => 'Bio::Search::HSP::GenericHSP',
'_loaded_types' => {
'Bio::Search::HSP::GenericHSP' => 1
},
'_root_verbose' => 0
}, 'Bio::Factory::ObjectFactory' ),
'-name' => 'FVFTF6N01A86BR',
'-rank' => 1,
'-hsps' => [
{
'-query_start' => '7',
'-algorithm' => 'BLASTN',
'-hit_seq' => 'ccagaattctc',
'-hit_length' => '51',
'-query_length' => '17',
'-query_desc' => '',
'-query_frame' => 0,
'-rank' => 1,
'-hit_desc' => '',
'-query_end' => '17',
'-hit_name' => 'FVFTF6N01A86BR',
'-identical' => '11',
'-query_name' => 'adapter_consensus',
'-evalue' => '1e-04',
'-score' => '11',
'-conserved' => '11',
'-hit_frame' => 0,
'-hsp_length' => '11',
'-query_seq' => 'ccagaattctc',
'-hit_start' => '41',
'-homology_seq' => '|||||||||||',
'-hit_end' => '51',
'-bits' => '22.3'
},
{
'-query_start' => '9',
'-algorithm' => 'BLASTN',
'-hit_seq' => 'agaattct',
'-hit_length' => '51',
'-query_length' => '17',
'-query_desc' => '',
'-query_frame' => 0,
'-rank' => 2,
'-hit_desc' => '',
'-query_end' => '16',
'-hit_name' => 'FVFTF6N01A86BR',
'-identical' => '8',
'-query_name' => 'adapter_consensus',
'-evalue' => '0.007',
'-score' => '8',
'-conserved' => '8',
'-hit_frame' => 0,
'-hsp_length' => '8',
'-query_seq' => 'agaattct',
'-hit_start' => '50',
'-homology_seq' => '||||||||',
'-hit_end' => '43',
'-bits' => '16.4'
}
],
'-accession' => 'FVFTF6N01A86BR',
'-significance' => '1e-04'
}
],
'_root_verbose' => 0,
'_newhits_not_below_threshold' => [],
'_oldhits_below_threshold'
=> []
},
'Bio::Search::Iteration::GenericIteration' )
],
'_hit_factory' =>
$VAR1->{'_iterations'}[0]{'_hit_factory'},
'_statistics' => bless( {
'stats' => {
'S1' => '4',
'S1_bits' => '8.4',
'kappa_gapped'
=> '0.711',
'X3_bits' => '99.1',
'X1' => '4',
'lambda_gapped'
=> '1.37',
'X2' => '15',
'S2' => '4',
'seqs_better_than_cutoff' => '1',
'Hits_to_DB' => '5',
'num_extensions'
=> '2',
'num_successful_extensions' => '2',
'X1_bits' => '7.9',
'X3' => '50',
'dbentries' => '1',
'entropy_gapped'
=> '1.31',
'X2_bits' => '29.7',
'S2_bits' => '8.4'
}
},
'Bio::Search::GenericStatistics' ),
'_algorithm' => 'BLASTN',
'_parameters' => bless( {
'params' => {
'gapext' => '2',
'matrix' =>
'blastn matrix:1 -3',
'expect' => '5.0',
'allowgaps' =>
'yes',
'gapopen' => '5'
}
},
'Bio::Tools::Run::GenericParameters' ),
'_root_verbose' => 0,
'_queryname' => 'adapter_consensus'
}, 'Bio::Search::Result::BlastResult' );
More information about the Bioperl-l
mailing list