[Bioperl-l] Setting Theoretical Database size for bl2seq
khoueiry
khoueiry at ibdm.univ-mrs.fr
Wed Oct 26 12:44:05 EDT 2005
Tembe,
I did a quick search and found the following.
I have bl2seq installed on my machine, so a simple "man bl2seq" gave me
an idea of the parameters. It says that -d corresponds to:
"-d N (bl2seq)--- Use theoretical DB size of N (zero stands for
the real size)"
A quick Google search gave me a similar result, which you can find at
this link: http://hits.isb-sib.ch/doc/motif_score.shtml . Briefly, it
says that when calculating an E-value, especially when converting from a
normalized score, you have to use the database size in residues.
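To illustrate why the database size matters here, below is a minimal Python
sketch. It assumes the standard bit-score relation E = m * n * 2^(-S'), where
m is the query length, n is the database length in residues and S' is the
normalized (bit) score. The function name, the query length of 100, the bit
score of 50 and the 5,000,000-residue genome are illustrative values of mine,
not taken from your output. BLAST itself works with effective lengths, so the
real figures will differ somewhat.

```python
def evalue_from_bit_score(bit_score, query_len, db_len):
    """Expected number of chance hits: E = m * n * 2**(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Illustrative values only (assumptions, not from the original post).
BIT_SCORE = 50.0
QUERY_LEN = 100

NT_DB_LEN = 12_254_801_043   # "length of database" from the blastall output
GENOME_LEN = 5_000_000       # hypothetical length of genome "g"

e_nt = evalue_from_bit_score(BIT_SCORE, QUERY_LEN, NT_DB_LEN)
e_genome = evalue_from_bit_score(BIT_SCORE, QUERY_LEN, GENOME_LEN)

# For the same bit score, the E-value scales linearly with database size.
print(f"E against nt-sized database: {e_nt:.3g}")
print(f"E against genome only:       {e_genome:.3g}")
print(f"ratio:                       {e_nt / e_genome:.1f}")
```

The same alignment therefore gets a much larger E-value against the nt-sized
database than against the single genome, which is why the two runs are only
comparable when they use the same database size.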
So, I think that in your case, it will correspond to the "length of
database: 12,254,801,043".
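If that is right, the call could be assembled roughly as in the Python sketch
below. The file names q.fa and g.fa are placeholders, and I am assuming the
usual bl2seq options -i and -j for the two sequences and -p for the program;
please check them against "man bl2seq" on your installation.

```python
def bl2seq_command(first_seq, second_seq, db_size, program="blastn"):
    """Build a bl2seq argument list with a theoretical database size (-d)."""
    return [
        "bl2seq",
        "-i", first_seq,     # first sequence (placeholder file name)
        "-j", second_seq,    # second sequence (placeholder file name)
        "-p", program,       # assumed program; adjust to your data
        "-d", str(db_size),  # theoretical DB size, 0 means the real size
    ]


# "length of database" from the blastall output, with the commas removed.
cmd = bl2seq_command("q.fa", "g.fa", 12254801043)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd) on a machine where
bl2seq is installed.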
I hope this is the right answer, and that others can give you more
details if possible.
Pierre
On Wed, 2005-10-26 at 11:12 -0400, Waibhav Tembe wrote:
> Hello List,
>
> This is not a BioPerl question, but I could not find a satisfactory answer
> from other sources and would appreciate any help.
>
> I am trying to use bl2seq for comparing query "q" and another genome "g".
> Now, for "q" I already have blastall output from an nt database
> containing >2 million sequences. I understand that to get compatible
> e values, I need to set -d parameter for bl2seq to the theoretical
> data size of that nt database. Which number from the following 4
> (taken from blastall output) should be used for -d ?
>
> length of database: 12,254,801,043
> effective length of database: 12,167,805,299
> effective search space: 48671221196
> effective search space used: 48671221196
>
> Any pointers/website/docs will be appreciated.
>
> Thank you.
>
> Tembe
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l