[Bioperl-l] 2 psi-blast questions

Hadas Leonov hleonov at pob.huji.ac.il
Thu Jan 29 09:06:17 EST 2004


I wasn't clear enough :-)
I don't need to change the identity/length parameters; I check them as
I go over the report and decide which hit (HSP, actually) is relevant
for me.
I know the e-value doesn't go below 0. The problem is that the
psi-blast parser does not consider a hit as "new" if its e-value is
precisely 0.0 (or e-180, for that matter).
I worked around this problem in a different manner.
I didn't quite get what the -h parameter does. What inclusion threshold
is defined by it?
-e is for changing the upper limit of the e-value, right?
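
For anyone hitting the same e-value issue, here is a minimal sketch in
plain Perl of a threshold check that still accepts e-values of 0.0 and
"e-180". It does not use the psi-blast parser and is not necessarily
the workaround mentioned above. The helper names normalize_evalue and
passes_threshold are hypothetical, and the 0.01 cutoff is only the
value guessed at in the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a BLAST e-value string into a number.
# Reports can print very small values as "e-180" (no mantissa),
# which Perl would not read as a number on its own.
sub normalize_evalue {
    my ($raw) = @_;
    (my $e = $raw) =~ s/^\s+|\s+$//g;   # trim surrounding whitespace
    $e =~ s/^e/1e/i;                    # "e-180" -> "1e-180"
    return $e + 0;
}

# Hypothetical helper: a hit passes if its e-value is at or below the
# threshold. The lower bound is inclusive, so 0.0 is accepted.
sub passes_threshold {
    my ($raw, $threshold) = @_;
    my $e = normalize_evalue($raw);
    return ($e >= 0 && $e <= $threshold) ? 1 : 0;
}

# 0.01 is an assumed cutoff, not a documented parser default.
for my $raw ('0.0', 'e-180', '3e-05', '0.5') {
    printf "%-6s %s\n", $raw,
        passes_threshold($raw, 0.01) ? 'included' : 'excluded';
}
```

The point of the sketch is the inclusive lower bound: a strict
"0.0 < e-value" comparison would drop exactly the strongest hits.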

Thanks again,
Hadas.




On Jan 29, 2004, at 3:42 PM, Donald G. Jackson wrote:

> Hadas,
>
> the answer is 'yes and no'.
>
> You can set the -h value ($blaster->h()) to adjust the inclusion
> threshold for refining the scoring matrix.  However, it only takes
> expect values - not identity/length - and the expect value can't go
> below zero.  Lower e values are more significant; e=0 roughly means
> there isn't a snowball's chance in the netherworld of getting that hit
> by chance in a dataset the size of the one you searched.
> You can use the Bio::SearchIO::Blast parser's hit_filter and
> hsp_filter methods to pass in a coderef that rejects hits above/below
> a certain identity and length.
> If you do so, make sure you call identity on the HSP result objects,
> NOT the Hit objects as the identity numbers from Hit are inaccurate (a
> BLAST issue, not a SearchIO issue).
> I find that running PsiBlast takes some tweaking - adjusting the
> values for -e, -h, and -j (# rounds) so I build a good matrix and the
> search doesn't converge.  Also look at -b and -v (number of
> alignments/descriptions returned) to make sure you go deep enough into
> the results.
>
> Hope this helps,
>
> Don Jackson
>
> Hadas Leonov wrote:
>
>> Hi,
>> I noticed that a psi-blast report only considers new hits if 0.0
>> < E-value < 0.01 (?).
>> Is there any possibility to change these parameters? Especially the
>> 0.0 limit, because large query sequences with 60% identity might still
>> give an e-value of 0.0, thereby causing me to miss them completely by
>> the standard (and comfortable) method of going through a psi-blast
>> report:
>>
>> ...
>> $result = $psi_report->round($iter);
>> $newHits_ref = $result->newhits;
>> HIT: while ($hit = $result->nextSbjct) {
>>     my $hitName = $hit->name;
>>     # skip hits that are not listed as new in this round
>>     $is_new = grep /\Q$hitName\E/, @{$newHits_ref};
>>     next HIT unless $is_new;
>>     # do something with the new hit
>> }
>> ...
>>
>> Furthermore, the line: $hit = $result->nextSbjct
>> sometimes causes the following warning to be displayed:
>> -------------------- WARNING ---------------------
>> MSG: Possible error (2) while parsing BLAST report!
>> ---------------------------------------------------
>> but I can't find anything wrong with the actual parsing
>> result.
>>
>> Any ideas?
>>
>> Thanks,
>> Hadas.
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at portal.open-bio.org
>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>
>


