[Bioperl-l] question of parsing

Ewan Birney birney@ebi.ac.uk
Mon, 6 Aug 2001 22:58:24 +0100 (BST)


On Mon, 6 Aug 2001, Cheng-Yuan Kao wrote:

> Hi, there,
>  
> I have a question about parsing the results of a BLAST search.
> Say I use some ESTs (related to a specific gene family)
> to blastx against the nr database and get thousands of hits for
> these queries. How can I evaluate the hits? I would like
> to discard non-significant hits and ESTs representing known
> genes of this gene family.
>  

I'm afraid you will have to script this yourself, but with
Bio::Tools::BPlite this should be pretty easy to write - the hard part is
figuring out what to write.

Something like:


use strict;
use Bio::Tools::BPlite;

# read the BLAST report from standard input
my $report = Bio::Tools::BPlite->new(-fh => \*STDIN);

 SUBJ:
 while (my $sbjct = $report->nextSbjct) {
    # nextHSP returns one HSP per call, so loop until none are left
    while (my $hsp = $sbjct->nextHSP) {
        if ($hsp->length > 100 && $hsp->percent > 95) {
             next SUBJ; # too close, probably a known gene of the family
        }
    }
    # otherwise: is the hit significant, or not significant enough?
    # You decide!
 }

>  
> Much appreciated.
>  
> Richie
>  
> GGG,UC Davis

-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney@ebi.ac.uk>. 
-----------------------------------------------------------------