[Bioperl-l] BLAST parameters

Jason Stajich jason@cgt.mc.duke.edu
Fri, 9 Aug 2002 15:35:59 -0400 (EDT)


Yes, I wrote RemoteBlast with a hard-coded global hash.
It was more of an exercise than a truly flexible remote BLAST module,
because most systems are different.  We should really set up RemoteBlast as
a front-end factory and subclass it for the different types of remote BLAST
engines (NCBI, EBI, your-favorite-system, local-blast stuff).  The hard
coding should be removed: the module should pick up defaults but allow
everything to be set dynamically in the object.
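
As a rough illustration of the difference (this is not the actual
RemoteBlast code), here is a minimal sketch of the two styles.  The package
names My::GlobalBlast and My::ObjectBlast, the params() method and the
default values are all made up for the example.  The first package keeps its
settings in a package-global hash, which is why an assignment through the
fully qualified name changes them for every object in the program; the
second copies defaults into each object so they can be overridden per
instance.

```perl
use strict;
use warnings;

# Style 1: settings live in a package-global hash (hypothetical package,
# standing in for the hard-coded approach).
package My::GlobalBlast;
our %HEADER = (MATRIX_NAME => 'BLOSUM62', EXPECT => '10');

sub new    { my ($class) = @_; return bless {}, $class; }
sub params { return +{ %HEADER }; }    # every object reads the same global hash

# Style 2: defaults are copied into each object and can be overridden there
# (again a hypothetical package, not part of BioPerl).
package My::ObjectBlast;
my %DEFAULTS = (MATRIX_NAME => 'BLOSUM62', EXPECT => '10');

sub new {
    my ($class, %args) = @_;
    # Start from the defaults; caller-supplied values win.
    return bless { params => { %DEFAULTS, %args } }, $class;
}
sub params { my ($self) = @_; return +{ %{ $self->{params} } }; }

package main;

# Changing the global affects objects created before and after the assignment.
my $g1 = My::GlobalBlast->new;
$My::GlobalBlast::HEADER{'MATRIX_NAME'} = 'BLOSUM25';
my $g2 = My::GlobalBlast->new;

# Per-object settings stay independent of each other.
my $o1 = My::ObjectBlast->new(MATRIX_NAME => 'BLOSUM25');
my $o2 = My::ObjectBlast->new;

printf "%s: %s\n", $_->[0], $_->[1]->params->{MATRIX_NAME}
    for ([g1 => $g1], [g2 => $g2], [o1 => $o1], [o2 => $o2]);
```

In this sketch g1 and g2 both report BLOSUM25 because they share one hash,
while o2 keeps the BLOSUM62 default even though o1 was created with
BLOSUM25.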

Mat Wiepert @mayo is looking at this.  If someone else wants to help, this
should be reasonably easy to do in a more general way, if you are happy
to debug CGI parameters.
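
For the CGI parameter debugging, one possible approach is to print the exact
query string a given set of parameters would produce and compare it with
what the web form sends.  The sketch below assumes nothing about RemoteBlast
internals: url_escape() and build_query() are hypothetical helpers, and the
parameter names (CMD, PROGRAM, DATABASE, ENTREZ_QUERY) and their values are
assumptions based on the NCBI URL API document, so check them against that
page before relying on them.

```perl
use strict;
use warnings;

# Percent-encode a value for use in a URL query string.  Hand-rolled to
# avoid non-core modules; intended for plain ASCII/byte strings only.
sub url_escape {
    my ($s) = @_;
    $s =~ s/([^A-Za-z0-9\-_.~])/sprintf('%%%02X', ord($1))/ge;
    return $s;
}

# Hypothetical helper: merge defaults with overrides and return a query
# string with keys in sorted order, so two runs are easy to diff.
sub build_query {
    my ($defaults, $overrides) = @_;
    my %p = (%$defaults, %$overrides);    # overrides win on duplicate keys
    return join '&',
        map { url_escape($_) . '=' . url_escape($p{$_}) } sort keys %p;
}

# Parameter names and values here are assumptions; see the text above.
my %defaults  = (CMD => 'Put', PROGRAM => 'blastn', DATABASE => 'nr');
my %overrides = (ENTREZ_QUERY => 'Rattus norvegicus[ORGN]');

print build_query(\%defaults, \%overrides), "\n";
```

Sorting the keys is only a convenience for comparing output between runs;
the server should not care about parameter order.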

-jason

On Fri, 9 Aug 2002, P B wrote:

> Hi Brian,
>
> Thanks!  It will be great to know how to change parameters like this.  My
> one question is: what is that line of code $Bio::...HEADER{'MATRIX_NAME'} =
> 'BLOSUM25' actually doing?  Is HEADER a hash in the RemoteBlast name-space?
> It's not a crucial point, but I like knowing what's actually going on as
> much as possible.
>
> Thanks again,
> Tats
>
> >From: "Brian Osborne" <brian_osborne@cognia.com>
> >To: "P B" <itatsumaki@hotmail.com>, <bioperl-l@bioperl.org>
> >Subject: RE: [Bioperl-l] BLAST parameters
> >Date: Fri, 9 Aug 2002 13:35:18 -0400
> >
> >Tats,
> >
> >I just added this to bptutorial.pl, you might find it useful:
> >
> >You may want to change some parameter of the remote job and this example
> >shows how to change the matrix:
> >
> >$Bio::Tools::Run::RemoteBlast::HEADER{'MATRIX_NAME'} = 'BLOSUM25';
> >
> >For a description of the many CGI parameters see:
> >
> >http://www.ncbi.nlm.nih.gov/BLAST/Doc/urlapi.html
> >
> >
> >Brian O.
> >
> >
> >-----Original Message-----
> >From: bioperl-l-admin@bioperl.org [mailto:bioperl-l-admin@bioperl.org]On
> >Behalf Of P B
> >Sent: Friday, August 09, 2002 1:14 PM
> >To: bioperl-l@bioperl.org
> >Subject: [Bioperl-l] BLAST parameters
> >
> >Hi all, a newbie question I think.
> >
> >I haven't used bioperl before, so some of these questions might be a little
> >dumb, so flame away where needed.  Let me first give the goal, in case I'm
> >missing something conceptual here:
> >
> >Goal:
> >I have a long list of sequences (15,000) that I would like to identify.  In
> >particular, I want to find out what (rat) cluster they most likely
> >represent.
> >
> >Approach:
> >- submit genes one by one to remote BLAST (it's a lot of BLASTing, so I'm
> >waiting 60 seconds between submissions; I do realize this will take 10 days,
> >btw, and I don't have access to a local BLAST)
> >- retrieve the BLAST results and parse out the top ten hits by e-value or
> >bit-score (undecided if there is a reason to prefer expectation values to
> >the normalized bit-scores?)
> >- for each of the top 10 hits, parse out the GenBank accession
> >- use this accession to determine the corresponding cluster (I expect I will
> >have to download the UniGene .dat file to do this)
> >- if I can assign a conclusive identity to the sequence, great; if not, store
> >the results for future analysis
> >
> >I hope to be able to automatically identify 70-80% of the sequences using
> >selection criteria like:
> >2 top hits for same cluster
> >3 of the top 5 hits for same cluster
> >6 of the top 10 hits for same cluster
> >or something similar.  The assignations don't have to be perfect, just
> >reasonably close.
> >
> >Now, my (first) two problems involve submitting the BLAST to NCBI.  I'm
> >doing a test case with a 3-sequence FASTA file, btw.  What I would like is
> >to restrict my BLAST searches to "Rattus norvegicus" as you can on the NCBI
> >web-site under advanced options.
> >
> >In addition, I would like to be able to submit customized nucleotide
> >substitution matrices to use with the BLAST.
> >
> >That latter point isn't as critical, but I really would like to avoid having
> >to get back a pile of BLAST hits and have to filter through non-rat hits if
> >possible.
> >
> >The RemoteBlast module accepts an @params array to its ->new() method,
> >but I don't know what to call these parameters that I would like to use.
> >
> >Any comments, suggestions, ideas are very much welcome.
> >Thanks in advance!
> >Tats
>
>

-- 
Jason Stajich
Duke University
jason at cgt.mc.duke.edu