[Bioperl-l] search for only C-terminal degenerate motifs

Aaron J. Mackey amackey at pcbi.upenn.edu
Fri Oct 10 16:08:26 EDT 2003


Yeah, the problem is that you'd really like to *not* search the whole
database (for both speed and statistical reasons), only the last n
residues (the C-terminus) of each sequence in it.  Both BLAST and
FASTA have mechanisms to allow you to focus the search on a given part
of the query, but not on a given part of the library.  Unfortunately, I
think the best/quickest answer is to make your own database of
C-terminal sequences [perhaps using Bio::SeqIO and $seq->trunc()].
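
To make the "own database" idea concrete, here is a minimal sketch of
the truncation step. It is written in plain Python with a hand-rolled
FASTA reader rather than Bio::SeqIO and $seq->trunc(), so every
function name below is hypothetical and only illustrates the logic:
keep the last n residues of each record and write them back out as
FASTA, ready to be formatted as a search library.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def c_terminal_records(lines, n):
    """Return (header, last n residues) per record.

    Sequences shorter than n are kept whole.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of residues")
    return [(header, seq[-n:]) for header, seq in read_fasta(lines)]


def write_fasta(records, width=60):
    """Format (header, sequence) pairs as FASTA text, wrapped at `width`."""
    out = []
    for header, seq in records:
        out.append(">" + header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"
```

In practice you would pass an open file handle as `lines` and write the
result to a new file; it is also worth recording the original sequence
length in the header if you need to map hit coordinates back later.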

But you said "motif" - are you trying to find:

   a) exact matches to a given short sequence
   b) exact matches to a consensus regular expression (e.g.: CX[ST]C)
   c) near-exact, homologous matches using a query sequence and a
      stringent, shallow scoring matrix
   d) high scoring matches to a position-specific profile/weight matrix
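
For case (b), no alignment tool is needed at all; a regular expression
applied to the C-terminal window is enough. The sketch below is a
hypothetical helper in plain Python, not a Bioperl API. It assumes a
simple consensus in which X means "any residue" and bracketed classes
are already in regex form, and it reports 1-based, inclusive
coordinates in the full-length sequence.

```python
import re


def consensus_to_regex(consensus):
    """Translate a simple consensus such as 'CX[ST]C' into a regex.

    Assumption: X is only used outside brackets, as a wildcard.
    """
    return consensus.upper().replace("X", ".")


def c_terminal_matches(seq, consensus, n):
    """Return (start, end, matched) for hits wholly inside the last n residues."""
    if n <= 0:
        raise ValueError("n must be a positive number of residues")
    offset = max(len(seq) - n, 0)          # 0-based start of the window
    window = seq[offset:].upper()
    # A lookahead lets overlapping occurrences of the motif be reported too.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    hits = []
    for m in pattern.finditer(window):
        matched = m.group(1)
        start = offset + m.start() + 1     # convert to 1-based coordinates
        hits.append((start, start + len(matched) - 1, matched))
    return hits
```

Because only the window is scanned, a motif that starts upstream of the
window and merely ends inside it is not reported.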

Depending on the answer, you should be able to use current tools, 
filter your output based on where hits occur, and then alter any 
statistical results (P, E()) to accommodate your effectively reduced 
search space.
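
As a rough illustration of that filter-then-adjust step, the sketch
below keeps only hits that fall entirely inside the last n residues of
the subject and rescales their E-values. It assumes E scales linearly
with the number of library residues searched, which is only a
first-order approximation, and it assumes the hits have already been
parsed into records; the Hit fields and function name are hypothetical
and do not correspond to any parser's output format.

```python
from collections import namedtuple

# Hypothetical, already-parsed hit record (coordinates are 1-based, inclusive).
Hit = namedtuple(
    "Hit", "subject_id subject_start subject_end subject_length evalue"
)


def filter_and_rescale(hits, n, full_db_residues, reduced_db_residues):
    """Keep hits inside the last n subject residues and rescale E-values.

    Assumption: E is proportional to the library size in residues, so the
    adjusted value is E * (reduced_db_residues / full_db_residues).
    """
    if n <= 0 or full_db_residues <= 0:
        raise ValueError("n and full_db_residues must be positive")
    scale = reduced_db_residues / full_db_residues
    kept = []
    for h in hits:
        # First residue (1-based) that counts as C-terminal for this subject.
        window_start = max(h.subject_length - n, 0) + 1
        if min(h.subject_start, h.subject_end) >= window_start:
            kept.append(h._replace(evalue=h.evalue * scale))
    return kept
```

Here `reduced_db_residues` would be the total length of all C-terminal
windows (at most n per sequence) and `full_db_residues` the total
length of the library that was actually searched.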

-Aaron

On Friday, October 10, 2003, at 03:31 PM, Brian Osborne wrote:

> Lucas,
>
> I don't think a script exactly like this exists in Bioperl, but the
> means to write it certainly do. A good start might be the SearchIO
> HOWTO (bioperl.org/HOWTOs). The combination of SearchIO and
> StandAloneBlast (section III.4.5 of the bptutorial) enables you to run
> BLAST and parse the results. The details of the C-terminal match should
> be straightforward, I think, but it depends on your familiarity with
> Perl.
>
> Brian O.
>
> -----Original Message-----
> From: bioperl-l-bounces at portal.open-bio.org
> [mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Lucas Carey
> Sent: Thursday, October 09, 2003 6:48 PM
> To: bioperl-l at bioperl.org
> Subject: [Bioperl-l] search for only C-terminal degenerate motifs
>
> Hi,
> I'm not sure if this is the correct list, but I figured someone here
> would be able to help me.
> I'd like to use blast to search for a protein motif within a certain
> number of residues of the C-terminus. What is the best way to go about
> doing this? It seems like someone would have already had to do this,
> and written software to do it. That's what made me think of bioperl.
> -Lucas
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l


