[Bioperl-l] search for only C-terminal degenerate motifs
Aaron J. Mackey
amackey at pcbi.upenn.edu
Fri Oct 10 16:08:26 EDT 2003
Yeah, the problem is you'd really like to *not* search all of the
database (for both speed and statistical reasons), only the first n
C-terminal residues of each sequence in the database. Both BLAST and
FASTA have mechanisms to allow you to focus the search on a given part
of the query, but not on a given part of the library. Unfortunately, I
think the best/quickest answer is to make your own database of
C-terminal sequences [perhaps using Bio::SeqIO and $seq->trunc()].
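A minimal sketch of building such a database follows. It is written in plain Python rather than Bioperl so that it runs without extra modules; in Bioperl the same loop would presumably be built on Bio::SeqIO and $seq->trunc(). The function names, the cterm_start header tag, and the sample sequences are all invented for illustration.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def write_c_terminal_db(in_handle: TextIO, out_handle: TextIO, n: int) -> int:
    """Write the last n residues of every record; return total residues written.

    Sequences shorter than n are written whole. The 1-based position of the
    first kept residue is appended to the header (hypothetical cterm_start tag)
    so that hit coordinates can be mapped back to the full-length sequence.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of residues")
    total = 0
    for header, seq in read_fasta(in_handle):
        if not seq:
            continue  # skip empty records
        tail = seq[-n:]
        start = len(seq) - len(tail) + 1
        out_handle.write(f">{header} cterm_start={start}\n{tail}\n")
        total += len(tail)
    return total


# Made-up example records, just to show the shape of the output.
demo_in = io.StringIO(">a desc\nMKTAYIAKQR\nQISFVKSHFS\n>b\nMKC\n")
demo_out = io.StringIO()
demo_total = write_c_terminal_db(demo_in, demo_out, 5)
print(demo_out.getvalue(), end="")
print("residues written:", demo_total)
```

The returned residue count is the size of the reduced library, which is the number you would want on hand for any later statistical correction.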
But you said "motif" - are you trying to find:
a) exact matches to a given short sequence
b) exact matches to a consensus regular expression (e.g.: CX[ST]C)
c) near-exact, homologous matches using a query sequence and a
stringent, shallow scoring matrix
d) high scoring matches to a position-specific profile/weight matrix
Depending on the answer, you should be able to use current tools,
filter your output based on where hits occur, and then alter any
statistical results (P, E()) to accommodate your effectively reduced
search space.
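For case (b), the "filter by where hits occur" step and the statistical adjustment might look roughly like the sketch below. It is plain Python, not an existing tool. The function names and the example sequence are made up, the pattern C.[ST]C is just the regular-expression spelling of the consensus above, and the E-value correction assumes that E scales linearly with the number of library residues searched, which you should check against the program you actually use.

```python
import re
from typing import List, Tuple


def c_terminal_motif_hits(seq: str, pattern: str, window: int) -> List[Tuple[int, int]]:
    """Return 1-based (start, end) of matches lying entirely within the
    last `window` residues of seq. Overlapping matches are reported."""
    regex = re.compile(pattern)
    pos = max(0, len(seq) - window)  # first 0-based position allowed
    hits = []
    while pos <= len(seq):
        m = regex.search(seq, pos)
        if m is None:
            break
        hits.append((m.start() + 1, m.end()))
        pos = m.start() + 1  # step one residue on, so overlaps are found
    return hits


def rescale_evalue(evalue: float, full_residues: int, searched_residues: int) -> float:
    """Scale an E-value down to a reduced search space.

    Assumption (not from any program's documentation): E is proportional
    to the total number of library residues searched.
    """
    if full_residues <= 0 or not 0 <= searched_residues <= full_residues:
        raise ValueError("need 0 <= searched_residues <= full_residues, full_residues > 0")
    return evalue * searched_residues / full_residues


# Made-up sequence with one internal and one C-terminal occurrence.
example = "MAAACASCKKKCGTCL"
print(c_terminal_motif_hits(example, r"C.[ST]C", 6))
print(c_terminal_motif_hits(example, r"C.[ST]C", len(example)))
```

With a window of 6 only the occurrence near the C-terminus is kept; widening the window to the full length reports both.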
-Aaron
On Friday, October 10, 2003, at 03:31 PM, Brian Osborne wrote:
> Lucas,
>
> I don't think a script exactly like this exists in Bioperl, but the
> means to write one certainly do. A good start might be the SearchIO
> HOWTO (bioperl.org/HOWTOs). The combination of SearchIO and
> StandAloneBlast (section III.4.5 of the bptutorial) enables you to run
> BLAST and parse the results. The details of the C-terminal match
> should be straightforward, I think, but it depends on your familiarity
> with Perl.
>
> Brian O.
>
> -----Original Message-----
> From: bioperl-l-bounces at portal.open-bio.org
> [mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Lucas Carey
> Sent: Thursday, October 09, 2003 6:48 PM
> To: bioperl-l at bioperl.org
> Subject: [Bioperl-l] search for only C-terminal degenerate motifs
>
> Hi,
> I'm not sure if this is the correct list, but I figured someone here
> would be able to help me.
> I'd like to use blast to search for a protein motif within a certain
> number of residues of the C-terminus. What is the best way to go about
> doing this?
> It seems like someone would have already had to do this, and written
> software to do it. That's what made me think of bioperl.
> -Lucas
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l