[Bioperl-l] Script
Andrew Walsh
walsh at cenix-bioscience.com
Tue Sep 23 03:59:11 EDT 2003
This isn't a pure Bioperl implementation, but it should do the trick:
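# Assumes you have a FASTA file containing your sequences
use strict;
use warnings;
use Bio::SeqIO;

my $seqio = Bio::SeqIO->new(-file => 'my_file.fasta', -format => 'fasta');
my $count = 0;
while (my $seq = $seqio->next_seq) {
    # Copy the sequence into a variable first: matching /g directly against
    # the return value of $seq->seq restarts the search on every pass.
    my $residues = $seq->seq;
    while ($residues =~ /NCCC/g) {
        $count++;
    }
}
print "Found motif $count times\n";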
If you need to allow two or more amino acids at one position, use a
character class ([]) in your regex,
e.g. to match both NCCC and NDCC, use /N[CD]CC/g
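
To try the character-class idea without a FASTA file, here is a minimal sketch in core Perl only. The count_motif helper and the test sequences are made up for illustration and are not part of Bioperl. It counts non-overlapping matches, and it assumes the motif contains no capturing parentheses, which would change the count.

```perl
use strict;
use warnings;

# Hypothetical helper (not a Bioperl function): count non-overlapping
# matches of a regex motif in a plain sequence string.
sub count_motif {
    my ($sequence, $motif) = @_;
    # A /g match in list context returns every match; assigning that list
    # to () and then to a scalar yields the number of matches.
    my $count = () = $sequence =~ /$motif/g;
    return $count;
}

# Made-up protein fragments, for illustration only.
my @seqs = ('MKNCCCAANDCCG', 'NCCCNCCC', 'MKLV');

my $total = 0;
$total += count_motif($_, 'N[CD]CC') for @seqs;
print "Found motif $total times\n";
```

In the Bioperl loop, the same helper could be called as count_motif($seq->seq, 'N[CD]CC') for each sequence.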
Maybe someone else out there knows of a Bioperl module that would also
do this.
Cheers,
Andrew
Lobvi Matamoros wrote:
>
>
> Hi to everyone:
>
> I am trying to find out how many times a particular amino acid motif,
> for instance NCCC, occurs in a protein database; in other words, to count
> that particular motif. Does anyone have a script that performs that task,
> or something close that I can adapt?
>
> Thanks for your help in advance
>
> Lobvi Matamoros Fernández, Ph.D
> Post-doctoral fellow
>
> Centre de Recherche du CHUL
> 2705 Boul. Laurier, T3-80
> Sainte-Foy (Québec)
> G1V 4G2 CANADA
> Tel: 418-6542261
> FAX:418-654-2279
--
------------------------------------------------------------------
Andrew Walsh, M.Sc.
Bioinformatics Software Engineer
IT Unit
Cenix BioScience GmbH
Pfotenhauerstr. 108
01307 Dresden, Germany
Tel. +49(351)210-2699
Fax +49(351)210-1309
public key: http://www.cenix-bioscience.com/public_keys/walsh.gpg
------------------------------------------------------------------