[Bioperl-l] Pattern recognition
Djodja
j.s.soares at gmail.com
Sat Oct 20 07:11:17 UTC 2007
I will have to give you the credit here as well!
Thank you so much, Chris.
I have declared both $position and $count_of_pat inside the while loop.
I will rearrange the script and turn it into a subroutine, as I have
many other patterns to search for.
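
For reference, a minimal sketch of what such a subroutine might look like. It assumes the sequence is passed in as a plain string (for a Bio::Seq object that would be $seq_obj->seq) and that only non-overlapping matches are wanted. The name find_pattern_positions and the two sample sequences are made up for illustration. The sketch also uses while rather than foreach: as far as I understand, m//g in list context runs every match before the loop body starts, so $-[0] would only ever hold the offset of the last match.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): returns the 1-based start
# positions of every non-overlapping match of $pat in the string $seq.
sub find_pattern_positions {
    my ($seq, $pat) = @_;
    my @positions;    # declared per call, so nothing carries over between genes

    # m//g in scalar context advances one match per iteration,
    # so $-[0] is the 0-based start offset of the current match.
    while ($seq =~ /$pat/g) {
        push @positions, $-[0] + 1;
    }
    return @positions;
}

# Two made-up sequences standing in for the FASTA records.
my %genes = (
    gene1 => 'TTAACAAAGCC',
    gene2 => 'AACAAAGGGAACAAAG',
);

for my $id (sort keys %genes) {
    my @pos = find_pattern_positions($genes{$id}, 'AACAAAG');
    # The count is just the number of positions found for this gene.
    print join("\t", $id, scalar(@pos), "positions: @pos"), "\n";
}
```

Inside the existing while (my $seq_obj = $in->next_seq) loop, the call would presumably be find_pattern_positions($seq_obj->seq, $pat), with the count and positions printed once per gene.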
Thank you, thank you, thank you.
Also, I am in Exeter, England. If you want, I could look into ways to get you
here to give a talk and present your work.
Let me know.
Djodja
Chris Hemmerich wrote:
>
>
>
> Djodja,
>
> Perhaps this is a problem with scope. Since you declare $position outside of
>
> while (my $seq_obj = $in->next_seq) {
>
> it will not be reset for each sequence. The line
>
> $position = $position + length($seq_obj) - $patlen + 1;
>
> will continue to increase $position for each sequence processed. From a
> quick look, I don't think you need $position at all in this script and can
> just print $-[0]+1.
>
> Hope this helps,
>
> Chris
>
>
> On Fri, 19 Oct 2007, Djodja wrote:
>
>>
>> Hi all,
>>
>> This is my first post. I am very puzzled about a script that I wrote last
>> year. It was working fine and, all of a sudden, with no change to the code,
>> it just stopped doing what it was supposed to do. Basically, I'm searching
>> for patterns in a whole-genome database.
>>
>> The pattern that I'm searching for at the moment is AACAAAG.
>>
>> My script finds the pattern; there is no problem with the regexps. It also
>> searches the whole genome, which is divided by fasta headers into one
>> record per gene. The problem is that I need to know the position of the
>> pattern relative to the start of each gene. This is where the script goes
>> weird. It doesn't reset the counter at the start of each gene. Instead, it
>> continues to count as if the genes weren't separated at all. I also need
>> to know how many repetitions of the pattern there are in each gene. As
>> above, it does not reset the repetition count and instead gives me the
>> repetitions in the whole genome.
>>
>> Here is the code:
>>
>> #!/usr/bin/perl -w
>>
>> use Bio::Perl;
>> use Bio::Seq ();
>> use Bio::SeqIO ();
>> use Bio::Tools::SeqPattern ();
>> use strict;
>> use warnings;
>>
>>
>>
>>
>> print 'Where is the file that you want to analyse?';
>> chomp (my $Fasfile = <>) or die "No such file";
>> print "Where do you want to store the rough file?";
>> chomp (my $rough = <>);
>> print "Where do you want to store the final file?";
>> chomp (my $final = <>);
>>
>> my ($pat, $patlen, $count_of_pat, $position) = ("AACAAAG", 7, 0, 0);
>> my $pattern = new Bio::Tools::SeqPattern(-SEQ => $pat, -TYPE =>'Dna');
>>
>> if(my $in = Bio::SeqIO->new(-file => $Fasfile, -format => 'fasta' )) {
>> open (MOTIFCOUNT,">> $rough") or die "The gene list wasn't created";
>>
>>
>> while (my $seq_obj = $in->next_seq) {
>>
>> #my $id1 = $seq_obj->display_id();
>> my $seq_length = length($seq_obj);
>> #print "$seq_length\n";
>> foreach ($seq_obj->seq =~ m/$pat/g){
>> ++$count_of_pat;
>>
>> my $id1 = $seq_obj->display_id();
>> print $seq_obj->seq, "\n";
>> print MOTIFCOUNT "$id1\t$count_of_pat\tposition ",
>> $position + $-[0] + 1, "\n";
>> $position = $position + length($seq_obj) - $patlen + 1;
>> #print "$id1\n";
>> #print "$count_of_pat\n";
>> #print "$position\n";
>>
>> }
>>
>> #print "$seq_length\n";
>> print $seq_obj->seq, "\n";
>> }
>>
>>
>> close MOTIFCOUNT;
>>
>> open (SYM, "+< $rough") or die "The file doesn't exist";
>>
>> open (TAB, ">> $final") or die "The gene list wasn't transformed";
>>
>> print TAB "Motif: $pat\n";
>> while (my $line = <SYM>){
>>
>> foreach ($line){
>> $line =~ s/\|/\t/g;
>> print TAB "$line";
>> }
>> }
>> close (SYM);
>> close(TAB);
>>
>> print "The gene list was created and transformed successfully.\n\a";
>> }
>> exit;
>>
>>
>>
>> Can anyone help me out? This is kind of in the way of my PhD progression.
>>
>> All the best,
>>
>> Djodja
>> --
>> View this message in context:
>> http://www.nabble.com/Pattern-recognition-tf4652923.html#a13293737
>> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
--
View this message in context: http://www.nabble.com/Pattern-recognition-tf4652923.html#a13307407
Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.