[Bioperl-l] extract nonoverlapping subsequences from a whole genome
gopu_36
gopu_36 at yahoo.com
Tue Apr 10 07:42:26 UTC 2007
Hi,
I am a newbie trying out BioPerl for my research. I have the whole genome
sequence of a pathogen, and I am trying to break it into non-overlapping
1000 bp subsequences. For example, if the whole genome sequence is 400000 bp
long, then I should break it into 400 subsequences of 1000 bp each; they
should be non-overlapping but contiguous. To be precise, my first substring
would run from 1 to 1000 bp, the second from 1001 to 2000, and so on. Could
anyone help me?
I tried the following code, but it gives me only the first substring and
none of the rest. I would very much appreciate it if someone could help!
.........
.
.
my $start  = 1;
my $finish = 100;
my $inseq  = Bio::SeqIO->new(-file => "$in_file");
while( my $seq = $inseq->next_seq ) {
    my $cleseq = $seq->seq();
    $seqlength = $seq->length();
    if ($finish<$seqlength){
        print "The length of the sequence is $seqlength\n";
        my $ordseq = $cleseq->subseq($start,$finish);
        push(@seq_array,$ordseq);
        $start=+100;
        $finish=+100;
        $counter++;
        next;
    }
}
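
For reference, the windowing described above can be sketched in plain Perl.
In the posted loop, `$start=+100` assigns the value +100 rather than adding
100 (the increment operator is `+=`), and the `if` test runs only once per
sequence before `next` moves on, so at most one window per sequence can be
collected. The sketch below is only an illustration under stated
assumptions: `chunk_sequence` is a hypothetical helper, not a BioPerl
function, and it works on a plain string so that it runs without BioPerl
installed. A final window shorter than the requested size is kept rather
than dropped, which is a design choice and not something the post specifies.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): split a sequence string into
# consecutive, non-overlapping windows of $size bases. Returns a list of
# [start, end, subsequence] triples with 1-based, inclusive coordinates.
sub chunk_sequence {
    my ($sequence, $size) = @_;
    die "window size must be a positive integer\n" unless $size && $size > 0;

    my $length = length $sequence;
    my @chunks;
    for (my $start = 1; $start <= $length; $start += $size) {
        # Clamp the final window so a short tail is kept rather than dropped.
        my $end = $start + $size - 1;
        $end = $length if $end > $length;
        # substr is 0-based, so shift the 1-based start by one.
        push @chunks,
            [ $start, $end, substr($sequence, $start - 1, $end - $start + 1) ];
    }
    return @chunks;
}

# Toy example: a 25-base string cut into windows of 10.
my @chunks = chunk_sequence('ACGTACGTACGTACGTACGTACGTA', 10);
printf "%d-%d\t%s\n", @$_ for @chunks;
```

With BioPerl, the same loop could instead call `$seq->subseq($start, $end)`
on each sequence object returned by `next_seq`; `subseq` takes 1-based,
inclusive coordinates and is a method of the sequence object, not of the
string returned by `$seq->seq()`.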