[Bioperl-l] extract nonoverlapping subsequences from a whole genome
Chris Fields
cjfields at uiuc.edu
Tue Apr 10 20:22:15 UTC 2007
There is a script in the BioPerl scripts directory which does this,
with optional overlaps (split_seq.PLS). It's in /scripts/seq.
chris
On Apr 10, 2007, at 2:42 AM, gopu_36 wrote:
>
> Hi,
> I am one of the newbee venturingout bioperl for my research
> purposes. I have
> a whole genome sequence of a pathogen. I am trying to break them into
> non-overlapping 1000bps subsequences. For example if my whole genome
> sequence is 400000 bps length, then I should be beak them into 4000
> subsequences of each 1000 bps and they should be non-overlapping
> but at the
> same time continous. To be precise, my first substring would be
> from 1 to
> 1000 bps, second substing would be from 1001 to 2000 etcc.. Could
> anyone
> help me.
> I tried with the following code but it gives me only the first
> substring and
> rest are not! I would appreciate very much if someone could help me!
> .........
> .
> .
> my $start =1;
> my $finish =100;
> my $inseq = Bio::SeqIO->new(-file => "$in_file");
> while( my $seq = $inseq->next_seq ) {
>
> my $cleseq = $seq->seq();
>
> $seqlength = $seq->length();
> if ($finish<$seqlength){
> print "The length of the sequence is $seqlength\n";
> my $ordseq = $cleseq->subseq($start,$finish);
> push(@seq_array,$ordseq);
> $start=+100;
> $finish=+100;
> $counter++;
> next;
> }
> }
> --
> View this message in context: http://www.nabble.com/extract-
> nonoverlapping-subsequences-from-a-whole-genome-
> tf3551560.html#a9915265
> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
Christopher Fields
Postdoctoral Researcher
Lab of Dr. Robert Switzer
Dept of Biochemistry
University of Illinois Urbana-Champaign
More information about the Bioperl-l
mailing list