[Bioperl-l] Make edits to a large sequence
Kevin Brown
Kevin.M.Brown at asu.edu
Tue Jun 28 15:23:29 UTC 2011
An array might work, or just hold the whole thing in a string and use
substr on that rather than the BioPerl objects.
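while (my $seq = $in->next_seq()) {
    my $sequence = $seq->seq;        # pull the sequence out as a plain string
    substr($sequence, 2, 1, 'c');    # 3rd base (offset 2) becomes 'c'
    substr($sequence, 8, 1, 't');    # 9th base (offset 8) becomes 't'
    $seq->seq($sequence);            # store the edited string back in the object
    ...
}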
Just remember that substr works on a 0-indexed string rather than a
1-indexed one, so the 3rd position is offset 2 rather than 3.
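
For thousands of edits, the same idea can be wrapped in a small helper. This
is only a rough sketch: apply_edits is a made-up name (not a BioPerl
function), it assumes the edits arrive as 1-based [position, base] pairs as
in the question, and it works on a plain Perl string passed by reference so
the helper itself does not copy the sequence.

```perl
use strict;
use warnings;

# Hypothetical helper: apply single-base edits to a sequence string.
# $seq_ref is a reference to the string; $edits is an array ref of
# [position, base] pairs, with positions counted from 1.
sub apply_edits {
    my ($seq_ref, $edits) = @_;
    for my $edit (@$edits) {
        my ($pos, $base) = @$edit;
        die "position $pos is out of range\n"
            if $pos < 1 || $pos > length $$seq_ref;
        # Convert the 1-based position to substr's 0-based offset.
        substr($$seq_ref, $pos - 1, 1, $base);
    }
    return;
}

my $sequence = 'aaaaaaaaaa';
apply_edits(\$sequence, [ [3, 'c'], [9, 't'] ]);
print "$sequence\n";    # prints aacaaaaata
```

Inside the next_seq loop this would sit between $seq->seq and
$seq->seq($sequence), with the edit list built from wherever the mutations
are stored.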
> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of wannymahoots
> Sent: Tuesday, June 28, 2011 5:46 AM
> To: bioperl-l at bioperl.org
> Subject: [Bioperl-l] Make edits to a large sequence
>
> Hi,
>
> I'm looking for the quickest / most efficient way to make many edits
> (mutations) to a long fasta sequence using bioperl. The sequences are
> of the order of 200Mb long, and I would like to make 1,000s of changes
> to single bases (e.g. A->T at position 1,000, G->C at position 1,201
> etc.). The only way I've come across to do this is reading in the
> sequence and then making edits using SeqUtils, so something like:
>
> my $in = Bio::SeqIO->new('-file' => "file.fa", '-format' => "fasta");
>
> while(my $seq = $in->next_seq()) {
> my $mut = Bio::LiveSeq::Mutation->new(-seq => 'c',-pos => 3);
> Bio::SeqUtils->mutate($seq,$mut);
> }
>
> However, I'm concerned that this might be making multiple copies of
> the large sequence, and that using substr (which is how mutate works),
> is perhaps not the most efficient. Would it be better to save the
> fasta sequence as an array and change individual array positions
> directly?
>
> Many thanks for any advice.