[BioPython] Rethinking Seq objects
Michiel Jan Laurens de Hoon
mdehoon at ims.u-tokyo.ac.jp
Sat May 7 01:36:01 EDT 2005
Gavin Crooks wrote:
> On May 5, 2005, at 00:30, Michiel Jan Laurens de Hoon wrote:
>>> If, alternatively, Seq was a simple immutable object then it
>>> could be implemented as a lightweight subclass of str, with an
>>> alphabet attribute that is also a subclass of str. You'd edit it like
>>> you would edit any string in Python: split it into a list, do
>>> whatever manipulations are necessary, and then join the list back
>>> together into a new Seq.
>>
>> There may be performance issues with this approach, if a Seq object is
>> mutated often. So let's wait and see if any of our users actually want
>> to mutate a sequence object, and if so, if the performance is critical.
>
> Performance would be no worse than for string manipulation in standard
> Python. The Way of The Python is not to use MutableStrings (which are
> in the standard library, but not really canonical) but to split the string
> into a list or array, do whatever manipulations are necessary, and then
> join it back together into a string. Is there any reason why Seqs can't be
> mutated analogously?
>
Well, I was going to say that Seq objects can be very large, certainly much
larger than strings in typical Python usage, and that this would be a
performance issue. But when I tried modifying a long string by splitting and
rejoining it, the performance didn't seem bad at all. So maybe this is the way
to go.
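
For illustration, here is a minimal sketch of the split/edit/join idea in
present-day Python. The Seq class, its default alphabet, and the point_mutate
helper below are hypothetical names made up for this example, not the actual
Biopython API, and the alphabet is a plain str here rather than a str subclass.
The timing will vary by machine, so no figure is claimed.

```python
import time


class Seq(str):
    """Hypothetical immutable sequence: a str subclass carrying an alphabet."""

    def __new__(cls, data, alphabet="ACGT"):
        obj = super().__new__(cls, data)
        obj.alphabet = alphabet  # assumption: alphabet stored as a plain str
        return obj


def point_mutate(seq, index, residue):
    """Return a new Seq with one residue replaced, using the list/join idiom."""
    residues = list(seq)          # split into a mutable list of characters
    residues[index] = residue     # edit the list in place
    return Seq("".join(residues), seq.alphabet)  # join back into a new Seq


# Time one edit on a sequence of one million residues.
seq = Seq("ACGT" * 250_000)
start = time.perf_counter()
mutated = point_mutate(seq, 2, "T")
elapsed = time.perf_counter() - start
print(f"{len(seq)} residues edited in {elapsed:.3f} s")
```

The original seq is left untouched; each edit produces a new Seq, so batching
many changes on the list before a single join is cheaper than one join per
change.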
--Michiel.
--
Michiel de Hoon, Assistant Professor
University of Tokyo, Institute of Medical Science
Human Genome Center
4-6-1 Shirokane-dai, Minato-ku
Tokyo 108-8639
Japan
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon