[Bioperl-l] LargeSeq performance
Stefan Kirov
skirov at utk.edu
Wed Oct 29 11:54:55 EST 2003
I have a problem with the performance of LargeSeq. I am working with
whole chromosomes (mouse, human) and next_seq takes forever.
I do not know if it is worth the effort, since any portion can be read with random
access, but I am still curious to know if people think it might be a
good idea to create an object that handles extremely large sequences
(whole chromosomes, for example) without an impact on performance.
If you think it is worth it, I can try to do it. What I have in mind is to use
grep to map the record separators ">" (in case you are mad enough to put
more than one chromosome in a single file). Thus next_seq will know
where to look for the next sequence, and can parse the id line and calculate the
length. I doubt anyone will use this under Windows (and anyway, the OS can
be checked to avoid problems). Also, the object will use random
access instead of Bio::Root::IO to get sequence data.
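
To make the idea concrete, here is a rough sketch of the offset map plus
random access. It is written in Python only to keep it short and is not
BioPerl code; build_index and fetch are hypothetical names. It scans the
file in one pass instead of calling grep, and it assumes every sequence line
in a record has the same width except the last one.

```python
def build_index(path):
    """Scan the file once and record where each '>' record starts.

    Hypothetical helper: returns one dict per record holding the id,
    the byte offset of the first base, the sequence length, and the
    line layout needed to turn a base position into a byte offset.
    """
    index = []
    with open(path, "rb") as fh:  # binary mode: offsets are real byte positions
        offset = 0
        entry = None
        for line in fh:
            if line.startswith(b">"):
                header = line[1:].split()
                entry = {
                    "id": header[0].decode() if header else "",
                    "seq_start": offset + len(line),
                    "length": 0,
                    "line_bases": 0,  # bases per full line
                    "line_bytes": 0,  # bytes per full line, newline included
                }
                index.append(entry)
            elif entry is not None:
                bases = len(line.rstrip(b"\r\n"))
                if entry["line_bases"] == 0:
                    entry["line_bases"] = bases
                    entry["line_bytes"] = len(line)
                entry["length"] += bases
            offset += len(line)
    return index


def fetch(path, entry, start, end):
    """Return bases start..end (1-based, inclusive) with seek, not a full read.

    Assumes uniform line width within the record (see lead-in).
    """
    if not 1 <= start <= end <= entry["length"]:
        raise ValueError("range outside sequence")
    bases, width = entry["line_bases"], entry["line_bytes"]

    def to_byte(pos0):
        # full lines before the position, plus the column within its line
        return entry["seq_start"] + (pos0 // bases) * width + pos0 % bases

    first, last = to_byte(start - 1), to_byte(end - 1)
    with open(path, "rb") as fh:
        fh.seek(first)
        chunk = fh.read(last - first + 1)
    # drop the line breaks that fall inside the requested range
    return chunk.replace(b"\n", b"").replace(b"\r", b"").decode()
```

With such an index the id and length of every record are known after a
single pass, and a call like fetch(path, index[0], 5, 8) reads only the
bytes that cover the requested range. Reading in binary mode keeps the byte
offsets consistent whatever the line-ending convention of the file is.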
Let me know what you think...
Stefan Kirov