[Bioperl-l] Speed issues with making IUPAC consensus from alignment
Fields, Christopher J
cjfields at illinois.edu
Thu May 23 13:56:32 UTC 2013
(keep the list cc'd)
On May 22, 2013, at 6:31 PM, Senanu <senanu.junk at gmail.com> wrote:
> On May 22, 2013, at 4:17 PM, Fields, Christopher J wrote:
>
>>> Hi all,
>>>
>>> I am wondering if the consensus_iupac method of Bio::Align is known to be extremely slow, or if I'm doing something wrong.
>>
>> Probably the former, but...
>>
>>> I have bacterial whole-genome alignments (~7 Mbases) that I made in progressiveMauve and wish to get an IUPAC consensus. (I know that progressiveMauve uses a non-standard XMFA format, but Bio::AlignIO seems to read them just fine.) The code below takes more than all night to make a consensus. It works fine on tiny test alignments.
>>
>> It shouldn't take that long, 7 Mb isn't that large. Or is that 7 Mb for one genome?
>
> It is 7Mb per genome, but there are only 2 genomes in the alignment, and the sequences are very similar to one another.
>
>>
>>> Is this a known problem? Is there another way to generate such a consensus?
>>
>> The code isn't really optimized for this, but again this isn't terribly large. Is the bottleneck reading the alignment in, or is it the consensus_iupac() step? Hard to say w/o seeing the alignment data itself.
>
> The bottleneck is definitely with the consensus_iupac step. Reading the alignment in takes a few seconds.

That's interesting, but again not surprising. One would have to look at the code, but I wouldn't be surprised if the method is terribly inefficient.
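
For reference, a column-by-column IUPAC consensus only needs one linear pass over the alignment, so it should not take hours for two 7 Mb sequences. Below is a minimal sketch in plain Python. It is not BioPerl's consensus_iupac() implementation and uses no BioPerl or Biopython calls. The function name iupac_consensus and its gap handling are assumptions made for illustration: gap characters are ignored within a column, an all-gap column yields "-", and any column containing a symbol other than A/C/G/T falls back to "N".

```python
# Hypothetical helper, not part of BioPerl: IUPAC consensus from aligned rows.

# Map each set of observed bases to its IUPAC nucleotide ambiguity code.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def iupac_consensus(seqs, gap_chars="-."):
    """Return the IUPAC consensus of equal-length aligned sequences.

    Assumptions (illustrative choices, not BioPerl behaviour):
    gaps are ignored within a column, an all-gap column yields '-',
    and any unrecognised symbol set yields 'N'.
    """
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("aligned sequences must all have the same length")

    consensus = []
    # zip(*seqs) walks the alignment one column at a time.
    for column in zip(*seqs):
        bases = frozenset(b.upper() for b in column if b not in gap_chars)
        if not bases:
            consensus.append("-")
        else:
            consensus.append(IUPAC.get(bases, "N"))
    return "".join(consensus)
```

Called as iupac_consensus([seq1, seq2]) on the aligned strings of one alignment block, this does a single pass over the columns, which makes it a useful baseline when timing the consensus step.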
chris