[Biopython-dev] Bio.AlignIO, Bio.Nexus, MrBayes, polymorphic sites, maximum line length

Peter biopython at maubp.freeserve.co.uk
Thu Dec 2 15:55:20 UTC 2010


On Thu, Dec 2, 2010 at 3:25 PM, Nick Loman <n.j.loman at bham.ac.uk> wrote:
> Peter wrote:
>>>
>>> Is this the best way of doing it? Would a method call in AlignIO to
>>> do the same thing be useful to others?
>>>
>>
>> I've got some code somewhere for iterating over the columns of
>> the alignment, and think I filed an enhancement bug for this.
>> Would that do what you want?
>>
>
> Hi Peter,
>
> Yes, that would make the code more readable, definitely. Not sure whether
> you think a function to return an alignment containing just the polymorphic
> sites would also be helpful to others.
>

I suspect a function returning just the polymorphic sites wouldn't be
of general interest.
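
For reference, the column-iteration idea can be sketched in plain Python
without any Biopython calls. This is only an illustration: it assumes the
alignment rows are available as equal-length strings, the function names
polymorphic_columns and keep_columns are made up here and are not part of
Biopython, and gap or ambiguity characters are treated like any other
character.

```python
def polymorphic_columns(sequences):
    """Return indices of alignment columns holding more than one character."""
    lengths = {len(s) for s in sequences}
    if len(lengths) > 1:
        raise ValueError("all sequences must have the same length")
    # zip(*sequences) yields one tuple per alignment column.
    return [i for i, column in enumerate(zip(*sequences))
            if len(set(column)) > 1]


def keep_columns(sequences, indices):
    """Return new sequence strings containing only the given columns."""
    return ["".join(s[i] for i in indices) for s in sequences]


# Toy alignment (hypothetical data): columns 1 and 4 vary.
aligned = ["ACGTA", "ACGTT", "AAGTA"]
sites = polymorphic_columns(aligned)
reduced = keep_columns(aligned, sites)
print(sites)
print(reduced)
```

With real data the strings would come from the records of an alignment,
for example str(record.seq) for each record.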

>>> 2) When outputting long alignments in Nexus format, MrBayes refuses
>>> to read the resulting files saying that the maximum line length is 19900
>>> characters.
>>> I'm assuming that is not the maximum input to MrBayes and that it can
>>> handle longer alignments if they are split in some way. Would it be
>>> possible for Bio.Nexus to split alignments in the appropriate format?
>>>
>>
>> Are you outputting the large alignment using Bio.AlignIO or using
>> Bio.Nexus directly?
>>
>
> In this case I was using Bio.Nexus but it would be the same with
> Bio.AlignIO.
>

Did you ask Bio.Nexus to write interleaved output?
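
To illustrate why interleaving helps with a line-length limit, here is a
rough sketch in plain Python of how an interleaved matrix breaks each
sequence into blocks of columns, so that no line grows with the alignment
length. This is not the Bio.Nexus implementation; the names
interleaved_blocks and format_matrix, and the width value, are assumptions
chosen for the example.

```python
def interleaved_blocks(records, width=70):
    """Yield one list of (name, chunk) pairs per block of `width` columns."""
    length = len(records[0][1])
    for start in range(0, length, width):
        yield [(name, seq[start:start + width]) for name, seq in records]


def format_matrix(records, width=70):
    """Format (name, sequence) pairs as blank-line separated blocks."""
    # Pad names so the sequence chunks line up in every block.
    pad = max(len(name) for name, _ in records) + 1
    blocks = []
    for block in interleaved_blocks(records, width):
        lines = [name.ljust(pad) + chunk for name, chunk in block]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


# Toy records (hypothetical data): 20 columns split into blocks of 8.
records = [("taxon_a", "ACGT" * 5), ("taxon_b", "ACGA" * 5)]
text = format_matrix(records, width=8)
print(text)
```

Every output line is at most the padded name width plus the block width,
however long the alignment is.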

I've got MrBayes 3.1.2, and this change to the Nexus writer in
Bio.AlignIO seems to fix your example:

diff --git a/Bio/AlignIO/NexusIO.py b/Bio/AlignIO/NexusIO.py
index 72550b1..c3b1649 100644
--- a/Bio/AlignIO/NexusIO.py
+++ b/Bio/AlignIO/NexusIO.py
@@ -107,7 +107,7 @@ class NexusWriter(AlignmentWriter):
         n.alphabet = alignment._alphabet
         for record in alignment:
             n.add_sequence(record.id, record.seq.tostring())
-        n.write_nexus_data(self.handle)
+        n.write_nexus_data(self.handle, interleave=True)

     def _classify_alphabet_for_nexus(self, alphabet):
         """Returns 'protein', 'dna', 'rna' based on the alphabet (PRIVATE).


Does that work for you?

Peter


