[Biopython-dev] SeqIO Abi Parser

Peter Cock p.j.a.cock at googlemail.com
Fri Jul 29 16:20:23 UTC 2011


Hi again,

I had a bit of time this afternoon so I looked at this.

On Fri, Jul 29, 2011 at 1:14 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Fri, Jul 29, 2011 at 12:34 PM, Wibowo Arindrarto wrote:
>> Hi Peter,
>> Thanks for explaining. I understand why we should stick to the stored
>> sequence id. In this case, we can use the filename as SeqRecord.name as
>> well. Regarding BioPerl, I don't have it installed myself -- but I took a
>> quick look at their source and it seems they also use the stored sequence ID
>> as their main identifier instead of the filename. If the stored sequence ID
>> is not present, it's "(unknown)" in their case.
>
> OK good, that means Biopython, BioPerl and EMBOSS should be
> consistent :)

I've made that switch.

>> I'll look on the test_SeqIO.py over the weekend. I think it'll have
>> something to do with some ambiguous dna base stored in the abi files.
>> Regards,
>
> Some of the alphabet stuff is a bit nasty - so please feel free to ask
> or get me to help.

I've done enough to get the test_SeqIO.py unit test to pass.

We probably need a check (like the one in the SFF parser) that the user
hasn't given us a handle opened in text mode. That should probably have
a unit test too.
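
A minimal sketch of what such a check could look like, assuming Python 3
handle semantics. The helper name check_binary_handle is hypothetical; this
is not the check used in the SFF parser, just one way to detect a text mode
handle without consuming any data:

```python
import io


def check_binary_handle(handle):
    """Return the handle, or raise ValueError if it looks like text mode.

    Hypothetical helper for illustration only.
    """
    # Reading zero bytes consumes nothing from the handle. A binary handle
    # returns b"" while a text handle returns "", so the two can be told
    # apart even when the handle has no usable .mode attribute.
    if handle.read(0) != b"":
        raise ValueError(
            "ABI files must be opened in binary mode, e.g. open(path, 'rb')."
        )
    return handle


# Stand-ins for real files: BytesIO behaves like a handle opened with "rb",
# StringIO like one opened with "r".
check_binary_handle(io.BytesIO(b"ABIF"))  # accepted
try:
    check_binary_handle(io.StringIO("ABIF"))
except ValueError as err:
    print("Rejected:", err)
```

The try/except at the end is roughly what the unit test would assert: a
binary handle passes through unchanged and a text handle raises ValueError.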

I still haven't cross-checked the sequence and PHRED scores from
your code against EMBOSS.

Anyway - I'll leave the code for you to work on for now...

Peter


