[Biopython-dev] SeqIO Abi Parser
p.j.a.cock at googlemail.com
Fri Jul 29 16:20:23 UTC 2011
I had a bit of time this afternoon so I looked at this.
On Fri, Jul 29, 2011 at 1:14 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> On Fri, Jul 29, 2011 at 12:34 PM, Wibowo Arindrarto wrote:
>> Hi Peter,
>> Thanks for explaining. I understand why we should stick to the stored
>> sequence id. In this case, we can use the filename as SeqRecord.name as
>> well. Regarding BioPerl, I don't have it installed myself -- but I took a
>> quick look at their source and it seems they also use the stored sequence ID
>> as their main identifier instead of the filename. If the stored sequence ID
>> is not present, it's "(unknown)" in their case.
> OK good, that means Biopython, BioPerl and EMBOSS should be
> consistent :)
I've made that switch,
>> I'll look at test_SeqIO.py over the weekend. I think it'll have
>> something to do with some ambiguous DNA base stored in the ABI files.
> Some of the alphabet stuff is a bit nasty - so please feel free to ask
> or get me to help.
I've done enough to get the test_SeqIO.py unit test to pass.
We probably need a check (like in SFF) that the user hasn't given
a handle opened in text mode. That should probably have a unit test.
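
For illustration only, here is a rough sketch of what such a check and its
unit test might look like. It assumes Python 3, where a zero-length read
returns bytes from a binary handle and str from a text handle. The names
check_binary_handle and BinaryHandleTests are hypothetical, and this is not
the check the SFF parser actually uses.

```python
import io
import unittest


def check_binary_handle(handle, fmt="ABI"):
    """Return the handle unchanged, or raise ValueError if it is in text mode.

    Hypothetical helper: a zero-length read consumes nothing, but its
    return type reveals whether the handle yields bytes or str.
    """
    if handle.read(0) != b"":
        raise ValueError(
            "%s files must be opened in binary mode, e.g. open(filename, 'rb')" % fmt
        )
    return handle


class BinaryHandleTests(unittest.TestCase):
    """Sketch of the unit test mentioned above, using in-memory handles."""

    def test_binary_handle_accepted(self):
        handle = check_binary_handle(io.BytesIO(b"ABIF"))
        # The check must not consume any data from the handle.
        self.assertEqual(handle.read(4), b"ABIF")

    def test_text_handle_rejected(self):
        with self.assertRaises(ValueError):
            check_binary_handle(io.StringIO("ABIF"))
```

The test case can be run with python -m unittest. On Python 2 a different
test would be needed (for example inspecting handle.mode), since str and
bytes are the same type there.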
I still haven't cross checked the sequence and PHRED scores from
your code and EMBOSS.
Anyway - I'll leave the code for you to work on for now...