[Biopython] Fw: Hiding alphabets

Peter Cock p.j.a.cock at googlemail.com
Tue Jul 3 13:31:18 UTC 2018


Thank you to everyone who has commented so far on the issue:

https://github.com/biopython/biopython/issues/1674

We do not yet have consensus on what to do with the alphabets,
and therefore on whether we should hide them or not.

However, I am proposing to hide the default alphabet from the
Seq objects' __repr__ as implemented here:

https://github.com/biopython/biopython/pull/1676

I originally suggested this for Biopython 1.72 (which is now out).
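
As a rough illustration of the behaviour being proposed, here is a toy
sketch. It is not the code in the pull request, and the class names
(Alphabet, DNAAlphabet, SketchSeq) are made up for this example rather
than taken from Biopython. It assumes that "default alphabet" means the
generic base Alphabet, and that any more specific alphabet is still shown:

```python
class Alphabet:
    """Hypothetical stand-in for a generic (default) alphabet."""

    def __repr__(self):
        return f"{type(self).__name__}()"


class DNAAlphabet(Alphabet):
    """Hypothetical stand-in for a more specific alphabet."""


class SketchSeq:
    """Toy sequence class whose repr hides the default alphabet."""

    def __init__(self, data, alphabet=None):
        self.data = data
        # Fall back to the generic alphabet when none is given.
        self.alphabet = alphabet if alphabet is not None else Alphabet()

    def __repr__(self):
        # Show the alphabet only when it is more specific than the default.
        if type(self.alphabet) is Alphabet:
            return f"{type(self).__name__}({self.data!r})"
        return f"{type(self).__name__}({self.data!r}, {self.alphabet!r})"


print(repr(SketchSeq("AGTACACTGGT")))                 # default alphabet hidden
print(repr(SketchSeq("AGTACACTGGT", DNAAlphabet())))  # specific alphabet shown
print(repr(SketchSeq("AGTACACTGGT").alphabet))        # attribute still available
```

The point of the sketch is only that the alphabet stays on the object as
an attribute, while the repr stops mentioning it when it carries no
information.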

Peter

On Tue, Jun 5, 2018 at 12:19 PM, Peter Cock <p.j.a.cock at googlemail.com> wrote:
> Dear Biopythoneers,
>
> I know that Michiel has long expressed a wish to get rid of
> the current alphabet system in Biopython, and I agree that
> the historic design is overly complicated and often gets in
> the way - but we don't have any concrete proposals to replace
> it. Part of the problem here is coming up with a replacement
> with the least painful transition - and being practical, the fewer
> people use the alphabets, the less trouble any changes would
> cause.
>
> The proposal here would de-emphasize the use of alphabets,
> reflecting the fact that for the vast majority of scripts and
> code you can just ignore them.
>
> There are still corner cases - for example, for some of the
> SeqIO output filetypes we currently need to use the Seq's
> alphabet to label the sequence type (RNA, DNA, protein).
>
> Still, overall I can see it being quite practical to downplay
> the alphabet objects in our user facing documentation,
> and hiding them in the Seq objects' __repr__ helps there.
>
> Is this a case where, in the Zen of Python, practicality
> wins out over being explicit about what a sequence object
> contains?
>
> "Explicit is better than implicit.
> ...
> Although practicality beats purity."
>
> https://www.python.org/dev/peps/pep-0020/
>
> Thoughts and comments welcome here or on the issue,
> https://github.com/biopython/biopython/issues/1674
>
> Peter
>
> On Sun, Jun 3, 2018 at 1:02 PM, Michiel de Hoon <mjldehoon at yahoo.com> wrote:
>> Dear all,
>>
>> I have opened an issue here:
>> https://github.com/biopython/biopython/issues/1674
>> in case anybody has any comments or suggestions.
>>
>> Best,
>> -Michiel
>>
>>
>>
>> On Saturday, May 26, 2018 11:50 PM, Michiel de Hoon <mjldehoon at yahoo.com>
>> wrote:
>>
>>
>> Dear all,
>>
>>
>>
>> In Biopython, Seq objects show both their sequence content and the alphabet
>> associated with them.
>> For example, the first example in our Biopython Tutorial & Cookbook starts
>> as follows:
>>
>> >>> from Bio.Seq import Seq
>> >>> my_seq = Seq("AGTACACTGGT")
>> >>> my_seq
>> Seq('AGTACACTGGT', Alphabet())
>>
>> I don't think we need to show the alphabet here. It takes up screen space,
>> and oftentimes it's uninformative (as in the example above); the other
>> examples in the same section of the tutorial show SingleLetterAlphabet and
>> IUPACAmbiguousDNA. Even in the latter case, I don't think users need to be
>> reminded every time that they are dealing with DNA.
>>
>> Perhaps more importantly, this is very confusing for new users. I would say
>> that alphabets are of minor importance in Biopython overall. Some might say
>> that they should be abolished altogether. But if we start off our tutorial
>> by showing Alphabet, IUPAC.unambiguous_dna, SingleLetterAlphabet etc., then
>> a reasonable question from students would be what they are and why we use
>> them. I don't have a good answer to that question.
>> In addition, the design of the Alphabet class is problematic.
>>
>> Shall we change the __repr__ function of Seq objects to show the sequence
>> only? I.e. the example above would show
>>
>> >>> from Bio.Seq import Seq
>> >>> my_seq = Seq("AGTACACTGGT")
>> >>> my_seq
>> Seq('AGTACACTGGT')
>>
>> Then the section on alphabets in the Tutorial can move to the end of the
>> chapter, for people who actually want to use Alphabets.
>>
>> For each sequence object, the alphabet would still be accessible as an
>> attribute of the Seq object:
>>
>> >>> my_seq.alphabet
>> Alphabet()
>>
>>
>> Best,
>> -Michiel
>>
>> _______________________________________________
>> Biopython mailing list  -  Biopython at mailman.open-bio.org
>> http://mailman.open-bio.org/mailman/listinfo/biopython

