[Biopython-dev] GSoC python variant update

Lenna Peterson arklenna at gmail.com
Wed Aug 8 22:39:48 UTC 2012


On Wed, Aug 8, 2012 at 5:58 PM, Laurent Gautier <lgautier at gmail.com> wrote:
> On 2012-08-08 20:44, Lenna Peterson wrote:
>>
>> On Wed, Aug 8, 2012 at 12:37 PM, Laurent Gautier <lgautier at gmail.com>
>> wrote:
>>>
>>> On 2012-08-08 18:00, biopython-dev-request at lists.open-bio.org wrote:
>>>
>>>>>> * In order to customize the display of positions (e.g. 0-based or
>>>>>> 1-based), I'm using a class as a configuration container. I've read on
>>>>>> StackOverflow that attempts to use globals or a singleton class are
>>>>>> discouraged in Python, but I have not found practical suggestions for
>>>>>> how to implement module-wide configurations. Suggestions are welcome.
>>>
>>>
>>> Module-wide configuration can be implemented as variables, as long as they
>>> are declared before the functions using them.
>>> If considering a package rather than a single module, options can be stored
>>> in a module dedicated to options (since Python modules are singletons).
>>>
>> Hi Laurent,
>>
>> I really like the idea of a configuration module. I will definitely
>> move in that direction.
>>
>>>> With configuration items like this, you have two choices:
>>>>
>>>> - A global variable.
>>>> - Pass the configuration to every function that needs it.
>>>>
>>>> There are tradeoffs with both approaches, but for this case I agree with
>>>> your decision to use globals. Most people will want 0-based/Biopython
>>>> style but it gives those who don't a knob to switch over.
>>>
>>>
>>> I'd argue that allowing users to switch is an invitation to spectacular
>>> issues down the road.
>>> An easy, yet frightening, example would be the case where using third-party
>>> code (such as a module) changes this without you knowing.
>>>
>>> Another scary thought is that this would amount to bringing the infamous
>>> Perl variable "$[" to Python. Go explain again that folks should use Python
>>> for its elegance and simplicity after that.
>>>
>>>
>> Yikes. My approach will not be comparable to $[. For starters, it
>> wouldn't modify the behavior of every sequence-like object.
>>
>> My current thought would be to store the 0-based position in an
>> attribute `pos`, have a property `pos_str` that returns `pos` +
>> `Config.index`. For representations, `__str__` will return `pos_str`,
>> and `__repr__` will return `pos` (always 0-based). Math would always
>> use the 0-based position.
>>
>> I intend to keep the influence of the hypothetical mapping Config
>> module limited to Biopython Seq* objects. It should also be possible
>> to make a kill switch, namely, a version of the Config module where
>> all of the settings are neutral to adding (i.e. `def __add__(self,
>> other): return other`).
>
>
> What about making the design decision that string representations are
> 1-based then, and going beyond making a kill switch by just killing the
> switch? You'd document it, and folks who want 0-based positions would cook up
> their own function(s).
>
> I think that configuration modules can be very useful for an application (an
> example here:
> http://flask.pocoo.org/snippets/2/ ), but I am more reserved about their use
> in a library.
>
> But do not let me stop you from pursuing this; I am only expressing an
> opinion. One last point though.
> Let me describe a possible scenario:
>
> 3rd-party module "foo" is using the Biopython Seq* part, and its author
> thinks that Config.index should be 1, so he/she sets it accordingly.
> An early line in foo.py is:
> from somewhere.in.biopython.seq import config
> config.index = 1
>
> There is another piece of code (let's call it bar.py), written by someone
> else or by the same person at a different time. Now the hype is all about
> 0-based indexes, so the author sets it to be sure:
> from somewhere.in.biopython.seq import config
> config.index = 0
>
> To complete the scenario, bar.py is using foo.py, or the other way around.
> The dependency of one on the other does not even have to be direct. Now
> config.index will be what the last piece of code sets it to, although other
> parts of the code might assume it is set to something else.
>
> That sort of situation is not prevented from happening with any sort of
> module in Python (e.g., import sys; sys.stdout = sys.stderr), but people
> know they should not do it. Here the config.index would appear as something
> people should change if they like.
>
> Again, that's just an opinion. Others might differ.
>
> Best,
>
>
> Laurent
>
>
>>
>> Please let me know if this would not fully address your concerns.
>>
>> Cheers,
>>
>> Lenna
>
>
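
The scenario quoted above can be reproduced in a few lines. This is only a
minimal sketch, not code from the project: the `config` object is a stand-in
built with `types.ModuleType` (a real package would import a shared options
module instead), the `Position` class follows the `pos` / `pos_str` design
described earlier in the thread, and `foo_setup()` / `bar_setup()` are
hypothetical names standing in for the import-time assignments in foo.py and
bar.py.

```python
import types

# Hypothetical stand-in for a shared options module. Every importer of a real
# module receives the same object, which is what makes the setting global.
config = types.ModuleType("config")
config.index = 0  # offset added to the 0-based position for display


class Position:
    """Stores a 0-based position; its string form depends on config.index."""

    def __init__(self, pos):
        self.pos = pos  # always 0-based; arithmetic would use this value

    @property
    def pos_str(self):
        return str(self.pos + config.index)

    def __str__(self):
        return self.pos_str

    def __repr__(self):
        # The exact repr format is an assumption; it only needs to stay 0-based.
        return "Position(%d)" % self.pos


def foo_setup():
    # What foo.py does at import time in the scenario.
    config.index = 1


def bar_setup():
    # What bar.py does at import time in the scenario.
    config.index = 0


p = Position(41)
foo_setup()
seen_by_foo = str(p)      # foo's author expects 1-based output
bar_setup()
seen_after_bar = str(p)   # same object, but the display has silently changed
```

Whichever setup function runs last decides what every caller sees, including
code that assumed the other value.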


Laurent,

I must thank you again for your foresight. I am realizing I may have
gotten carried away with configurability. My initial goal with the
index setting was to enable both GenBank and HGVS representations of
genomic positions; a much simpler and safer approach would be to have
`to_genbank()` and `to_hgvs()` methods. A user could set the relevant
objects' __str__ to either of those.

Cheers,

Lenna
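
The explicit-method approach proposed in this reply might look like the sketch
below. It is an illustration under stated assumptions rather than the project's
implementation: the `Position` class is hypothetical, and the output formats are
deliberately simplified (a bare 1-based number for GenBank, a 1-based number
with a "g." prefix for genomic HGVS). Note that Python looks up `__str__` on the
class, not the instance, so the choice of representation has to be made on a
class; the hypothetical `HGVSPosition` subclass does this without touching the
base class.

```python
class Position:
    """Stores a 0-based position and offers explicit, fixed representations."""

    def __init__(self, pos):
        self.pos = pos  # always 0-based internally

    def to_genbank(self):
        # Simplified assumption: a single GenBank position shown 1-based.
        return str(self.pos + 1)

    def to_hgvs(self):
        # Simplified assumption: a genomic HGVS position, 1-based, "g." prefix.
        return "g.%d" % (self.pos + 1)

    def __repr__(self):
        return "Position(%d)" % self.pos


class HGVSPosition(Position):
    # Assigning on a class works because str() looks up __str__ on the type.
    __str__ = Position.to_hgvs


p = Position(41)
genbank = p.to_genbank()
hgvs = p.to_hgvs()
as_text = str(HGVSPosition(41))
```

Because no shared state is involved, one module picking the HGVS form cannot
change what another module sees.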


