[Biopython-dev] [biopython-dev] SeqFeature comparison for equality

Joshua Ismael Haase Hernández hahj87 at gmail.com
Mon Oct 17 17:57:53 UTC 2011


On 17 October 2011 at 12:15, Peter Cock <p.j.a.cock at googlemail.com> wrote:

> Hi Joshua,
>
> Could you CC the biopython-dev mailing list, unless you
> specifically want to discuss something in private?
>

Sorry about that, I thought I was replying to the mailing list.

>
> 2011/10/17 Joshua Ismael Haase Hernández <hahj87 at gmail.com>:
> > I'm on it.
> >
> > Will add __eq__ to FeatureLocation on trunk.
>
> Great.
>
> In the short term, you can just work on it directly with a copy of the
> official repository and send me a patch (use git diff > file.patch)
>
> The "best" way is to fork biopython on github, and create your
> own branch with these changes.
>
> > I think BeforeLocation should check if the second is before,
> > After check if it is after, etc, and this can be done in locations.
> >
> > Before I implement those: do you agree?
> >
> > In that case, AbstractLocation instances
> > should check whether ExactLocation instances fall
> > inside their range, and two AbstractLocation
> > instances should match only if they are exactly the same.
>
>
These positions would be the same:

OneOfPosition(5, 11, 15),
ExactPosition(11),
AfterPosition(4),
BeforePosition(16),
WithinPosition(5, 16),
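
The case listing could be sketched roughly as follows. This is only an
illustration: Exact, Before, After, Within and OneOf and their could_be
method are made-up stand-ins, not the real Bio.SeqFeature classes, and
Within(5, 16) is assumed to mean the inclusive range 5 to 16.

```python
# Hypothetical stand-ins; these are NOT Biopython's real position classes.
class Exact:
    def __init__(self, position):
        self.position = position

    def could_be(self, value):
        return value == self.position


class Before:
    def __init__(self, limit):
        self.limit = limit

    def could_be(self, value):
        # The true coordinate lies somewhere before `limit`.
        return value < self.limit


class After:
    def __init__(self, limit):
        self.limit = limit

    def could_be(self, value):
        # The true coordinate lies somewhere after `limit`.
        return value > self.limit


class Within:
    def __init__(self, low, high):
        self.low, self.high = low, high

    def could_be(self, value):
        # Assumption: an inclusive low..high range.
        return self.low <= value <= self.high


class OneOf:
    def __init__(self, *choices):
        self.choices = choices

    def could_be(self, value):
        return value in self.choices


positions = [OneOf(5, 11, 15), Exact(11), After(4), Before(16), Within(5, 16)]

# Every position in the list is compatible with coordinate 11 ...
all_match_11 = all(p.could_be(11) for p in positions)
# ... but not with 15, because Exact(11) rules it out.
all_match_15 = all(p.could_be(15) for p in positions)
```

Note that this checks whether each position is compatible with a given
coordinate, which is a weaker statement than the positions being equal.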


> No. Having tried this myself, it is very complicated.
>

I think I'm missing something: why is it hard?
I see it as a matter of listing the cases.


> Also, there are constraints with the Python language
> about equality, hashing and comparisons (e.g. for
> membership in lists, or use as dictionary keys).
>
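
For reference, the Python constraint mentioned above can be shown with a
small sketch. Position and EqOnly are hypothetical stand-in classes, not
Biopython code. The rule is that objects which compare equal must also
have equal hashes, otherwise sets and dictionaries misbehave.

```python
class Position:
    """Hypothetical stand-in for a position class; not Biopython's API."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Position):
            return NotImplemented
        return self.value == other.value

    def __hash__(self):
        # Must agree with __eq__: equal objects need equal hashes.
        return hash(self.value)


class EqOnly:
    """Defines __eq__ but no __hash__."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, EqOnly) and self.value == other.value


a, b = Position(11), Position(11)
in_list = a in [b]            # list membership relies on __eq__
in_set = a in {b}             # set membership relies on __hash__, then __eq__
lookup = {a: "feature"}[b]    # dictionary keys work the same way as sets

# In Python 3, defining __eq__ alone makes the class unhashable.
eq_only_unhashable = EqOnly.__hash__ is None
```

A fuzzy notion of equality is generally not transitive (a position before
16 is compatible with both 11 and 3, yet 11 and 3 are not equal), which
makes a meaningful __hash__ hard to define.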

I don't think anyone should use features as dictionary keys;
they will use the feature ID for that. But maybe someone wants a
set of features (which at the moment behaves like a list of all sequences)...

In which cases would that be a problem? (I'm a biotechnology
engineer, so I don't see all the caveats, and I don't really have
a deep understanding of how Python works.)

> The current behaviour of simple comparison of
> the positions as an integer is at least simple.
>
> > About SeqFeature, I think they should be
> > the same if they share all locations.
>
> You don't care about feature type and ID?  ;)
>

Maybe not. A comparison could skip iterating over
the locations if both features have the same type and ID.
I'm still not sure that's a good method (hence the comment
«# Can we trust this?» in my patch), but a 'CDS' feature
is sometimes equivalent to an 'mRNA' feature, and
in that case both ID and type would differ
between the SeqFeatures.
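
A minimal sketch of that comparison, under stated assumptions: Feature
and same_feature are hypothetical names, locations are simplified to
(start, end) integer pairs, and the trust_type_and_id flag stands for the
shortcut whose reliability is in question. None of this is the actual
patch or the Bio.SeqFeature API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Feature:
    """Hypothetical, simplified stand-in for a SeqFeature."""
    type: str
    id: str
    locations: Tuple[Tuple[int, int], ...]  # (start, end) pairs


def same_feature(a, b, trust_type_and_id=False):
    # Optional shortcut: skip the locations when type and ID both match.
    if trust_type_and_id and a.type == b.type and a.id == b.id:
        return True
    # Otherwise two features match when they share all locations.
    return a.locations == b.locations


cds = Feature("CDS", "cds1", ((10, 50), (80, 120)))
mrna = Feature("mRNA", "rna1", ((10, 50), (80, 120)))
other = Feature("CDS", "cds2", ((10, 50),))

cds_matches_mrna = same_feature(cds, mrna)    # same locations, different type/ID
cds_matches_other = same_feature(cds, other)  # different locations
```

With location-only comparison the 'CDS' and 'mRNA' features match even
though their type and ID differ, which is the case described above.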

>
> Peter
>



