<div dir="ltr"><div>Hello Jan,</div><div><br></div><div>Yes, we have talked about this - see e.g. <a href="https://github.com/biopython/biopython/issues/897">https://github.com/biopython/biopython/issues/897</a></div><div><br></div><div>That issue lists a couple of workarounds, but perhaps you'd like to comment on whether we need biological start/end properties as well, and how you would name them?<br></div><div><br></div><div>This probably depends on what you want to use the values for - the main use case I can think of is extracting the described sequence, which the extract method handles for you.</div><div><br></div><div>For other uses, such as drawing and finding overlaps, I think the current left/right style start/end are more useful.<br></div><div><br></div><div>Peter<br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Aug 31, 2023 at 6:42 AM Jan T. Kim &lt;<a href="mailto:jttkim@googlemail.com">jttkim@googlemail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi All,<br>
<br>
I've recently encountered features in circular sequences that start near<br>
the end of the (probably arbitrarily) linearised sequence and end near<br>
its start. For an example see the first CDS feature in [1] (locus tag<br>
"X600_gp001"):<br>
<br>
join(139629..139738,1..196)<br>
<br>
To my surprise, the start attribute of this feature's location is 0,<br>
and its end attribute is the end of the sequence:<br>
<br>
>>> f1.location.start<br>
ExactPosition(0)<br>
>>> f1.location.end<br>
ExactPosition(139738)<br>
<br>
So by using the start and end positions of the feature, without checking<br>
whether its location is compound and going through the parts in this<br>
case, it appears that the feature is comprised of the entire sequence (!!).<br>
<br>
Technically, the findings above are consistent with the documentation which<br>
states that start and end give the minimal and maximal positions occurring in<br>
a feature, respectively.<br>
<br>
This behaviour is not quite consistent with my expectations in this case,<br>
however. Is there any way (attribute, method or whatever) to detect whether<br>
a feature straddles the cut point of a circular sequence? I realise that<br>
when taking non-exact positions into account and when making no assumptions<br>
about the ordering of parts, such a check can be difficult and may not<br>
have a well defined result in all cases, but on the other hand I don't<br>
think it's likely that I'm the first person requiring such a check...?<br>
<br>
My main objective with this post is to find out whether there's anything<br>
in Biopython that does this type of job already. If there isn't I'll<br>
code up some heuristic.<br>
<br>
Best regards, Jan<br>
<br>
<br>
[1] <a href="https://www.ncbi.nlm.nih.gov/nuccore/NC_022920.1/" rel="noreferrer" target="_blank">https://www.ncbi.nlm.nih.gov/nuccore/NC_022920.1/</a><br>
<br>
_______________________________________________<br>
Biopython mailing list - <a href="mailto:Biopython@biopython.org" target="_blank">Biopython@biopython.org</a><br>
<a href="https://mailman.open-bio.org/mailman/listinfo/biopython" rel="noreferrer" target="_blank">https://mailman.open-bio.org/mailman/listinfo/biopython</a><br>
</blockquote></div>
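<div><br></div><div>For reference, the kind of heuristic Jan mentions could look roughly like the sketch below. It is not part of Biopython: the function name <code>straddles_origin</code> and the plain (start, end) tuple representation are assumptions made for illustration, and it only handles exact positions with parts listed in their join() order. With Biopython, the tuples could presumably be built from <code>feature.location.parts</code>, as noted in the comment.</div>

```python
from typing import List, Tuple

# (start, end) in Python-style coordinates: 0-based, end-exclusive,
# i.e. the values that location.start / location.end report.
Part = Tuple[int, int]


def straddles_origin(parts: List[Part], seq_len: int) -> bool:
    """Heuristic check: do two consecutive parts meet across the cut point?

    Hypothetical helper, not a Biopython function. Assumes exact positions
    and that parts are listed in the order they appear in the location.
    """
    for (s1, e1), (s2, e2) in zip(parts, parts[1:]):
        # Forward order: one part runs to the end, the next begins at 0.
        forward = e1 == seq_len and s2 == 0
        # Opposite order (as may occur for reverse-strand features).
        reverse = s1 == 0 and e2 == seq_len
        if forward or reverse:
            return True
    return False


# join(139629..139738,1..196) on a sequence of length 139738.
# With Biopython these tuples could be obtained (assumption) as:
#   parts = [(int(p.start), int(p.end)) for p in feature.location.parts]
parts = [(139628, 139738), (0, 196)]
print(straddles_origin(parts, 139738))                     # True
print(straddles_origin([(99, 200), (299, 400)], 139738))   # False
```

<div>A single-part location never counts as straddling here, and a join whose parts merely touch both ends of the sequence without being adjacent in the list is not flagged; whether that is the right call depends on the data.</div>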