[Biopython] Updating feature Location in Seqfiles

Peter Cock p.j.a.cock at googlemail.com
Wed Nov 11 23:11:31 UTC 2020


Hi Franz,

That's something which I think catches out a lot of people learning Python.
I'm glad to hear that this seems to have fixed things.
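
A minimal sketch of the pitfall in plain Python, with no Biopython needed.
The Feature class is a hypothetical stand-in for a SeqFeature, and
copy.deepcopy is just one way of taking an independent snapshot (reloading
the file into a second variable, as you did, is another):

```python
import copy


class Feature:
    """Hypothetical stand-in for a SeqFeature; only holds a start and an end."""

    def __init__(self, start, end):
        self.start = start
        self.end = end


features = [Feature(0, 10), Feature(20, 30)]

alias = features                    # a second name for the SAME list
shallow = list(features)            # a new list, but the SAME Feature objects
snapshot = copy.deepcopy(features)  # a new list AND new Feature objects

features[0].start = 5               # edit one feature in place

print(alias is features)            # True
print(shallow[0].start)             # 5, the shallow copy shares the objects
print(snapshot[0].start)            # 0, the deep copy kept the old value

# Indices of the features whose coordinates differ from the snapshot.
changed = [
    i for i, (old, new) in enumerate(zip(snapshot, features))
    if (old.start, old.end) != (new.start, new.end)
]
print(changed)                      # [0]
```

Note that a shallow copy is not enough when the objects inside the list are
themselves modified, which is what assigning a new .location does.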

Happy to have helped,

Peter

On Wed, Nov 11, 2020 at 11:04 PM F.L. Ratzkowski <flr28 at cam.ac.uk> wrote:

> Hi,
> Thanks so much, I think I just solved it. Apparently I was sitting on a
> finished product, and it was exactly the CompareBefore/CompareAfter
> comparison that was the problem. I am so happy after 2 days of struggling.
> When I assigned SeqRecord.features to CompareBefore, I thought this made a
> copy, but apparently it was still linked, so CompareBefore was also updated
> by the later modifications of the SeqRecord, and CompareAfter ended up
> equal to CompareBefore. Now, loading the same SeqRecord again into a new
> variable, SeqRecord2, as a reference that does not get updated, gives me
> all the hits.
> I hope you understand.
>
> Thanks for putting in your time and sorry for my bad spelling.
> Best wishes,
> Franz
>
> Franz L. Böge
> PhD Student, University of Cambridge
> Chin Group, MRC Laboratory of Molecular Biology
> Phone: +44 (0)1223 267604
>
>
>
>
> ------------------------------
> *From:* Peter Cock <p.j.a.cock at googlemail.com>
> *Sent:* 11 November 2020 22:41
> *To:* F.L. Ratzkowski <flr28 at cam.ac.uk>
> *Cc:* biopython at biopython.org <biopython at biopython.org>
> *Subject:* Re: [Biopython] Updating feature Location in Seqfiles
>
> Hi Franz,
>
> Perhaps this example will help explain what I meant?
>
> >>> old_list = ["A", "B", "C", "D"]
> >>> new_list = old_list
> >>> new_list[3] = "Z"
> >>> new_list == old_list
> True
> >>> old_list
> ['A', 'B', 'C', 'Z']
>
> In your example, CompareBefore, CompareAfter, and Seqfile.features are all
> the same list.
>
> Peter
>
>
> On Wed, Nov 11, 2020 at 8:26 PM F.L. Ratzkowski <flr28 at cam.ac.uk> wrote:
>
> Hi,
> First of all, thank you for your quick response. I do indeed mean
> SeqRecord, sorry for not being fully accurate. The SeqRecord is a whole
> E. coli genome with lots of features, and the changes I am making to the
> sequence are random mutations, which means that they also occur inside
> features and thereby affect the start and end positions differently, and
> the features also overlap. So I feel like the shifting or chunking method
> might not work. Or, if it would, could you please provide more detail?
>
> You state that 'both lists CompareBefore
> and CompareAfter will be the same list. Thus you'd find no differences.'
> Could you elaborate? My understanding of the FeatureLocation command was
> that it would overwrite the previous feature.location.
>
> Thanks again for taking the time.
> Best wishes,
> Franz
>
> Franz L. Böge
> PhD Student, University of Cambridge
> Chin Group, MRC Laboratory of Molecular Biology
> Phone: +44 (0)1223 267604
>
>
>
>
> ------------------------------
> *From:* Peter Cock <p.j.a.cock at googlemail.com>
> *Sent:* 11 November 2020 19:40
> *To:* F.L. Ratzkowski <flr28 at cam.ac.uk>
> *Cc:* biopython at mailman.open-bio.org <biopython at mailman.open-bio.org>
> *Subject:* Re: [Biopython] Updating feature Location in Seqfiles
>
> Hello,
>
> First what may be the simple answer: Looking at that code snippet, and
> guessing that Seqfile is a SeqRecord object, both lists CompareBefore
> and CompareAfter will be the same list. Thus you'd find no differences.
>
> More generally, I would recommend using slicing and addition of
> SeqRecord chunks, which will preserve your features and shift their
> coordinates accordingly - as long as each feature is fully within a chunk.
> The tutorial covers this.
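
A minimal sketch of the slicing-and-addition approach, assuming a made-up
30 bp record with a single feature (the sequence, id and coordinates are
invented for illustration):

```python
from Bio.Seq import Seq
from Bio.SeqFeature import FeatureLocation, SeqFeature
from Bio.SeqRecord import SeqRecord

# Toy record: 30 bases with one gene at 20..26 (Python-style coordinates).
record = SeqRecord(Seq("ACGT" * 7 + "AC"), id="toy")
record.features.append(
    SeqFeature(FeatureLocation(20, 26, strand=1), type="gene")
)

# Delete bases 5..10 by joining the two flanking chunks.
deleted = record[:5] + record[10:]
print(deleted.features[0].location)   # [15:21](+)

# Insert three bases at position 5.
inserted = record[:5] + Seq("NNN") + record[5:]
print(inserted.features[0].location)  # [23:29](+)

# A feature that straddles the cut point is not carried over by the slice.
print(len(record[:22].features))      # 0
```

The last line is the "fully within a chunk" caveat: features cut by a slice
boundary would have to be rebuilt by hand.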
>
> If you want to work more directly, you could try shifting a feature
> location by an offset by adding (or subtracting) an integer. (You could
> do something similar with the (private) _shift methods, but they are not
> intended for direct use.)
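
A small sketch of the offset approach, assuming a 12 bp insertion at
position 50 (both numbers and the two features are hypothetical). Adding an
integer to a location returns a new, shifted location:

```python
from Bio.SeqFeature import FeatureLocation, SeqFeature

features = [
    SeqFeature(FeatureLocation(10, 40, strand=1), type="gene"),
    SeqFeature(FeatureLocation(100, 250, strand=-1), type="gene"),
]

insert_at = 50  # hypothetical position where bases were inserted
offset = 12     # hypothetical number of bases inserted

for feature in features:
    # Only features lying entirely after the edit need to move.
    if feature.location.start >= insert_at:
        feature.location = feature.location + offset

print(features[0].location)  # [10:40](+)
print(features[1].location)  # [112:262](-)
```

For a deletion, the same pattern works with a negative offset. Features
that overlap the edited region are not handled by this sketch.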
>
> However, it looks like you're taking what I would consider to be the
> hardest option of rebuilding new location objects after all your sequence
> editing. This is only reasonably straightforward if you have just simple
> locations (like most bacterial features). Joins, origin wrapping, fuzzy
> locations etc. would need additional work.
>
> Peter
>
> On Wed, Nov 11, 2020 at 5:26 PM F.L. Ratzkowski <flr28 at cam.ac.uk> wrote:
>
> Hi,
> I am new here, so I am sorry if I am doing anything wrong; feel free to
> tell me.
> I am fairly new to Biopython as well. I have written a script
> that reads in a .gbk file and then deletes and inserts different parts of
> the sequence, which alters the absolute positions of the features. I then
> run a script that saves the new positions in 2 dataframes. When
> I now want to update all features with the FeatureLocation command, it
> seems not to work. I saved the features in a variable before and after
> updating them and was hoping to detect a difference, but it
> fails. Here is the basic part of the script I am struggling with:
>
> CompareBefore = Seqfile.features
>
> for z in range(0, len(featureEndList)):
>     feature_end = int(ELdf.iloc[z][1])
>     feature_start = int(SLdf.iloc[z][1])
>     feature_strand = Seqfile.features[z].location.strand
>     Seqfile.features[z].location = FeatureLocation(
>         feature_start, feature_end, strand=feature_strand)
>
> CompareAfter = Seqfile.features
>
> list = []
> for x in range(0,len(CompareAfter)):
>     if CompareBefore[x] != CompareAfter[x]:
>         list.append(x)
>
> I get 0 hits in this list.
> Please help 🙂 thank you
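
One way to make this kind of comparison behave as intended is to record the
coordinates as plain tuples before editing, so the "before" data cannot
change when the features do. A sketch with two toy features standing in for
Seqfile.features and made-up new coordinates; the coordinates() helper is
hypothetical and assumes simple (non-compound) locations:

```python
from Bio.SeqFeature import FeatureLocation, SeqFeature


def coordinates(features):
    """Return one (start, end, strand) tuple per feature."""
    return [
        (int(f.location.start), int(f.location.end), f.location.strand)
        for f in features
    ]


# Toy feature list standing in for Seqfile.features.
features = [
    SeqFeature(FeatureLocation(0, 9, strand=1), type="gene"),
    SeqFeature(FeatureLocation(20, 50, strand=-1), type="gene"),
]

before = coordinates(features)  # plain tuples, independent of the features

# Hypothetical new coordinates; only the second feature moves.
new_starts, new_ends = [0, 23], [9, 53]
for feature, start, end in zip(features, new_starts, new_ends):
    feature.location = FeatureLocation(
        start, end, strand=feature.location.strand
    )

after = coordinates(features)
changed = [i for i, (old, new) in enumerate(zip(before, after)) if old != new]
print(changed)  # [1]
```

The result list is called changed rather than list, so that the built-in
list type is not shadowed.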
>