[Biopython-dev] [Bug 3060] Add ungap method to the SeqRecord?

bugzilla-daemon at portal.open-bio.org
Tue Jun 22 13:11:15 UTC 2010


http://bugzilla.open-bio.org/show_bug.cgi?id=3060





------- Comment #3 from biopython-bugzilla at maubp.freeserve.co.uk  2010-06-22 09:11 EST -------
(In reply to comment #0)
> My motivating example is to take an ACE file loaded with SeqIO, remove the
> gaps, and output the contigs as FASTQ or QUAL files. This requires the
> per-letter-annotation to be sliced to match the ungapped sequence.
> 
> Likewise any features fully contained within ungapped regions should be
> retained and their co-ordinates shifted. I'm not sure if we should do anything
> about features spanning a gap - the simple option which I have implemented is
> they are lost. This is done via the existing SeqRecord slicing and addition
> code.

I've been experimenting with building SeqFeature objects for the reads in an
ACE file:
http://github.com/peterjc/biopython/tree/ace-reads

In this case, when I call the SeqRecord ungap method, many of my read features
are lost under the current implementation because they span gaps. This also
showed that the ungap code is quite slow at handling features. I'm going to
have another look at this.
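
To illustrate the behaviour described above, here is a minimal pure-Python
sketch. It is not the Biopython implementation: the function name
ungap_with_annotations, the use of "-" as the gap character, and the
representation of features as (start, end) half-open tuples are all
assumptions made for the example. It drops any feature that spans a gap and
shifts the rest, using a running count of non-gap letters so that each
feature needs only two lookups.

```python
def ungap_with_annotations(seq, quals, features, gap="-"):
    """Remove gaps, keeping per-letter values and features in step.

    Hypothetical helper for illustration only.
    seq      -- gapped sequence as a string
    quals    -- per-letter values, same length as seq
    features -- (start, end) half-open intervals on the gapped sequence
    """
    if len(seq) != len(quals):
        raise ValueError("per-letter annotation must match sequence length")

    # offsets[i] = number of non-gap letters before gapped position i
    offsets = [0]
    for letter in seq:
        offsets.append(offsets[-1] + (letter != gap))

    new_seq = "".join(letter for letter in seq if letter != gap)
    new_quals = [q for letter, q in zip(seq, quals) if letter != gap]

    new_features = []
    for start, end in features:
        # A feature is gap-free only if every position in it is a letter.
        if offsets[end] - offsets[start] != end - start:
            continue  # spans a gap: dropped, as in the simple option above
        new_features.append((offsets[start], offsets[end]))
    return new_seq, new_quals, new_features


seq, quals, feats = ungap_with_annotations(
    "AC-GT-A",
    [10, 20, 0, 30, 40, 0, 50],
    [(0, 2), (3, 5), (1, 4), (6, 7)],
)
print(seq)    # ACGTA
print(quals)  # [10, 20, 30, 40, 50]
print(feats)  # [(0, 2), (2, 4), (4, 5)]
```

The feature (1, 4) covers a gap and is therefore lost, which is the same
effect seen with read features that include gaps.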

Peter




