[Biopython] Working with genomic intervals

Laurent Gautier lgautier at gmail.com
Mon Aug 15 06:17:28 UTC 2011


On 2011-08-14 18:00, biopython-request at lists.open-bio.org wrote:
> On Sun, Aug 14, 2011 at 7:11 AM, Peter Cock<p.j.a.cock at googlemail.com>  wrote:
>> >  On Friday, August 12, 2011, Aaron Quinlan<aaronquinlan at gmail.com>  wrote:
>>> >>  All,
>>> >>
>>> >>  I apologize in advance if this is a naive question.
>>> >>  I am wondering if BioPython provides libraries for
>>> >>  working with genomic intervals in BED, GFF, or
>>> >>  any other like format? I am looking for libraries
>>> >>  that handle the parsing of files in these formats
>>> >>  into Python objects, as well as libraries for
>>> >>  manipulating (intersection, merging, counting,
>>> >>  etc.) intervals. I know this exists in Galaxy's
>>> >>  bx-python, but am wondering if there are similar
>>> >>  libraries in BioPython?
>>> >>
>>> >>  Gratefully,
>>> >>  Aaron
>> >
>> >  Hi Aaron,
>> >
>> >  Have a look at http://biopython.org/wiki/GFF_Parsing
>> >  where Brad is working on this. He's also spoken
>> >  highly of bx-python as I recall.
> I would second the bx-python vote.  Not only are the "normal" interval
> classes covered, but there are also some variants (clustering is one
> that comes to mind).
>
> Sean

One can also access the range utilities available in Bioconductor from
Python, for example through the Bioconductor extension to rpy2, or through
rpy2 directly (perhaps using its dynamic class mapping features, as shown below):

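from rpy2.robjects.packages import importr
from rpy2.robjects.methods import RS4, RS4Auto_Type

# load the R package IRanges
iranges = importr("IRanges")

# Python class IRanges as an API to Bioconductor's IRanges::IRanges
class IRanges(RS4):
    # Python 2 metaclass syntax; with Python 3 the metaclass is passed as
    # a class keyword instead: class IRanges(RS4, metaclass=RS4Auto_Type)
    __metaclass__ = RS4Auto_Type
    __rpackagename__ = "IRanges"
    __rname__ = "IRanges"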

# now in action

 >>> from rpy2.robjects.vectors import IntVector
 >>> ir = IRanges(iranges.IRanges(start = IntVector(range(10)), width = 11))
 >>> print(ir)
IRanges of length 10
      start end width
[1]      0  10    11
[2]      1  11    11
[3]      2  12    11
[4]      3  13    11
[5]      4  14    11
[6]      5  15    11
[7]      6  16    11
[8]      7  17    11
[9]      8  18    11
[10]     9  19    11
 >>> print(IRanges(ir.reduce__IRanges(ir)))
IRanges of length 1
     start end width
[1]     0  19    20
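
For readers without R at hand, the merge shown above can be approximated in
plain Python. This is only a rough sketch, not how IRanges implements it: the
function name reduce_intervals is made up for illustration, intervals are
assumed to be (start, end) pairs with both ends inclusive (as in the printout),
and adjacent ranges are merged as well, which I believe matches the default
behaviour of reduce but should be treated as an assumption.

```python
def reduce_intervals(intervals):
    """Merge overlapping or adjacent closed integer intervals.

    `intervals` is an iterable of (start, end) pairs with start <= end,
    both ends inclusive. Hypothetical helper, for illustration only.
    """
    merged = []
    for start, end in sorted(intervals):
        # Extend the last merged interval when the next one overlaps it
        # or starts right after it (assumption: adjacent ranges merge).
        if merged and start <= merged[-1][1] + 1:
            last_start, last_end = merged[-1]
            merged[-1] = (last_start, max(last_end, end))
        else:
            merged.append((start, end))
    return merged


# Same ranges as in the session above: starts 0..9, width 11.
ranges = [(start, start + 10) for start in range(10)]
print(reduce_intervals(ranges))
```

With the ten ranges from the session this gives the single interval (0, 19),
in agreement with the IRanges output. Sorting first keeps the merge to a
single pass over the data.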




