[Biopython] arrays in Biopython

Michiel de Hoon mjldehoon at yahoo.com
Sun Jan 12 12:56:01 UTC 2020


Dear all,
Currently there are four classes in Biopython that model an array in which letters can be used as indices:

Bio.Align.substitution_matrices: Array class 
Bio.Align.AlignInfo: PSSM class
Bio.Phylo.TreeConstruction: _Matrix class
Bio.motifs.matrix: GenericPositionMatrix

(and the FreqTable class in Bio.SubsMat.FreqTable is similar).

For example, the Array class in Bio.Align.substitution_matrices allows you to do things like

>>> from Bio.Align.substitution_matrices import Array
>>> a = Array("ACGT", dims=2)
>>> a
Array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]],
         alphabet='ACGT')

>>> a['C','A'] = 6
>>> a
Array([[0., 0., 0., 0.],
       [6., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]],
         alphabet='ACGT')
>>> sum(a['C'])
6.0
>>> a[3,'G'] = 1
>>> a['A',:] = 4
>>> a
Array([[4., 4., 4., 4.],
       [6., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 1., 0.]],
         alphabet='ACGT')
>>> sum(a[:, 'A'])
10.0
>>> from numpy import sin
>>> sin(a)
Array([[-0.7568025 , -0.7568025 , -0.7568025 , -0.7568025 ],
       [-0.2794155 ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.84147098,  0.        ]],
         alphabet='ACGT')



This class is implemented as a subclass of a numpy array. The big advantage is that the array acts as a numpy array: you can apply numpy functions to it and get back an array of the same class, as with sin(a) in the example above. Unfortunately, subclassing numpy arrays is not easy (see the code in Bio.Align.substitution_matrices).
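To give a rough idea of what such a subclass involves, here is a minimal sketch. It is not the code in Bio.Align.substitution_matrices; the class name LetterArray and the helper _translate are made up for illustration, and it assumes the alphabet is a plain string of single-letter keys.

```python
import numpy as np


class LetterArray(np.ndarray):
    """Hypothetical sketch: a numpy array that accepts letters as indices."""

    def __new__(cls, alphabet, dims=1, dtype=float):
        # One axis of length len(alphabet) per dimension, filled with zeros.
        shape = (len(alphabet),) * dims
        obj = np.zeros(shape, dtype=dtype).view(cls)
        obj.alphabet = alphabet
        return obj

    def __array_finalize__(self, obj):
        # numpy calls this for views, slices, and ufunc results,
        # so this is where the alphabet is carried over.
        if obj is None:
            return
        self.alphabet = getattr(obj, "alphabet", None)

    def _translate(self, key):
        # Replace letters by their position in the alphabet;
        # integers, slices, and anything else pass through unchanged.
        if isinstance(key, tuple):
            return tuple(self._translate(k) for k in key)
        if isinstance(key, str):
            return self.alphabet.index(key)
        return key

    def __getitem__(self, key):
        return super().__getitem__(self._translate(key))

    def __setitem__(self, key, value):
        super().__setitem__(self._translate(key), value)
```

With this sketch, a = LetterArray("ACGT", dims=2) followed by a['C','A'] = 6 works as in the session above, and numpy.sin(a) returns a LetterArray with the same alphabet. The sketch leaves out the hard parts: a slice such as a[1:3] still carries the full alphabet even though it only covers two letters, and nothing is done about printing, pickling, or reductions.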
Would it then make sense to make this class available as a general-purpose array class where strings can be used as indices? For example, inside a new Bio.math module.
Other modules in Biopython could then either make use of this class directly, or subclass it if needed.
Thanks,
-Michiel
