[BioPython] can biopython query KEGG directly?

Thu Mar 12 14:15:06 UTC 2009

On Thu, Mar 12, 2009 at 12:33 PM, Giovanni Marco Dall'Olio
<dalloliogm at gmail.com> wrote:
>> We still need a Bio.KEGG gene parser, see also:
>> http://bioperl.org/wiki/KEGG_sequence_format
>> http://lists.open-bio.org/pipermail/biopython/2008-January/004000.html
>> Once that is done, a KEGG wrapper in Bio.SeqIO would make sense.
>
> I am just curious: which object would a KEGG gene file be mapped to?
> A SeqRecord? And how, exactly? I suppose all the features would go in
> SeqRecord.features... but is there a standard convention for doing so?
> For example, the codon usage table, class, dblinks, and all the other
> fields: how would they be stored?

Bio.SeqIO only deals with SeqRecord objects.  If we had a KEGG gene
parser in Bio.KEGG (written in the same style as the rest of Bio.KEGG
ideally), then it would make sense to add a KEGG gene format to
Bio.SeqIO, where the KEGG gene records would be parsed using Bio.KEGG
and then converted into SeqRecord objects.  At a minimum this would
mean filling in each record's id, name, description and sequence; even
just that would still be useful, I feel.  For any richer annotation, the
convention is to mimic the GenBank parser as closely as possible.  See
http://biopython.org/wiki/SeqIO_dev
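
To make the "minimum" case concrete, here is a rough sketch only, not
an existing Bio.KEGG API.  The function names parse_kegg_gene and
to_seqrecord and the SAMPLE record are all made up for illustration,
and the sketch assumes a simplified KEGG gene flat file in which the
field name sits in the first 12 columns and the first NTSEQ line holds
only the sequence length.

```python
# Hypothetical, hand-written record in a simplified KEGG gene layout.
SAMPLE = """\
ENTRY       x0001             CDS       demo
NAME        demoA
DEFINITION  hypothetical demo protein
NTSEQ       9
            atggct
            taa
///
"""


def parse_kegg_gene(text):
    """Pull id, name, description and nucleotide sequence from one record."""
    fields = {}
    key = None
    for line in text.splitlines():
        if line.startswith("///"):  # end-of-record marker
            break
        if not line.strip():
            continue
        # Assumption: the field name occupies the first 12 columns and
        # continuation lines leave those columns blank.
        if line[:12].strip():
            key = line[:12].strip()
        fields.setdefault(key, []).append(line[12:].strip())
    entry = fields["ENTRY"][0].split()[0]
    return {
        "id": entry,
        "name": fields.get("NAME", [entry])[0],
        "description": " ".join(fields.get("DEFINITION", [])),
        # Skip the first NTSEQ line (the length) and join the rest.
        "sequence": "".join(fields.get("NTSEQ", [""])[1:]),
    }


def to_seqrecord(fields):
    """Convert the parsed dict into a SeqRecord (requires Biopython)."""
    from Bio.Seq import Seq
    from Bio.SeqRecord import SeqRecord

    return SeqRecord(
        Seq(fields["sequence"]),
        id=fields["id"],
        name=fields["name"],
        description=fields["description"],
    )
```

Calling to_seqrecord(parse_kegg_gene(SAMPLE)) should give a SeqRecord
with just the basic fields set.  Richer fields such as dblinks or class
are deliberately left out here; following the GenBank parser, they
would presumably end up in places like SeqRecord.dbxrefs and
SeqRecord.annotations, but that mapping is a guess rather than a
settled convention.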

Peter