[Biopython-dev] New Bio.SeqIO code

Peter (BioPython Dev) biopython-dev at maubp.freeserve.co.uk
Fri Oct 27 19:08:21 UTC 2006


Hello list,

I've checked in a somewhat cleaned up (and more tested) version of the
earlier attachments to bug 2059.

And I've updated the wiki page:
http://biopython.org/wiki/SeqIO

Has anyone got any tips on formatting Python code on the wiki?  Maybe I
should just write the docs in LaTeX, like the cookbook etc.

Can I check in bug 2057 too?  Given the SeqIO system produces SeqRecord
objects, it would be a good idea to make them slightly more user-friendly:

http://bugzilla.open-bio.org/show_bug.cgi?id=2057

(I would like to check this in before writing too much of the SeqIO
documentation)

If any of you want to check this out and have a look, I'd be pleased to
get some feedback.

There should be no impact on the rest of Biopython, or on existing scripts.

Peter
-----------------------------------------------------------------

Link to view CVS:

http://cvs.biopython.org/cgi-bin/viewcvs/viewcvs.cgi/biopython/Bio/SeqIO/?cvsroot=biopython

Old files, not touched:
Bio/SeqIO/FASTA.py
Bio/SeqIO/generic.py

Bio/SeqIO/__init__.py (replaces almost empty old file)
======================
* the helper functions (i.e. the functions I expect people to use)
* mappings from file types to parsers and writers
* mappings from file extensions to file types
* large self test suite (which does not need any input files, but will
create a temp file in the current directory)
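
As a rough illustration of the two mappings listed above (this is a
sketch, not the checked-in code): the names EXTENSION_TO_FORMAT,
FORMAT_TO_READER, guess_format and read_records are made up for the
example, and the "reader" is a toy that only returns record titles.

```python
import os
from io import StringIO


def _toy_fasta_titles(handle):
    """Toy reader (hypothetical): yield the title line of each FASTA record."""
    for line in handle:
        if line.startswith(">"):
            yield line[1:].strip()


# Hypothetical tables, for illustration only; not the real Bio.SeqIO contents.
EXTENSION_TO_FORMAT = {
    "fasta": "fasta",
    "fa": "fasta",
    "aln": "clustal",
    "phy": "phylip",
    "nex": "nexus",
}

FORMAT_TO_READER = {
    "fasta": _toy_fasta_titles,
}


def guess_format(filename):
    """Map a filename to a format name using its extension."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    if ext not in EXTENSION_TO_FORMAT:
        raise ValueError("Unknown file extension: %r" % ext)
    return EXTENSION_TO_FORMAT[ext]


def read_records(handle, format_name):
    """Look up the reader registered for a format and apply it to the handle."""
    if format_name not in FORMAT_TO_READER:
        raise ValueError("No reader for format: %r" % format_name)
    return FORMAT_TO_READER[format_name](handle)


handle = StringIO(">a\nACGT\n>b desc\nGG\n")
titles = list(read_records(handle, guess_format("example.FA")))
```

Keeping both lookups in plain dictionaries means supporting a new format
is a matter of adding one entry to each table.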

Bio/SeqIO/Interfaces.py
=======================
Base classes for readers/writers

Bio/SeqIO/FastaIO.py
====================
Uses a generator function for the reader.
Uses a sub-class of SequentialSequenceWriter for the writer.
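
A minimal sketch of what "a generator function for the reader" can look
like for FASTA.  This is not the code in FastaIO.py: the name
fasta_iterator is hypothetical, and it yields plain (title, sequence)
tuples instead of SeqRecord objects so that the example runs on its own.

```python
from io import StringIO


def fasta_iterator(handle):
    """Yield (title, sequence) tuples, one FASTA record at a time."""
    title = None
    seq_lines = []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if title is not None:
                yield title, "".join(seq_lines)
            title = line[1:]
            seq_lines = []
        elif title is not None and line:
            seq_lines.append(line)
    # Emit the final record once the handle is exhausted.
    if title is not None:
        yield title, "".join(seq_lines)


example = StringIO(">id1 first record\nACGT\nTT\n>id2\nGG\n")
records = list(fasta_iterator(example))
```

Because it is a generator, only one record is held in memory at a time,
which matters for large files.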

Bio/SeqIO/ClustalIO.py
======================
Uses a generator function for the reader, based on the old class in
Bio/SeqIO/generic.py

Bio/SeqIO/PhylipIO.py
=====================
Reads and writes PHYLIP files with names strictly truncated at 10
characters.
Uses a generator function for the reader, and subclasses SequenceWriter
for the writer.
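
To illustrate the strict 10 character name rule, here is a small
stand-alone sketch.  It is not the PhylipIO.py code: phylip_name and
write_phylip are hypothetical names, records are plain (name, sequence)
tuples, and the output is a simplified one-line-per-sequence layout.

```python
from io import StringIO


def phylip_name(name, width=10):
    """Truncate a name to `width` characters and pad it with spaces."""
    return name[:width].ljust(width)


def write_phylip(records, handle):
    """Write (name, sequence) tuples in a simplified PHYLIP-style layout."""
    records = list(records)
    if not records:
        raise ValueError("Need at least one sequence")
    length = len(records[0][1])
    if any(len(seq) != length for _, seq in records):
        raise ValueError("All sequences must be the same length")
    # Header line: number of sequences, then number of columns.
    handle.write(" %i %i\n" % (len(records), length))
    for name, seq in records:
        handle.write(phylip_name(name) + seq + "\n")


out = StringIO()
write_phylip([("alpha", "ACGT"), ("a_very_long_name", "ACGA")], out)
text = out.getvalue()
```

One consequence of strict truncation is that two names sharing their
first 10 characters become indistinguishable in the output.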

Bio/SeqIO/StockholmIO.py
========================
Uses subclasses from Interfaces.py

Unlike the earlier code attached to bug 2059, this code contains just one
writer and parser, which expects the Stockholm file to follow the PFAM
conventions. It should read other files fine, but what happens to the
annotation is less well defined.  This is what BioPerl does:

http://bugzilla.open-bio.org/show_bug.cgi?id=2059#c10

Bio/SeqIO/GenBankIO.py
======================
Uses a generator function for the reader, which just calls Bio.GenBank
to do the work.  See also bug 2059 comment 11 for my thoughts on how
to include EMBL support:

http://bugzilla.open-bio.org/show_bug.cgi?id=2059#c11

Bio/SeqIO/NexusIO.py
====================
Uses a generator function for the reader, which just calls Bio.Nexus to
do the parsing and then extracts the sequences.  Has not been tested much.

Peter




