[BioPython] BioSQL.DBSeqRecord._get_keywords() / working with BioSQL

Ramu Chenna chenna@embl-heidelberg.de
Wed, 29 May 2002 17:15:09 +0200


On Wed, 29 May 2002, Brad Chapman wrote:

> Hi Murple;
>
> > What is your opinion, is biopython better than biojava or bioperl?
> > Java would be as fine as Python for me; Perl I don't speak and have no
> > intent to learn.
>
> I think Jason answered this better than I could've. Personally, I think
> biopython is best, but I'm a little biased :-).
>
> > How many people are involved in biopython? What can I do to support?
> > (I have good programming background but lack some understanding of the
> > biology part. Trying to learn.)
>
> We are always looking for people to help with coding. The biggest help
> is always just to delve into the code, find something you think is
> lacking, and add it.
>
> > At the moment I think I don't know enough about bio* to write
> > documentation and also I'm not sure if I will stick with it. Just
> > evaluating if it fits my needs. But I agree that documentation is
> > lacking. Maybe I can help by asking wrong questions so you find out what
> > documentation is missing.
>
> Yup, I've been feeling very sheepish about the docs with everyone
> tearing on 'em these past few days. I wish I had more time to work on
> this, as I actually like writing documentation and think it's very
> important.
>
> I'm going to try to do some BioSQL documentation since there is so much
> interest in it. Hopefully I'll be able to crank some out rather quickly.
>
> > Here is another of these questions: While parsing a GenBank entry, is
> > there any information loss? Would it be possible to "unparse" a
> > SeqRecord back into a flatfile? Is there already code for this in
> > biopython? If not and I want to write such a thing, where are the best
> > places to start?
>
> There is definitely information loss parsing into a SeqRecord object.
> Right now the biopython code is very good at representing the
> information in feature tables of GenBank files, but not so good at
> representing the "meta-information" in these files (i.e. keywords,
> organism stuff, references, comments, etc). We definitely could use some
> general classes to represent this sort of information.
>
> With regards to writing things back out in various formats, this is
> something that Andrew is working on with his BioFormats stuff. The
> rudiments of this are in Biopython right now and are starting to get
> going, but I don't think it's finished. Andrew can probably tell you
> more about the current status, but the ideas are there for a nice
> system.

I have written a couple of modules, including one for writing sequences
in various formats. It currently supports the following formats:


- fmt_gcg
- fmt_genbank
- fmt_fasta
- fmt_staden
- fmt_raw
- fmt_pir
- fmt_asn
- fmt_nbrf
- fmt_embl
- fmt_swiss
- fmt_ig
- fmt_stdpir

You can find it at:

    http://www.embl-heidelberg.de/~chenna/PySAT/PySAT.tar.gz
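
To give a rough idea of what writing a sequence out in one of these
formats involves, here is a minimal sketch in plain Python. It is not the
PySAT code or its API: the names to_fasta, to_raw, WRITERS and
write_sequence are hypothetical and chosen only for illustration, and only
the FASTA and raw cases are covered.

```python
def to_fasta(name, sequence, width=60):
    """Return the sequence as FASTA text: a '>' header line followed by
    the sequence wrapped at `width` characters per line."""
    lines = [">" + name]
    for start in range(0, len(sequence), width):
        lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n"


def to_raw(name, sequence, width=60):
    """Return the bare sequence on one line; name and width are unused
    and kept only so both writers share one call signature."""
    return sequence + "\n"


# Hypothetical lookup table from a format name to its writer function.
WRITERS = {"fasta": to_fasta, "raw": to_raw}


def write_sequence(handle, name, sequence, fmt="fasta"):
    """Write one sequence to an open file-like object in the given format."""
    try:
        writer = WRITERS[fmt]
    except KeyError:
        raise ValueError("unsupported format: %r" % (fmt,))
    handle.write(writer(name, sequence))
```

For example, write_sequence(sys.stdout, "seq1", "ACGTACGT") would print a
FASTA record to the terminal (after importing sys). Keeping one small
function per format and a lookup table means adding a format only requires
one more entry in the table.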

Ramu