[BioSQL-l] Treating GenBank source features as top level annotation

Peter biopython at maubp.freeserve.co.uk
Wed Nov 18 12:27:12 UTC 2009


On Wed, Nov 18, 2009 at 12:08 PM, Richard Holland
<holland at eaglegenomics.com> wrote:
>
> BioJava's latest parsers do the following:
> ...

Without checking all the details, that is broadly what Biopython does
at the moment.

> The main reason why we still use the source feature and don't go to sequence
> level is because when converting between formats it's hard to tell which
> sequence-level qualifier_values are from the source feature and which are
> from other places.

Makes sense.

> The main reason why we rely entirely on the source feature for organism
> and taxon ID info is because it's much easier to parse than the SOURCE
> and ORGANISM tags.

From memory, Biopython also uses the taxon table here.
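
To illustrate why the source feature is the easier place to get organism
and taxon ID from, here is a minimal sketch in plain Python. It is not how
BioJava or Biopython actually do it; the sample record fragment and the
helper names (source_qualifiers, taxon_id) are made up for this example, and
it assumes every qualifier fits on one line in /key="value" form.

```python
import re

# Hypothetical, abbreviated GenBank fragment, used only for illustration.
SAMPLE = """\
FEATURES             Location/Qualifiers
     source          1..5028
                     /organism="Saccharomyces cerevisiae"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:4932"
                     /chromosome="IX"
     CDS             <1..206
                     /product="TCP1-beta"
ORIGIN
"""

# A feature key line has exactly five leading spaces, then the key.
FEATURE_KEY = re.compile(r'^ {5}(\S+) +\S')
# A single-line qualifier of the form /key="value" (assumption: no wrapping).
QUALIFIER = re.compile(r'^\s+/(\w+)="([^"]*)"\s*$')


def source_qualifiers(text):
    """Collect the single-line qualifiers of the first source feature."""
    qualifiers = {}
    in_source = False
    for line in text.splitlines():
        key = FEATURE_KEY.match(line)
        if key:
            if in_source:
                break  # the next feature starts, so the source block is over
            in_source = key.group(1) == "source"
            continue
        if in_source:
            match = QUALIFIER.match(line)
            if match:
                qualifiers.setdefault(match.group(1), []).append(match.group(2))
    return qualifiers


def taxon_id(qualifiers):
    """Return the NCBI taxon ID from a db_xref qualifier, or None."""
    for xref in qualifiers.get("db_xref", []):
        db, _, ident = xref.partition(":")
        if db == "taxon" and ident.isdigit():
            return int(ident)
    return None


quals = source_qualifiers(SAMPLE)
print(quals["organism"][0], taxon_id(quals))
```

The organism name and the taxon ID each come from one self-delimited
qualifier, whereas the SOURCE and ORGANISM header lines are free text
followed by a wrapped lineage.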

Peter


