[BioSQL-l] What should source_term_id in table seqfeature refer to?

Florian Mittag florian.mittag at uni-tuebingen.de
Tue Aug 11 09:09:50 UTC 2009


Hm, I should've mentioned my real concern. We're integrating all kinds of data 
into the database and right now I want to import miRNA information (sequences 
and target sites) from miRBase (http://microrna.sanger.ac.uk/sequences/).
The files I download from there specify "miRanda" as METHOD, so should I use 
this as the source term, or miRBase?

Thanks,
- Florian
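
For illustration only, here is a minimal sketch of how a source term such as
"miRanda" could be registered and referenced from seqfeature.source_term_id.
It assumes a heavily reduced subset of the BioSQL tables (ontology, term,
seqfeature) in an in-memory SQLite database. The ontology names "Feature
sources" and "Feature types", the helper functions, and the sample features
are all hypothetical and are not taken from BioJava, BioPerl, or miRBase.

```python
import sqlite3

# Reduced subset of the BioSQL tables involved. The real schema has more
# columns and constraints; this DDL is an assumption for illustration.
SCHEMA = """
CREATE TABLE ontology (
    ontology_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
CREATE TABLE term (
    term_id     INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    definition  TEXT,
    ontology_id INTEGER NOT NULL REFERENCES ontology(ontology_id),
    UNIQUE (name, ontology_id)
);
CREATE TABLE seqfeature (
    seqfeature_id  INTEGER PRIMARY KEY,
    bioentry_id    INTEGER NOT NULL,
    type_term_id   INTEGER NOT NULL REFERENCES term(term_id),
    source_term_id INTEGER NOT NULL REFERENCES term(term_id),
    display_name   TEXT
);
"""


def get_or_create_term(conn, ontology_name, term_name, definition=None):
    """Return the term_id for (ontology, term), inserting both if missing."""
    conn.execute(
        "INSERT OR IGNORE INTO ontology (name) VALUES (?)", (ontology_name,)
    )
    (ontology_id,) = conn.execute(
        "SELECT ontology_id FROM ontology WHERE name = ?", (ontology_name,)
    ).fetchone()
    conn.execute(
        "INSERT OR IGNORE INTO term (name, definition, ontology_id) "
        "VALUES (?, ?, ?)",
        (term_name, definition, ontology_id),
    )
    (term_id,) = conn.execute(
        "SELECT term_id FROM term WHERE name = ? AND ontology_id = ?",
        (term_name, ontology_id),
    ).fetchone()
    return term_id


def add_feature(conn, bioentry_id, type_name, source_name, display_name):
    """Insert one feature whose source term records the method or origin."""
    # Hypothetical ontology names, chosen only for this sketch.
    type_id = get_or_create_term(conn, "Feature types", type_name)
    source_id = get_or_create_term(conn, "Feature sources", source_name)
    conn.execute(
        "INSERT INTO seqfeature "
        "(bioentry_id, type_term_id, source_term_id, display_name) "
        "VALUES (?, ?, ?, ?)",
        (bioentry_id, type_id, source_id, display_name),
    )


def features_by_source(conn):
    """Count features per source term name."""
    rows = conn.execute(
        "SELECT t.name, COUNT(*) FROM seqfeature sf "
        "JOIN term t ON t.term_id = sf.source_term_id "
        "GROUP BY t.name"
    )
    return dict(rows.fetchall())


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up sample data: one feature loaded from a file, two predicted ones.
add_feature(conn, 1, "CDS", "Genbank", "example gene")
add_feature(conn, 1, "miRNA_target_site", "miRanda", "example target 1")
add_feature(conn, 1, "miRNA_target_site", "miRanda", "example target 2")

print(features_by_source(conn))
```

With the source stored as its own term, features that came from a file can
be told apart from features produced by a prediction method with a single
join on source_term_id. Whether the term should name the method or the
database is the open question of this message; the sketch works either way.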

On Tuesday, 11. August 2009 10:59, Richard Holland wrote:
> The reason BJX does that is because the Genbank format has no
> indication of where a feature came from. So, all there is to go on is
> that it came from Genbank! This allows us to differentiate between
> features on a sequence that were loaded from an original file, and new
> features that have been added to the sequence in the db after it was
> loaded (e.g. by running blast, blat etc. against some local data).
>
> On 11 Aug 2009, at 09:10, Florian Mittag wrote:
> > Hi!
> >
> > I stumbled upon an old post from Hilmar:
> >
> > On Tue, 18 Mar 2003, Hilmar Lapp wrote:
> >> type_term_id is supposed to reference an SO term. source is supposed to
> >> denote the 'method' (BLAST, BLAT, sim4, genewise, whatnot), as far as
> >> my understanding goes. In the case of reading the features from a
> >> GenBank feature table, assigning 'Genbank/EMBL/Swissprot' as the source
> >> (which is what the genbank, embl, and swissprot parsers do in bioperl)
> >> is maybe stretching the definition, but I don't have something
> >> substantially better to offer.
> >
> > I inspected the database after I imported some Genbank files with BioJava,
> > and I found that the source_term_id for the seqfeatures is always set to
> > the ID of an automatically inserted term "Genbank" with definition
> > "auto-generated by biojavax".
> >
> > I was wondering if there is anything new to the source_term_id.
> >
> > - Florian
>
> --
> Richard Holland, BSc MBCS
> Operations and Delivery Director, Eagle Genomics Ltd
> T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
> http://www.eaglegenomics.com/

-- 
Dipl. Inf. Florian Mittag
Universität Tuebingen
WSI-RA, Sand 1
72076 Tuebingen, Germany
Phone: +49 7071 / 29 78985  Fax: +49 7071 / 29 5091



