[Bioperl-l] BioSQL: loading large sequence records, and taxon parsing
Hilmar Lapp
hlapp at gnf.org
Fri Jun 20 14:30:34 EDT 2003
> >
> We will try to make our full BioSQL dumps available soon, let me know
> if you want to have them.
>
That would be very useful. Remember that at the hackathon we said that
at some point we'd like to dump a bioperl-db-generated load, reload it
into a biojava-managed instance, and see how things look then.
Although I guess the biojava folks would want a Postgres dump for that.
-hilmar
> Elia
>
>
> >
> >>
> >> 3. The problem I encountered may be related to how the taxon_name
> >> table is populated by load_seqdatabase.pl (or the modules it calls).
> >> I loaded the database with 2 organelle genomes, the mito and the
> >> chloroplast, using the following two records in that order. Though
> >> both records show up in the bioentry table, it seems only the info
> >> from the first record got populated into the taxon_name table:
> >>
> >>  taxon_id |                name                |   name_class
> >> ----------+------------------------------------+-----------------
> >>         1 | Eukaryota                          | scientific name
> >>         2 | Viridiplantae                      | scientific name
> >> .......... extra lines removed ...................
> >>        13 | Brassicaceae                       | scientific name
> >>        14 | Arabidopsis                        | scientific name
> >>        15 | Mitochondrion                      | scientific name
> >>        16 | Mitochondrion Arabidopsis          | scientific name
> >>        17 | Mitochondrion Arabidopsis thaliana | scientific name
> >>        17 | thale cress                        | common name
> >> (18 rows)
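
A minimal sketch of how one might check which names actually landed in taxon_name, assuming a simplified stand-in table with only the three columns shown in the listing above. It uses an in-memory SQLite database rather than a real BioSQL instance, the helper names build_demo_db and names_with_prefix are made up for illustration, and the "Chloroplast" prefix is only a guess at what the second record's names might have looked like.

```python
import sqlite3

# Rows copied from the taxon_name listing above. The rows that were
# elided in the listing are left out here as well.
ROWS = [
    (1, "Eukaryota", "scientific name"),
    (2, "Viridiplantae", "scientific name"),
    (13, "Brassicaceae", "scientific name"),
    (14, "Arabidopsis", "scientific name"),
    (15, "Mitochondrion", "scientific name"),
    (16, "Mitochondrion Arabidopsis", "scientific name"),
    (17, "Mitochondrion Arabidopsis thaliana", "scientific name"),
    (17, "thale cress", "common name"),
]


def build_demo_db():
    """Create an in-memory stand-in for taxon_name (hypothetical, simplified)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE taxon_name (taxon_id INTEGER, name TEXT, name_class TEXT)"
    )
    conn.executemany("INSERT INTO taxon_name VALUES (?, ?, ?)", ROWS)
    return conn


def names_with_prefix(conn, prefix):
    """Return all taxon_name rows whose name starts with the given prefix."""
    cur = conn.execute(
        "SELECT taxon_id, name, name_class FROM taxon_name "
        "WHERE name LIKE ? ORDER BY taxon_id, name",
        (prefix + "%",),
    )
    return cur.fetchall()


conn = build_demo_db()
# Names derived from the first (mitochondrial) record.
print(names_with_prefix(conn, "Mitochondrion"))
# Assumed prefix for the second record; nothing comes back for it here.
print(names_with_prefix(conn, "Chloroplast"))
```

Against a real instance the same SELECT could be run directly in psql or mysql; the Python wrapper is only there to make the example self-contained.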
> >
> > To be honest, I do not care about it, as long as you can fetch the
> > result out correctly. I have actually met such a case before. One way
> > to solve it is to run load_ncbi_taxonomy before you load your
> > sequences. (That may be unnecessary in your case.)
> >
> > A user-to-user talk. :-)
> >
> > Juguang
> >
> > ------------ATGCCGAGCTTNNNNCT--------------
> > Juguang Xiao
> > Temasek Life Sciences Laboratory, National University of Singapore
> > 1 Research Link, Singapore 117604 juguang at tll.org.sg
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
> ---
> Bioinformatics Program Manager
> Temasek Life Sciences Laboratory
> 1, Research Link
> Singapore 117604
> Tel. +65 6874 4945
> Fax. +65 6872 7007
>