[Open-bio-l] eyeballs needed -- my biosql install diary

KATAYAMA Toshiaki katayama@kuicr.kyoto-u.ac.jp
Sun, 19 May 2002 01:30:18 +0900


Hi,

Is there any size limitation on biosequence?

I tried to load the recent RefSeq into BioSQL and got the following errors:


-----X8-----X8-----
Reading ../refseq/rscu.gbff
(..snip..)
DBD::mysql::st execute failed: MySQL server has gone away at /usr/local/lib/perl5/site_perl/5.6.1/Bio/DB/SQL/PrimarySeqAdaptor.pm line 130, <GEN0> line 1662456.
DBD::mysql::st execute failed: MySQL server has gone away at /usr/local/lib/perl5/site_perl/5.6.1/Bio/DB/SQL/PrimarySeqAdaptor.pm line 130, <GEN0> line 1662456.
-----X8-----X8-----

Line 1662456 of my RefSeq file falls in the entry NC_000918, the
A. aeolicus genome, which is 1551335 bp long.  I have already loaded
part of GenBank (gbvrl*) and Swissprot (thanks to Chris's doc :-) onto
my BioSQL server, but the longest sequence in biosql at that time
was only around 368k bp.


I also want to know how BioSQL stores a sequence entry larger than 16MB
(e.g. an Arabidopsis chromosome in RefSeq) in the biosequence table
with MySQL's mediumtext (L < 2^24).
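
To make the size question concrete, here is a minimal sketch of the
capacity check.  It assumes one byte per base (a single-byte character
set); smallest_text_type is a hypothetical helper, not part of BioSQL,
and the 30,000,000 bp figure is only an illustrative length above 16MB.

```python
# Maximum byte lengths of MySQL's TEXT column types (each is 2^n - 1).
TEXT_LIMITS = [
    ("TINYTEXT", 2**8 - 1),
    ("TEXT", 2**16 - 1),
    ("MEDIUMTEXT", 2**24 - 1),
    ("LONGTEXT", 2**32 - 1),
]


def smallest_text_type(seq_length):
    """Return the smallest TEXT type that holds seq_length bytes, or None.

    Assumes one byte per residue (hypothetical helper for illustration).
    """
    for name, limit in TEXT_LIMITS:
        if seq_length <= limit:
            return name
    return None


if __name__ == "__main__":
    # NC_000918 (1551335 bp) is still within mediumtext's limit.
    print(smallest_text_type(1551335))
    # An illustrative sequence longer than 16MB needs longtext.
    print(smallest_text_type(30_000_000))
```

By this arithmetic the column type alone would not explain a failure
at 1551335 bp, since that length is well under 2^24.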

My silly approach, outside of BioSQL, to storing GenBank/RefSeq in MySQL was
  http://bioruby.org/cgi-bin/cvs/reviz/bioruby/sample/
gb2tab.rb and gbtab2mysql.rb (used for http://gb.bioruby.org/).
In that case, I split each long sequence into numbered pieces.
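
The idea of splitting a long sequence into numbered pieces can be
sketched roughly as below.  This is only an illustration in Python,
not the actual logic of gb2tab.rb; the function names and the chunk
size are hypothetical.

```python
def split_sequence(seq, chunk_size):
    """Split seq into (piece_number, piece) tuples, numbered from 1."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start // chunk_size + 1, seq[start:start + chunk_size])
        for start in range(0, len(seq), chunk_size)
    ]


def join_pieces(pieces):
    """Rebuild the original sequence by ordering pieces by their number."""
    ordered = sorted(pieces, key=lambda piece: piece[0])
    return "".join(chunk for _, chunk in ordered)


if __name__ == "__main__":
    pieces = split_sequence("ACGTACGTAC", 4)
    for number, chunk in pieces:
        # Each tuple could become one row: (entry id, number, chunk).
        print(number, chunk)
    print(join_pieces(pieces))
```

With a chunk size below the column or packet limit, each piece fits
in one row, and the piece number is enough to restore the order.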

Furthermore, MySQL's longtext seems long enough, but it didn't
work well when I tried it. (I forget the details, but a packet size
limitation error or something similar occurred. A problem with my configuration?)
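
If the packet size is the cause, the check would look something like
the sketch below.  max_allowed_packet is a real MySQL server variable,
and a statement larger than it is one known cause of "MySQL server has
gone away".  The 1MB value and the overhead allowance are assumptions
for illustration, not values read from my server.

```python
ONE_MB = 1024 * 1024


def fits_in_packet(seq_length, max_allowed_packet=ONE_MB, overhead=1024):
    """Return True if a statement carrying the sequence fits in one packet.

    max_allowed_packet defaults to an assumed 1MB setting; overhead is a
    hypothetical allowance for the rest of the SQL statement.
    """
    return seq_length + overhead <= max_allowed_packet


if __name__ == "__main__":
    # About 368k bp: the longest sequence that loaded without trouble.
    print(fits_in_packet(368_000))
    # NC_000918 at 1551335 bp: the entry where the load failed.
    print(fits_in_packet(1551335))
    # The same entry with an assumed 16MB packet limit.
    print(fits_in_packet(1551335, max_allowed_packet=16 * ONE_MB))
```

Under these assumptions the 368k bp entry fits and the 1551335 bp
entry does not, which would match what I saw, but I have not
confirmed the actual setting.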


At Wed, 15 May 2002 19:11:08 -0400,
Chris Dagdigian wrote:
> http://bioteam.net/dag/BioTeam-HOWTO-1-BIOSQL.html

Your document seems very useful.  From this doc:

>> Step 10 - What next?
>>
>> Figure out how to export/dump the database and see how quickly we
>> can recreate the database with these raw files instead of
>> laboriously using BioPerl to parse and load objects one at a
>> time. Loading the database is slow and it may be cool to package up
>> tab-delimited biosql exports so that others can load their own
>> databases much faster.

Cool.  It would be nice if there were a repository of files in this format.


Regards,
Toshiaki Katayama
--
Kanehisa laboratory (Bioinformatics Center)
Institute for Chemical Research, Kyoto Univ.
Gokasho, Uji, Kyoto 611-0011, Japan
TEL +81 774 38 3272, FAX +81 774 38 3269
http://web.kuicr.kyoto-u.ac.jp/~katayama/
http://bioruby.org/ (k@bioruby.org)