[Open-bio-l] eyeballs needed -- my biosql install diary

KATAYAMA Toshiaki katayama@kuicr.kyoto-u.ac.jp
Thu, 30 May 2002 19:18:49 +0900


Hello,

Chris, thank you for your information!

I have changed the max_allowed_packet (and sort_buffer, record_buffer) sizes
to 16MB, and almost all RefSeq entries could then be loaded successfully
into BioSQL, except for entries containing a sequence longer than 16MB
(e.g. NC_003282 - C. elegans chromosome IV, 17484798 bp).

I then increased these parameters up to 20MB, but this entry still
could not be loaded.
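
A quick size check of the numbers involved. This is only arithmetic, assuming
1 byte per base and the MySQL column limits as I understand them (mediumtext
up to 2^24 - 1 bytes, longtext up to 2^32 - 1 bytes). The names are made up
for illustration, and the extra bytes of the surrounding SQL statement, which
also count against max_allowed_packet, are ignored.

```python
# Rough size check for NC_003282, assuming 1 byte per base.
SEQ_LEN = 17484798             # C. elegans chromosome IV, in bp

MB = 1024 * 1024
MEDIUMTEXT_MAX = 2**24 - 1     # mediumtext column limit in bytes
LONGTEXT_MAX = 2**32 - 1       # longtext column limit in bytes


def fits(length, limit):
    """Return True if a value of `length` bytes is within `limit` bytes."""
    return length <= limit


checks = {
    "16MB packet": fits(SEQ_LEN, 16 * MB),
    "20MB packet": fits(SEQ_LEN, 20 * MB),
    "mediumtext column": fits(SEQ_LEN, MEDIUMTEXT_MAX),
    "longtext column": fits(SEQ_LEN, LONGTEXT_MAX),
}

for name, ok in checks.items():
    print(f"{name}: {'fits' if ok else 'too small'}")
```

By this arithmetic a 20MB packet is large enough for the sequence itself,
so the packet size may not be the only limit involved.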

-----X8-----X8-----
% perl ./load_seqdatabase.pl -host localhost -sqldb biosql -dbuser root -format genbank rs NC_003282
Reading NC_003282
DBD::mysql::st execute failed: MySQL server has gone away at /usr/local/lib/perl5/site_perl/5.6.1/Bio/DB/SQL/PrimarySeqAdaptor.pm line 130, <GEN0> line 322491.
DBD::mysql::st execute failed: MySQL server has gone away at /usr/local/lib/perl5/site_perl/5.6.1/Bio/DB/SQL/PrimarySeqAdaptor.pm line 130, <GEN0> line 322491.
-----X8-----X8-----

I also tried changing biosequence_str from mediumtext to longtext,
but the same error still occurred.  Hmm..
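
As a fallback, splitting a long sequence into numbered pieces (the approach
mentioned in my earlier mail quoted below) would keep every row well under
the packet limit. A minimal sketch in Python, not what gb2tab.rb actually
does; the function names and the default 1MB piece size are assumptions for
illustration only.

```python
def split_sequence(seq, piece_size=1024 * 1024):
    """Yield (piece_number, piece) pairs, numbering pieces from 1."""
    if piece_size <= 0:
        raise ValueError("piece_size must be positive")
    starts = range(0, len(seq), piece_size)
    for number, start in enumerate(starts, start=1):
        yield number, seq[start:start + piece_size]


def join_pieces(pieces):
    """Rebuild the sequence from (piece_number, piece) pairs in any order."""
    return "".join(piece for _, piece in sorted(pieces))


# Tiny example with a piece size of 4 so the result is easy to read.
pieces = list(split_sequence("acgtacgtac", piece_size=4))
print(pieces)
```

Each piece would then go into its own row, keyed by the entry and the piece
number, and be joined again in piece order when the sequence is read back.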

Regards,
Toshiaki Katayama
--
Kanehisa laboratory (Bioinformatics Center)
Institute for Chemical Research, Kyoto Univ.
Gokasho, Uji, Kyoto 611-0011, Japan
TEL +81 774 38 3272, FAX +81 774 38 3269
http://web.kuicr.kyoto-u.ac.jp/~katayama/
http://bioruby.org/ (k@bioruby.org)


At Wed, 29 May 2002 16:45:04 -0400,
Chris Dagdigian wrote:
> 
> 
> Hello,
> 
> Just in case you have not solved this yet: it seems that you may need to
> raise the MySQL configuration value "max_allowed_packet" to a fairly
> large number in order to handle very large sequence objects.
> 
> Keith Allen reported this on bioperl-l; the specific message is online 
> at http://bioperl.org/pipermail/bioperl-l/2002-May/007987.html
> 
> Hope this helps!
> 
> I've incorporated several people's comments into my BioSQL diary and 
> will be putting an updated version online shortly.
> 
> Regards,
> Chris
> 
> 
> KATAYAMA Toshiaki wrote:
> > Hi,
> > 
> > Is there any size limitation on biosequence?
> > 
> > I tried to load recent RefSeq into BioSQL, and I got the following errors:
> > 
> > 
> > -----X8-----X8-----
> > Reading ../refseq/rscu.gbff
> > (..snip..)
> > DBD::mysql::st execute failed: MySQL server has gone away at /usr/local/lib/perl5/site_perl/5.6.1/Bio/DB/SQL/PrimarySeqAdaptor.pm line 130, <GEN0> line 1662456.
> > DBD::mysql::st execute failed: MySQL server has gone away at /usr/local/lib/perl5/site_perl/5.6.1/Bio/DB/SQL/PrimarySeqAdaptor.pm line 130, <GEN0> line 1662456.
> > -----X8-----X8-----
> > 
> > Line 1662456 of my RefSeq file was in the entry NC_000918, which is the
> > A. aeolicus genome with a length of 1551335 bp.  I have already loaded
> > part of GenBank (gbvrl*) and Swissprot (thanks to Chris's doc :-) on
> > my BioSQL server; however, the maximum sequence length in biosql at
> > that time was around 368k bp.
> > 
> > 
> > I also want to know how BioSQL stores a sequence entry over 16MB
> > (e.g. an Arabidopsis chromosome in RefSeq) in the biosequence table
> > with MySQL's mediumtext (L < 2^24).
> > 
> > My silly approach, other than BioSQL, to storing GenBank/RefSeq in MySQL was
> >   http://bioruby.org/cgi-bin/cvs/reviz/bioruby/sample/
> > gb2tab.rb and gbtab2mysql.rb (used for http://gb.bioruby.org/);
> > in this case, I split long sequences into numbered pieces.
> > 
> > Furthermore, MySQL's longtext seems long enough; however, it didn't
> > work well when I tried. (I forgot the details, but a packet size limitation
> > error or something similar occurred. Is it my configuration problem?)
> > 
> > 
> > At Wed, 15 May 2002 19:11:08 -0400,
> > Chris Dagdigian wrote:
> > 
> >>http://bioteam.net/dag/BioTeam-HOWTO-1-BIOSQL.html
> > 
> > 
> > Your document seems very useful.  From this doc:
> > 
> > 
> >>>Step 10 - What next?
> >>>
> >>>Figure out how to export/dump the database and see how quickly we
> >>>can recreate the database with these raw files instead of
> >>>laboriously using BioPerl to parse and load objects one at a
> >>>time. Loading the database is slow and it may be cool to package up
> >>>tab-delimited biosql exports so that others can load their own
> >>>databases much faster.
> >>
> > 
> > Cool. It would be nice if there were a repository of files in this format.
> > 
> > 
> > Regards,
> > Toshiaki Katayama
> > --
> > Kanehisa laboratory (Bioinformatics Center)
> > Institute for Chemical Research, Kyoto Univ.
> > Gokasho, Uji, Kyoto 611-0011, Japan
> > TEL +81 774 38 3272, FAX +81 774 38 3269
> > http://web.kuicr.kyoto-u.ac.jp/~katayama/
> > http://bioruby.org/ (k@bioruby.org)
> > _______________________________________________
> > Open-Bio-l mailing list
> > Open-Bio-l@open-bio.org
> > http://open-bio.org/mailman/listinfo/open-bio-l
> 
> 
> -- 
> Chris Dagdigian, <dag@sonsorol.org>
> Life Science IT & Research Computing Freelancer
> Office: 617-666-6454, Mobile: 617-877-5498, Fax: 425-699-0193
> Yahoo IM: craffi
>