[Biojava-l] [Biojava-dev] [Fwd: large genbank data]

Rey Vincent Babilonia rvincent at asti.dost.gov.ph
Fri Jul 18 01:59:47 UTC 2008


Hi Mark,

At first it threw an out-of-memory error. My workaround was to split
the sequence file into individual GenBank files, one record per file.
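
For anyone who wants to try the same workaround, here is a minimal sketch of the splitting step, not the exact code I ran. It assumes the file is already uncompressed, that every record starts with a LOCUS line and ends with a "//" line, and that anything before the first LOCUS line is release header text to skip. The class name GenBankSplitter and the record_NNNNNN.gb output names are made up for illustration. It uses plain Java I/O only (no BioJava calls) and streams line by line, so the whole file is never held in memory.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class GenBankSplitter {

    /**
     * Splits a multi-record GenBank flat file into one file per record.
     * Assumes records begin with "LOCUS" and end with a "//" line.
     * Returns the number of records written.
     */
    public static int split(Path input, Path outDir) throws IOException {
        Files.createDirectories(outDir);
        int count = 0;
        BufferedWriter out = null;
        // ISO-8859-1 maps every byte to a character, so odd bytes cannot abort the read.
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.ISO_8859_1)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (out == null) {
                    // Between records: skip header or blank lines until the next LOCUS line.
                    if (!line.startsWith("LOCUS")) {
                        continue;
                    }
                    count++;
                    Path target = outDir.resolve(String.format("record_%06d.gb", count));
                    out = Files.newBufferedWriter(target, StandardCharsets.ISO_8859_1);
                }
                out.write(line);
                out.newLine();
                if (line.startsWith("//")) {
                    // Record terminator: close this file and wait for the next LOCUS line.
                    out.close();
                    out = null;
                }
            }
        } finally {
            // Close a trailing record that was missing its "//" terminator.
            if (out != null) {
                out.close();
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java GenBankSplitter <input.seq> <output-dir>");
            return;
        }
        int n = split(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(n + " records written");
    }
}
```

Each output file can then be parsed and loaded on its own, which keeps memory use bounded by the largest single record.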

The remaining problem is that a GenBank sequence with an 'empty
alphabet' does not get loaded into BioSQL. My workaround is to check
whether sequence.getAlphabet().getName() returns DNA.

Thanks.

Mark Schreiber wrote:
> Hi -
> 
> Is the code throwing an exception or running out of memory?
> 
> Can you send an example program and the problem you encounter to the list?
> - Mark
> 
> On Thu, May 29, 2008 at 9:53 AM, Rey Vincent Babilonia
> <rvincent at asti.dost.gov.ph> wrote:
>>
>> -------- Original Message --------
>> Subject: large genbank data
>> Date: Wed, 28 May 2008 18:02:48 +0800
>> From: Rey Vincent Babilonia <rvincent at asti.dost.gov.ph>
>> To: biojava-l at biojava.org
>>
>> Hi,
>>
>> Has anybody tried uploading a large GenBank data file (e.g.
>> ftp://bio-mirror.net/biomirror/genbank/gbbct1.seq.gz) to BioSQL?
>> BioPerl's load_seqdatabase.pl can do this. I'm switching to BioJava, and
>> it can't read the sequence file (maybe because it has 30000+ sequences).
>>
>> Thanks.
>>
>> --
>> /**
>>  * @author   Rey Vincent P. Babilonia
>>  * @number   +63 2 426 9760 local 1302
>>  * @pgp      0x383454CF <at> pgp.mit.edu
>>  * @project  Philippine Bioinformatics Solutions
>>  * @program  Philippine e-Science Grid
>>  * @division Research and Development Division
>>  * @agency   Advanced Science and Technology Institute
>>  * @url      http://www.psigrid.gov.ph
>>  */
>>
>>
> 

-- 
/**
  * @author   Rey Vincent P. Babilonia
  * @number   +63 2 426 9760 local 1302
  * @pgp      0x383454CF <at> pgp.mit.edu
  * @project  Philippine Bioinformatics Solutions
  * @program  Philippine e-Science Grid
  * @division Research and Development Division
  * @agency   Advanced Science and Technology Institute
  * @url      http://www.psigrid.gov.ph
  */



