[Biojava-dev] [Fwd: large genbank data]
Rey Vincent Babilonia
rvincent at asti.dost.gov.ph
Fri Jul 18 01:59:47 UTC 2008
Hi Mark,
At first it threw an out-of-memory exception. My workaround was to
split the sequence file into individual GenBank files.
The remaining problem is that a GenBank sequence with an 'empty alphabet'
does not get loaded into BioSQL. My workaround is to check whether
sequence.getAlphabet().getName() is DNA.
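
For anyone hitting the same memory problem, here is a minimal sketch of the
splitting step using only the Java standard library. It is not the code I ran
against BioSQL. It assumes the file has already been decompressed and that
every record starts with a LOCUS line and ends with a line beginning with
"//", as in the GenBank flat-file format. The class name GenBankSplitter, the
method split and the record_NNNNNN.gb file names are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

// Hypothetical helper: splits a multi-record GenBank flat file into one file per record.
class GenBankSplitter {

    /**
     * Reads the input line by line, so only one line is held in memory at a time,
     * and writes each record to its own file in outDir.
     * Returns the number of records started (LOCUS lines seen).
     */
    static int split(Path input, Path outDir) throws IOException {
        Files.createDirectories(outDir);
        int count = 0;
        BufferedWriter out = null;
        // ISO-8859-1 maps every byte to a character, so stray bytes cannot abort the read.
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.ISO_8859_1)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (out == null) {
                    // Between records: skip header or blank lines until the next LOCUS line.
                    if (!line.startsWith("LOCUS")) {
                        continue;
                    }
                    count++;
                    String name = String.format(Locale.ROOT, "record_%06d.gb", count);
                    out = Files.newBufferedWriter(outDir.resolve(name), StandardCharsets.ISO_8859_1);
                }
                out.write(line);
                out.newLine();
                if (line.startsWith("//")) {
                    // "//" terminates the current record.
                    out.close();
                    out = null;
                }
            }
        } finally {
            // Close a trailing record that was missing its "//" terminator.
            if (out != null) {
                out.close();
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java GenBankSplitter <input.seq> <output-dir>");
            return;
        }
        int n = split(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println(n + " records written");
    }
}
```

Each resulting file can then be read and loaded on its own, which keeps the
memory needed per load small.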
Thanks.
Mark Schreiber wrote:
> Hi -
>
> Is the code throwing an exception or running out of memory?
>
> Can you send an example program and the problem you encounter to the list?
> - Mark
>
> On Thu, May 29, 2008 at 9:53 AM, Rey Vincent Babilonia
> <rvincent at asti.dost.gov.ph> wrote:
>>
>> -------- Original Message --------
>> Subject: large genbank data
>> Date: Wed, 28 May 2008 18:02:48 +0800
>> From: Rey Vincent Babilonia <rvincent at asti.dost.gov.ph>
>> To: biojava-l at biojava.org
>>
>> Hi,
>>
>> Has anybody tried uploading a large GenBank data file (e.g.
>> ftp://bio-mirror.net/biomirror/genbank/gbbct1.seq.gz) to BioSQL?
>> load_seqdatabase.pl from BioPerl can do this. I'm switching to BioJava, and
>> it can't read the sequence file (maybe because it has 30000+ sequences).
>>
>> Thanks.
>>
>> --
>> /**
>> * @author Rey Vincent P. Babilonia
>> * @number +63 2 426 9760 local 1302
>> * @pgp 0x383454CF <at> pgp.mit.edu
>> * @project Philippine Bioinformatics Solutions
>> * @program Philippine e-Science Grid
>> * @division Research and Development Division
>> * @agency Advanced Science and Technology Institute
>> * @url http://www.psigrid.gov.ph
>> */
>>
>>
>>
>