[Bioperl-l] RE: [Biojava-l] Writing genbank files
Dickson, Mike
mdickson@netgenics.com
Fri, 18 Jan 2002 18:55:35 -0500
The locus line format did change recently. See the NCBI site for details. I
thought a patch made it into the BioJava code for this. Which version of
BioJava are you using?
Mike
> -----Original Message-----
> From: David Waring [mailto:dwaring@u.washington.edu]
> Sent: Friday, January 18, 2002 5:54 PM
> To: Bioperl; biojava
> Subject: [Biojava-l] Writing genbank files
>
>
> I have come across a problem with genbank files using the perl module
> Bio::DB::GenBank. When I get the genbank sequence from NCBI and write the
> sequence out in genbank format, the Locus line is missing the date.
>
> LOCUS AC104722 24949 bp DNA linear BCT
>
> instead of
>
> LOCUS AC104722 24949 bp DNA linear BCT 21-DEC-2001
>
> which is what I get when I download the file myself. I don't know if this
> represents a problem in reading the file or in writing the file.
>
> Why am I cross-posting this to biojava? Well, the biojava parser dies on
> such a file with a message that says that the Locus line is too short.
>
> Is the date a required element in the Locus line? Is there consensus on what
> constitutes correct format? Has it changed recently?
>
> David
>
>
>
> I also noticed that the biojava parser is very picky about the number of
> spaces; delete a few spaces between DNA and linear and it dies too.
>
> Exception in thread "main" org.biojava.bio.seq.io.ParseException: LOCUS line too short [LOCUS AC104719 17453 bp DNA linear BCT 21-DEC-2001]
>     at org.biojava.bio.seq.io.GenbankContext.parseLocusLinePost127(GenbankFormat.java, Compiled Code)
>     at org.biojava.bio.seq.io.GenbankContext.processHeaderLine(GenbankFormat.java, Compiled Code)
>     at org.biojava.bio.seq.io.GenbankContext.processLine(GenbankFormat.java, Compiled Code)
>     at org.biojava.bio.seq.io.GenbankFormat.readSequence(GenbankFormat.java, Compiled Code)
> rethrown as org.biojava.bio.BioException: Could not read sequence
>     at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java, Compiled Code)
>
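
For anyone hitting the same two failures described above (a missing date, and sensitivity to the number of spaces), here is a minimal sketch of a LOCUS line reader that splits on runs of whitespace and treats the date as optional. This is not BioJava's actual parser: the class name LocusLineSketch, the nested LocusLine holder and the regular expression are all hypothetical, and the field order (name, length, "bp", molecule type, topology, division, date) is assumed from the example lines quoted in this thread rather than taken from the NCBI specification.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LocusLineSketch {

    /** Hypothetical holder for the fields seen in the example LOCUS lines. */
    static final class LocusLine {
        final String name;
        final int length;
        final String moleculeType;
        final String topology;
        final String division;
        final Optional<String> date; // empty when the line carries no date

        LocusLine(String name, int length, String moleculeType,
                  String topology, String division, Optional<String> date) {
            this.name = name;
            this.length = length;
            this.moleculeType = moleculeType;
            this.topology = topology;
            this.division = division;
            this.date = date;
        }
    }

    // Assumed layout: LOCUS <name> <length> bp <type> <topology> <division> [<date>]
    // Fields are separated by one or more whitespace characters, so the
    // exact column positions do not matter. The trailing date is optional.
    private static final Pattern LOCUS = Pattern.compile(
        "^LOCUS\\s+(\\S+)\\s+(\\d+)\\s+bp\\s+(\\S+)\\s+(linear|circular)"
        + "\\s+(\\S+)(?:\\s+(\\d{2}-[A-Z]{3}-\\d{4}))?\\s*$");

    /** Parses one LOCUS line, or throws if it does not fit the assumed layout. */
    static LocusLine parse(String line) {
        Matcher m = LOCUS.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognised LOCUS line: " + line);
        }
        return new LocusLine(
            m.group(1),
            Integer.parseInt(m.group(2)),
            m.group(3),
            m.group(4),
            m.group(5),
            Optional.ofNullable(m.group(6))); // group 6 is null when no date is present
    }

    public static void main(String[] args) {
        LocusLine withDate =
            parse("LOCUS       AC104722   24949 bp    DNA     linear   BCT 21-DEC-2001");
        LocusLine withoutDate = parse("LOCUS AC104722 24949 bp DNA linear BCT");
        System.out.println(withDate.date.orElse("(no date)"));
        System.out.println(withoutDate.date.orElse("(no date)"));
    }
}
```

Matching on whitespace-separated tokens trades strictness for tolerance: it accepts lines that a fixed-column reader would reject, but it also will not notice a file whose columns are misaligned, so whether it is appropriate depends on how strictly the output needs to follow the published format.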