[Biojava-l] Impossible to catch Location parsing Error

Cox, Greg gcox@netgenics.com
Tue, 24 Sep 2002 13:07:37 -0400


For cases 2-4, the version of each entry I looked at didn't contain the
features that couldn't be parsed.

With 1, the problem is that it's a circular sequence, and 134545^1 wraps
around.  At the boot camp two years ago, I think we decided the right thing
to do is check the flag, and promote it to 134545^(sequencelength + 1).  My
memory is a little hazy though, could someone check me?
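
To make the proposed rule concrete, here is a minimal Java sketch.  It is
an illustration only, not BioJava code: the class CircularBetween, its
nested Between type and the promote method are hypothetical names, and the
rule itself is the one recalled above, which still needs confirming.

```java
/** Hypothetical helper (not part of BioJava) sketching the promotion rule. */
final class CircularBetween {

    /** A between-position "start^end" in 1-based coordinates. */
    static final class Between {
        final int start;
        final int end;

        Between(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return start + "^" + end;
        }
    }

    /**
     * Returns the location unchanged when end >= start.  When the location
     * wraps past the origin (end < start), the end is promoted by one
     * sequence length, but only if the circular flag is set.
     */
    static Between promote(int start, int end, int sequenceLength, boolean circular) {
        if (end >= start) {
            return new Between(start, end);
        }
        if (!circular) {
            // Backwards coordinates on a linear sequence are not repaired here.
            throw new IllegalArgumentException(
                "start > end on a non-circular sequence: " + start + "^" + end);
        }
        return new Between(start, end + sequenceLength);
    }
}
```

Assuming, for the sake of the example, a circular sequence of length
134545, promote(134545, 1, 134545, true) gives 134545^134546, that is,
134545^(sequencelength + 1).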

The reason you aren't seeing an error is that this isn't one.  Kind of.  A
biojava sequence is returned from case 1, and it includes all the features
except the one it couldn't handle.  An error isn't thrown because that would
halt execution, and the parser can function to some degree.  The philosophy
behind this choice was that even though biojava can't handle the entire
sequence, you might be able to work with the part it could parse.  If it's
important to you to get all or nothing, I believe registering a listener
with the parser that throws an exception on any parse error will prevent
the sequence from being created.
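
The listener idea can be sketched roughly as follows.  This is a toy model
of the pattern, not the BioJava API: BadLineListener, ToyFeatureParser and
their methods are hypothetical stand-ins, and the parser only understands
simple "key start..end" lines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical callback, notified once for every line that cannot be parsed. */
interface BadLineListener {
    void badLine(String line);
}

/** Hypothetical toy parser: keeps good features and reports bad lines. */
final class ToyFeatureParser {
    private static final Pattern RANGE = Pattern.compile("(\\d+)\\.\\.(\\d+)");
    private final List<BadLineListener> listeners = new ArrayList<>();

    void addListener(BadLineListener listener) {
        listeners.add(listener);
    }

    /**
     * Returns the lines that parsed.  Unparseable lines are skipped after
     * every listener has been notified; a listener that throws aborts the
     * whole parse, so no partial result is returned.
     */
    List<String> parse(List<String> lines) {
        List<String> features = new ArrayList<>();
        for (String line : lines) {
            if (isParseable(line)) {
                features.add(line);
            } else {
                for (BadLineListener listener : listeners) {
                    listener.badLine(line);
                }
            }
        }
        return features;
    }

    /** Accepts "key start..end" only when start <= end. */
    static boolean isParseable(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 2) {
            return false;
        }
        Matcher m = RANGE.matcher(parts[1]);
        return m.matches()
            && Integer.parseInt(m.group(1)) <= Integer.parseInt(m.group(2));
    }
}
```

With no listener registered, parse() quietly drops a line such as
"misc_feature    92619..88345" and returns the rest.  Registering
line -> { throw new IllegalStateException(line); } turns the same input
into an exception the caller can catch, which gives all-or-nothing
behaviour.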

Greg

-----Original Message-----
From: Hiroyuki Hashimoto [mailto:hirohash@genes.nig.ac.jp]
Sent: Monday, September 23, 2002 8:46 PM
To: biojava-l@biojava.org
Subject: Re: [Biojava-l] Impossible to catch Location parsing Error


Hi, Matthew.

Thank you for your reply.
I am sorry that my reply is late.

These are the entries I checked:

(1) AB042240    misc_feature    134545^1
(2) AB046436    misc_feature    92619..88345
(3) AB033993    misc_feature    2636..2382
(4) AB047280    repeat_unit     11648..1676

I also retrieved them in GenBank format; there the features have been
deleted, except in (1).
If you want the DDBJ format, please use the URL below:
http://getentry.ddbj.nig.ac.jp/getstart-e.html

You will probably find that the DDBJ format is almost the same as GenBank,
except for the GI number.

Regards, 

Hiroyuki

----- Original Message ----- 
From: "Matthew Pocock" <matthew_pocock@yahoo.co.uk>
To: "Hiroyuki Hashimoto" <hirohash@genes.nig.ac.jp>
Cc: <biojava-l@biojava.org>
Sent: Friday, September 20, 2002 8:17 PM
Subject: Re: [Biojava-l] Impossible to catch Location parsing Error


> Hi Hiroyuki,
> 
> The genbank parser was not designed with DDBJ in mind, so I am not
> surprised that it fails. I am surprised that you have features with
> backwards coordinates - I thought that genbank, embl and DDBJ had agreed
> to use the exact same feature table format, but I may be wrong here. Do
> you have the accession number for an entry that fails?
> 
> The short answer is to grab the corresponding embl or genbank entry. The
> better solution is to double-check that you have a legal entry, and if
> you do, we can make a copy of the genbank parser called DDBJFormat and
> add in the extra code necessary to handle your backwards features.
> 
> Best,
> 
> Matthew
> 
> Hiroyuki Hashimoto wrote:
> > Hello, everyone.
> > 
> > This is my first post to the BioJava mailing list.
> > 
> > I'm trying to parse DDBJ Format data, which is similar to
> > GenBank-Format.
> > 
> > I have run into a problem.
> > The DDBJ data contain (illegal?) feature locations in which the start
> > position is greater than the end position, for example "92619..88345".
> > BioJava's parser prints this message to stderr:
> > "This line could not be parsed: misc_feature    92619..88345"
> > but it does not throw an exception, so it is impossible to catch the
> > error, and the feature is ignored after that.
> > 
> > Is there a way to catch these errors, or is this a bug?
> > 
> > 
> > _______________________________________________
> > Biojava-l mailing list  -  Biojava-l@biojava.org
> > http://biojava.org/mailman/listinfo/biojava-l
> > 
> 
> 
> -- 
> BioJava Consulting LTD - Support and training for BioJava
> http://www.biojava.co.uk
> 