[Biojava-l] Re:Biojava-l digest, Vol 1 #132 - 2 msgs

Aaron Kitzmiller AKitzmiller@genetics.com
Sat, 19 Aug 2000 13:20:25 -0400


I'll be away from Cambridge until the 23rd of August.  You can reach me by voice mail at 617-665-6831

ajk

>>> "biojava-l@biojava.org" 08/19/00 12:00 >>>

Send Biojava-l mailing list submissions to
	biojava-l@biojava.org

To subscribe or unsubscribe via the World Wide Web, visit
	http://biojava.org/mailman/listinfo/biojava-l
or, via email, send a message with subject or body 'help' to
	biojava-l-request@biojava.org

You can reach the person managing the list at
	biojava-l-admin@biojava.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Biojava-l digest..."


Today's Topics:

  1. Re:Biojava-l digest, Vol 1 #131 - 5 msgs (Aaron Kitzmiller)
  2. Re: build error w/ JBuilder (Ann Loraine)

--__--__--

Message: 1
Date: Fri, 18 Aug 2000 13:25:45 -0400
From: "Aaron Kitzmiller" <AKitzmiller@genetics.com>
Reply-To: AKitzmiller@genetics.com
To: biojava-l@biojava.org
Subject: [Biojava-l] Re:Biojava-l digest, Vol 1 #131 - 5 msgs

I'll be away from Cambridge until the 23rd of August.  You can reach me by voice mail at 617-665-6831

ajk

>>> "biojava-l@biojava.org" 08/18/00 12:00 >>>


Today's Topics:

  1. Re: build error w/ JBuilder (Ann Loraine)
  2. xml.jar and xerces.jar (Ann Loraine)
  3. Re: build error w/ JBuilder (Thomas Down)
  4. FASTA reader problem: "Mark invalid" (Christian Gruber)
  5. Re: FASTA reader problem: "Mark invalid" (Thomas Down)

-- __--__-- 

Message: 1
Date: Thu, 17 Aug 2000 13:19:54 -0700 (PDT)
From: Ann Loraine <loraine@loraine.net>
To: Thomas Down <td2@sanger.ac.uk>
cc: biojava-l@biojava.org
Subject: Re: [Biojava-l] build error w/ JBuilder


I just updated my copy of biojava and picked up your change.

Sorry to be a pain, but now I have another problem -- again w/
building on Windows.  Has anyone else run into this, as well?

I'm not sure about this, but it seems that the xerces and Sun xml
jars are perhaps incompatible.  (the ones from biojava Web site)

When I try to build biojava I get this error:

"cannot access class com.sun.xml.tree.ElementNode; no source found; must
be compiled, because org.w3c.dom.Element.normalize referenced by
class com.sun.xml.tree.ElementNode has changed."

Thanks for your help!

-Ann

---

Ann E. Loraine
http://www.loraine.net

On Wed, 16 Aug 2000, Thomas Down wrote:

> On Sat, Aug 12, 2000 at 10:47:49AM -0700, Ann Loraine wrote:
> >
> > I'm having a bit of trouble building biojava w/ JBuilder,
> > my favorite Windows IDE.  
> 
> Hi...
> 
> I've done a bit of poking, and the code in EmblFormat
> and GenbankFormat is indeed illegal, at least given a strict
> reading of the inner classes spec.  I've shuffled things
> round in CVS (both on HEAD and release-1_0-branch).  This
> should solve the problem now...
> 
> If you can confirm that the latest CVS version is working,
> I'll put up a 1.01 release in a few days which includes
> this, and also fixes for the JDK1.3 compiler errors that were
> reported a week or two back.
> 
> Thanks,
>     Thomas.
> -- 
> One of the advantages of being disorderly is that one is
> constantly making exciting discoveries.
>                                        -- A. A. Milne
> 


-- __--__-- 

Message: 2
Date: Fri, 18 Aug 2000 00:42:18 -0700 (PDT)
From: Ann Loraine <loraine@loraine.net>
To: biojava-l@biojava.org
Subject: [Biojava-l] xml.jar and xerces.jar


The problem I reported earlier with building on Windows (w/ JBuilder)
is due to xml.jar and xerces.jar containing different versions of
the org.w3c.dom package.  (level 1 vs. level 2 ???)

For fun I got the source for com.sun.xml classes from Javasoft 
(http://java.sun.com/Download3) and tried building with
the xerces.jar file supplying org.w3c.dom classes.  The result
was a bunch of compile-time errors.

One was: "class com.sun.xml.tree.Doctype.EntityNode should be
declared abstract; it does not define method getLocalName()
in com.sun.xml.Nodebase."

I'm an xml newbie, so I looked at the source and found that
com.sun.xml.Nodebase is an abstract class which implements
org.w3c.dom.Node, which in the Javasoft distribution doesn't
promise a getLocalName() method.  The xerces version does promise
this method (http://xml.apache.org/apiDocs/org/w3c/dom/Node.html)
however and this is why I got the "should be declared abstract" build
error.

This method "getLocalName()" looks like it has something to do
with namespaces.  Is this right?

Anyhow it looks to me like Xerces distributes a more up-to-date
version of org.w3c.dom which is unfortunately not backwards compatible
with the earlier versions used by Javasoft's com.sun.xml packages.
That is, classes that implement interface(s) in the earlier
version will give an error if you try to use them with the
newer versions that promise more methods.
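
One way to see which version of org.w3c.dom a build is actually picking up is to ask the runtime. The following is only a minimal sketch, not part of BioJava: the class name DomLevelCheck and its helper methods are made up for illustration, and it assumes that the presence of getLocalName() on org.w3c.dom.Node is a good enough marker for the newer (level 2) interfaces discussed above.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of BioJava or either XML jar.
class DomLevelCheck {

    /** Returns true if the type exposes a public method with the given name. */
    static boolean hasMethod(Class<?> type, String name) {
        for (Method m : type.getMethods()) {
            if (m.getName().equals(name)) {
                return true;
            }
        }
        return false;
    }

    /** Describes where a class was loaded from (jar or directory), if known. */
    static String origin(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes supplied by the JDK runtime itself usually have no code source.
        return source == null
                ? "JDK runtime (no code source)"
                : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) {
        Class<?> node = org.w3c.dom.Node.class;
        System.out.println("org.w3c.dom.Node loaded from: " + origin(node));
        System.out.println("declares getLocalName(): " + hasMethod(node, "getLocalName"));
    }
}
```

If two jars on the classpath both contain org.w3c.dom, the reported origin shows which copy was loaded, which is usually the first thing to check with errors like the ones quoted above.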

-Ann

---

Ann E. Loraine
http://www.loraine.net



-- __--__-- 

Message: 3
Date: Fri, 18 Aug 2000 10:58:30 +0100
From: Thomas Down <td2@sanger.ac.uk>
To: Ann Loraine <loraine@loraine.net>
Cc: Thomas Down <td2@sanger.ac.uk>, biojava-l@biojava.org
Subject: Re: [Biojava-l] build error w/ JBuilder
Organization: This tangled web on which I'm laid intwined

On Thu, Aug 17, 2000 at 01:19:54PM -0700, Ann Loraine wrote:
> 
> I just updated my copy of biojava and picked up your change.
> 
> Sorry to be a pain, but now I have another problem -- again w/
> building on Windows.  Has anyone else run into this, as well?
> 
> I'm not sure about this, but it seems that the xerces and Sun xml
> jars are perhaps incompatible.  (the ones from biojava Web site)
> 
> When I try to build biojava I get this error:
> 
> "cannot access class com.sun.xml.tree.ElementNode; no source found; must
> be compiled, because org.w3c.dom.Element.normalize referenced by
> class com.sun.xml.tree.ElementNode has changed."

Hmmm, are you trying to compile BioJava with both xml.jar (crimson)
and xerces.jar on your classpath?  I wouldn't expect these to be
compatible, since xml.jar includes the w3c DOM level 1 API, while
Xerces-J provides level 2.  Since the big switch a few days back,
the BioJava HEAD should now be compiled with xerces.jar -- there
should be no need for xml.jar as well, and I'll remove it from the
CVS repository soon.

If you use the latest version of the BioJava build tool, it will
compile with xerces.jar rather than xml.jar.

Is there some particular reason why you need Sun's xml.jar instead
of Xerces?  If this has created a clash, I'll have to look into ways
of sorting things out.  But generally, Xerces offers pretty much
everything that Crimson did, plus proper handling of XML namespaces
(this is the main change in the SAX2 and DOM level 2 APIs).
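
To make the namespace point concrete, here is a small sketch of what getLocalName() reports under a namespace-aware DOM level 2 parser. It is an illustration only: it assumes a JDK with the JAXP DocumentBuilderFactory API (rather than calling Xerces classes directly), and the class name LocalNameDemo, the element name and the namespace URI are all invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; the XML snippet and namespace URI are made up.
class LocalNameDemo {

    /** Parses the given XML with namespace processing on and returns the root element. */
    static Element parseRoot(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Without this call JAXP parsers default to namespace-unaware mode.
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element root = parseRoot("<bj:seq xmlns:bj=\"urn:example:biojava\"/>");
        System.out.println("qualified name: " + root.getNodeName());      // prefix and local part
        System.out.println("local name:     " + root.getLocalName());     // local part only
        System.out.println("namespace URI:  " + root.getNamespaceURI());  // URI bound to the prefix
    }
}
```

The DOM level 1 interfaces have no getLocalName() or getNamespaceURI() at all, which is why classes written against level 1 cannot simply be compiled against the level 2 version of the package.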

Let me know if there's any more trouble,
  Thomas

PS. For people using the release-1_0-branch, this still uses the
   old xml.jar.
-- 
One of the advantages of being disorderly is that one is
constantly making exciting discoveries.
                                       -- A. A. Milne

-- __--__-- 

Message: 4
From: Christian Gruber <Christian.Gruber@biomax.de>
Date: Fri, 18 Aug 2000 13:03:44 +0200 (CEST)
To: biojava-l@biojava.org
Subject: [Biojava-l] FASTA reader problem: "Mark invalid"

Hi!

I wrote a Java test program that just reads in a sequence in FASTA
format and prints the sequences out. I did this by making the
appropriate changes to the file demos/seq/TestEmbl.java. As a test
sequence file in FASTA format, I created one with random sequences.

Now the problem: There are some sequence files that are definitely
correct FASTA format files, but create the following error message:


---------------------------------------------


java.io.IOException: Mark invalid
        at java.lang.Throwable.<init>(Throwable.java:96)
        at java.lang.Exception.<init>(Exception.java:44)
        at java.io.IOException.<init>(IOException.java:49)
        at java.io.BufferedReader.reset(BufferedReader.java:473)
        at org.biojava.bio.seq.io.FastaFormat.readSequence(FastaFormat.java(Compiled Code))
        at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:88)
rethrown as org.biojava.bio.BioException: Could not read sequence
        at java.lang.Throwable.<init>(Throwable.java:96)
        at java.lang.Exception.<init>(Exception.java:44)
        at org.biojava.bio.BioException.<init>(BioException.java:58)
        at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:90)
        at TestFasta.main(TestFasta.java:29)



----------------------------------------------



The error is reproducible and always occurs at the same location in
a given file, but not every FASTA file triggers it.


If anyone is interested in a FASTA file that creates this error, I
can send it (including my Java program source). It's ~15kb
altogether. Just send an email (Christian.Gruber@biomax.de).


Does anybody know what is happening here?


Christian Gruber

-- __--__-- 

Message: 5
Date: Fri, 18 Aug 2000 12:46:44 +0100
From: Thomas Down <td2@sanger.ac.uk>
To: Christian Gruber <Christian.Gruber@biomax.de>
Cc: biojava-l@biojava.org
Subject: Re: [Biojava-l] FASTA reader problem: "Mark invalid"
Organization: This tangled web on which I'm laid intwined

On Fri, Aug 18, 2000 at 01:03:44PM +0200, Christian Gruber wrote:
> Hi!
> 
> I wrote a Java test program that just reads in a sequence in FASTA
> format and prints the sequences out. I did this by making the
> appropriate changes to the file demos/seq/TestEmbl.java. As a test
> sequence file in FASTA format, I created one with random sequences.
> 
> Now the problem: There are some sequence files that are definitely
> correct FASTA format files, but create the following error message:

I've looked at this, and it's definitely an issue with long
description lines.  The trouble is, there is no `end of entry'
marker in a FASTA file, so the reader has to grab a line, then
`push it back' onto the stream if it turns out to be the 
description line for the next sequence in the file.

In the current Java I/O framework, the standard way to do
this is using the mark() and reset() methods.  Unfortunately,
mark() takes a numerical argument, and reset() MAY fail if
more than that number of characters have been read since the mark().
So effectively we have a fixed-length buffer issue.
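
The failure mode can be reproduced outside BioJava. The sketch below is not the FastaFormat code; MarkResetDemo and canPushBack are invented names, and the deliberately tiny buffer size (16) is an assumption chosen so that the reader has to refill its buffer while scanning the line. In current OpenJDK implementations the mark is only invalidated at such a refill, which would also fit the observation that the error shows up at a fixed position in some files and not at all in others.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical demo, not BioJava code: mark, read one line, then try to push it back.
class MarkResetDemo {

    /** Returns true if reset() still works after reading one line past the mark. */
    static boolean canPushBack(String text, int bufferSize, int readAheadLimit)
            throws IOException {
        BufferedReader in = new BufferedReader(new StringReader(text), bufferSize);
        in.mark(readAheadLimit);   // remember the position before the line
        in.readLine();             // consume a (possibly long) description line
        try {
            in.reset();            // attempt to "push the line back"
            return true;
        } catch (IOException e) {  // e.g. "Mark invalid"
            return false;
        }
    }

    /** Builds a string of n copies of c (kept simple for older JDKs). */
    static String repeat(char c, int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String longHeader = ">" + repeat('x', 200) + "\n";
        System.out.println("limit 120:  " + canPushBack(longHeader, 16, 120));
        System.out.println("limit 1024: " + canPushBack(longHeader, 16, 1024));
    }
}
```

With a 201-character description line, the limit of 120 should report a failed reset and the limit of 1024 a successful one; a line longer than 1024 characters would bring the problem back.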

I agree (!) that BioJava should be able to handle long description
lines (which are, as you pointed out, quite common in some areas).
As a TEMPORARY workaround, I've upped the readahead limit passed
to mark() from 120 to 1024 (this is done on both HEAD and
release-1_0-branch).  Given the state of the Java I/O infrastructure,
I think this is probably the best that can be done while maintaining
the current interface for BioJava SequenceFormat classes.

On the other hand, it should be possible to redesign the
BioJava sequence I/O to avoid this issue.  At the same time,
I can see potential for performance optimisations (at the moment,
most SequenceFormats tend to read one line at a time, convert
that to a SymbolList, then join them all together later -- can
this be improved?).  It would be kind-of nice if we could get
any updates to this framework out of the way well before we
branch for 1.1.  Does anyone else have any thoughts on this?
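
As one possible direction (purely a sketch, not a proposal for the actual SequenceFormat interface), the reader itself can hold on to the one header line it has read ahead, so nothing ever needs to be pushed back onto the stream. Everything named here (LookaheadFastaReader, Entry, pendingHeader) is hypothetical, and plain Strings stand in for SymbolList to keep the example self-contained.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical sketch: one-line lookahead kept in a field instead of mark()/reset().
class LookaheadFastaReader {

    /** A parsed FASTA entry: description line (without '>') and concatenated residues. */
    public static final class Entry {
        public final String description;
        public final String residues;

        Entry(String description, String residues) {
            this.description = description;
            this.residues = residues;
        }
    }

    private final BufferedReader in;
    private String pendingHeader;  // header of the NEXT entry, already consumed from the stream

    LookaheadFastaReader(Reader reader) {
        this.in = new BufferedReader(reader);
    }

    /** Returns the next entry, or null when the input is exhausted. */
    Entry next() throws IOException {
        String header = pendingHeader;
        pendingHeader = null;
        while (header == null) {            // skip anything before the first '>' line
            String line = in.readLine();
            if (line == null) {
                return null;
            }
            if (line.startsWith(">")) {
                header = line;
            }
        }
        StringBuilder residues = new StringBuilder();  // single buffer, no per-line joins
        String line;
        while ((line = in.readLine()) != null) {
            if (line.startsWith(">")) {
                pendingHeader = line;       // keep it for the following call
                break;
            }
            residues.append(line.trim());
        }
        return new Entry(header.substring(1).trim(), residues.toString());
    }

    public static void main(String[] args) throws IOException {
        String fasta = ">seq1 first\nACGT\nTTGA\n>seq2 second\nGGCC\n";
        LookaheadFastaReader reader = new LookaheadFastaReader(new StringReader(fasta));
        for (Entry e = reader.next(); e != null; e = reader.next()) {
            System.out.println(e.description + " -> " + e.residues);
        }
    }
}
```

The trade-off is that the lookahead state lives in the reader object, so the stream can no longer be handed between independent format objects mid-file, which is presumably why this would mean a change to the current interface.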

Thanks,
   Thomas.
-- 
One of the advantages of being disorderly is that one is
constantly making exciting discoveries.
                                       -- A. A. Milne


-- __--__-- 

_______________________________________________
Biojava-l mailing list  -  Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l


End of Biojava-l Digest


--__--__--

Message: 2
Date: Sat, 19 Aug 2000 08:55:35 -0700 (PDT)
From: Ann Loraine <loraine@loraine.net>
To: Thomas Down <td2@sanger.ac.uk>
cc: biojava-l@biojava.org
Subject: Re: [Biojava-l] build error w/ JBuilder

> 
> Hmmm, are you trying to compile BioJava with both xml.jar (crimson)
> and xerces.jar on your classpath?  I wouldn't expect these to be
> compatible, since xml.jar includes the w3c DOM level 1 API, while
> Xerces-J provides level 2.  Since the big switch a few days back,
> the BioJava HEAD should now be compiled with xerces.jar -- there
> should be no need for xml.jar as well, and I'll remove it from the
> CVS repository soon.
> 
> If you use the latest version of the BioJava build tool, it will
> compile with xerces.jar rather than xml.jar.
> 
> Is there some particular reason why you need Sun's xml.jar instead
> of Xerces?  If this has created a clash, I'll have to look into ways
> of sorting things out.  But generally, Xerces offers pretty much
> everything that Crimson did, plus proper handling of XML namespaces
> (this is the main change in the SAX2 and DOM level 2 APIs).

The dp package in biojava-live/demos wants classes from Javasoft's
xml.jar.  So this is why I included it in my classpath.  

If I don't attempt to build the dp demo then all is well. 

-Ann


> 
> Let me know if there's any more trouble,
>   Thomas
> 
> PS. For people using the release-1_0-branch, this still uses the
>    old xml.jar.
> -- 
> One of the advantages of being disorderly is that one is
> constantly making exciting discoveries.
>                                        -- A. A. Milne
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
> 



--__--__--

End of Biojava-l Digest