AW: AW: [Biojava-l] BLAST Parser for extracting all BLAST data?

Smithies, Russell Russell.Smithies at agresearch.co.nz
Tue Jun 28 20:11:23 EDT 2005


The easiest method (if you don't care about validating) is to delete the DTD line (the <!DOCTYPE ...> declaration) in the XML.
If you do need to validate, make sure your proxy settings are set correctly so the parser can access the DTD.
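
If editing the file by hand is not practical, another option is to have the parser ignore the DTD reference. Below is a minimal sketch, assuming the JDK's built-in DOM parser rather than JDOM; the class name SkipDtdExample and the inline sample XML are made up for illustration. JDOM's SAXBuilder also offers setEntityResolver, so the same resolver should be usable there, though that is untested here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

public class SkipDtdExample {

    // Answers every external entity request (including the DTD) with an
    // empty stream, so nothing is read from disk or from the network.
    static final EntityResolver IGNORE_DTD =
            (publicId, systemId) -> new InputSource(new StringReader(""));

    // Parses the XML text without validating and without loading the DTD.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false);
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver(IGNORE_DTD);
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // Hand-written sample (not real BLAST output) that keeps the DOCTYPE line.
        String xml = "<?xml version=\"1.0\"?>\n"
                + "<!DOCTYPE BlastOutput PUBLIC \"-//NCBI//NCBI BlastOutput/EN\" "
                + "\"NCBI_BlastOutput.dtd\">\n"
                + "<BlastOutput><BlastOutput_program>blastp</BlastOutput_program></BlastOutput>";
        Document doc = parse(xml);
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

The trade-off is the same as deleting the line: the document is parsed but not validated, and any entities that only the DTD defines will not be available.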

Russell 

-----Original Message-----
From: biojava-l-bounces at portal.open-bio.org [mailto:biojava-l-bounces at portal.open-bio.org] On Behalf Of Sébastien PETIT
Sent: Wednesday, 29 June 2005 2:50 a.m.
To: biojava-l at biojava.org
Subject: RE: AW: AW: [Biojava-l] BLAST Parser for extracting all BLAST data?

I tried the code you sent me; I only changed the path of the XML file.
But the file contains this line:

<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN"
"NCBI_BlastOutput.dtd">

and I get exceptions and errors because of this line.

If you want, I can send you the XML file so that you can test it...

However, when I downloaded the DTD and the necessary MOD files and modified
the DTD file a little, it worked...
But I would prefer not to ship those files with my code...

Thank you...

Sebastien

--- "BIBIS, Garnier, Christophe" <cgarnier at ttz-Bremerhaven.de> wrote:

> Did you try just the code I sent you, or did you integrate it into
> your program?
> 
> As far as I know, JDOM works without DTD files: it does not check
> the structure of the file.
> It should work, because I tested it without the corresponding DTD
> file.
> 
> 
> christophe
> 
> 
> -----Original Message-----
> From: Sébastien PETIT [mailto:great_fred at yahoo.com]
> Sent: Tuesday, 28 June 2005 15:00
> To: biojava-l at biojava.org
> Subject: RE: AW: [Biojava-l] BLAST Parser for extracting all BLAST data?
> 
> 
> Thank you for JDOM and the code...
> But it generates a ton of exceptions and errors because it can't find
> a DTD file (NCBI_BlastOutput.dtd), which I don't have...
> 
> So I don't know what to do...
> 
> Sebastien
> 
> --- "BIBIS, Garnier, Christophe" <cgarnier at ttz-Bremerhaven.de> wrote:
> 
> > 
> > If you don't find what you need in BioJava, you can always write a
> > small XML parser with, for example, JDOM.
> > 
> > 1 - download jdom.jar
> > 2 - use the following code to find <Hsp_midline>:
> > 3 - replace the path of the xml file in the main method
> > 4 - it prints out every found Element
> > 
> > 
> > I hope it helps you
> > 
> > Best,
> > Christophe
> > 
> > +++++++++++++++++++++++++++++++++++++
> > 
> > import java.io.File;
> > import java.io.IOException;
> > import java.util.Iterator;
> > import java.util.List;
> > 
> > import org.jdom.Document;
> > import org.jdom.Element;
> > import org.jdom.JDOMException;
> > import org.jdom.input.SAXBuilder;
> > 
> > public class JDomParser
> > {
> > 
> > 	private static void parseResults(Element iterations)
> > 	{
> > 		System.out.println("*** parseResults ***") ;
> > 		
> > 		Element it = iterations.getChild("Iteration") ;
> > 		
> > 		List elts = it.getChildren();
> > 		
> > 		Iterator iterator = elts.iterator();
> > 		
> > 		while (iterator.hasNext())
> > 		{
> > 			Element child = (Element) iterator.next();
> > 
> > 			System.out.println(child + " - " + child.getText() + " - "
> > 					+ child.getName());
> > 
> > 			if ( child.getName().equals("Iteration_hits"))
> > 			{
> > 				parseHits(child) ;
> > 			}
> > 			
> > 			if ( child.getName().equals("Iteration_stat"))
> > 			{
> > 				parseStatistics(child) ;
> > 			}
> > 			
> > 		
> > 		}
> > 	}
> > 
> > 	private static void parseHits(Element element)
> > 	{
> > 		List elts = element.getChildren();
> > 		
> > 		Iterator iterator = elts.iterator();
> > 		
> > 		while (iterator.hasNext())
> > 		{
> > 			Element child = (Element) iterator.next();
> > 
> > 			printElt(child) ;
> > 			
> > 			parseHit(child) ;
> > 			
> > 		}
> > 	}
> > 	
> > 	private static void parseHspHit(Element element)
> > 	{
> > 		Element hsp = element.getChild("Hsp") ;
> > 
> > 		List hsps = hsp.getChildren();
> > 		
> > 		Iterator iterator = hsps.iterator();
> > 		
> > 		while (iterator.hasNext())
> > 		{
> > 			Element child = (Element) iterator.next();
> > 
> > 			printElt(child) ;
> > 		}
> > 	}
> > 	
> > 	private static void printElt(Element elt)
> > 	{
> > 		System.out.println("Element: [" + elt.getName() + "] - text:" + elt.getText() ) ;
> > 	}
> > 	
> > 	private static void parseHit(Element element)
> > 	{
> > 		List elts = element.getChildren();
> > 		
> > 		Iterator iterator = elts.iterator();
> > 		
> > 		while (iterator.hasNext())
> > 		{
> > 			Element child = (Element) iterator.next();
> > 
> > 			printElt(child) ;
> > 			
> > 			if (child.getName().equals("Hit_hsps"))
> > 			{
> > 				parseHspHit(child) ;
> > 			}
> > 			
> > 		}
> > 	}
> > 	
> > 	
> > 	private static void parseStatistics(Element element)
> > 	{
> > 		Element stat = element.getChild("Statistics") ;
> > 		
> > 		List elts = stat.getChildren();
> > 		
> > 		Iterator iterator = elts.iterator();
> > 		
> > 		while (iterator.hasNext())
> > 		{
> > 			Element child = (Element) iterator.next();
> > 
> > 			printElt(child) ;
> > 			
> > 		}
> > 		
> > 	}
> > 	
> > 	
> > 	public static void parseFile(File file) throws JDOMException, IOException
> > 	{
> > 		SAXBuilder parser = new SAXBuilder();
> > 		Document doc = parser.build(file);
> > 
> > 		Element root = doc.getRootElement();
> > 
> > 		List elts = root.getChildren();
> > 		Iterator iterator = elts.iterator();
> > 
> > 		int index = 0;
> > 		while (iterator.hasNext())
> > 		{
> > 
> > 			Element child = (Element) iterator.next();
> > 
> > 			printElt(child) ;
> > 
> > 			if (child.getName().equals("BlastOutput_iterations"))
> > 				parseResults(child);
> > 
> 
=== message truncated ===
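
The quoted code is cut off above, and as far as it goes it only descends into the first <Iteration> and the first <Hsp> it meets, because getChild returns a single child. As a compact alternative for step 2 (finding <Hsp_midline>), here is a sketch that assumes the JDK's built-in DOM parser instead of JDOM; the class name MidlineFinder and the sample XML are hypothetical and chosen only for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class MidlineFinder {

    // Collects the text of every <Hsp_midline> element, at any depth,
    // across all iterations, hits and HSPs in the document.
    static List<String> findMidlines(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        NodeList nodes = doc.getElementsByTagName("Hsp_midline");
        List<String> midlines = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            midlines.add(nodes.item(i).getTextContent());
        }
        return midlines;
    }

    public static void main(String[] args) throws Exception {
        // Tiny hand-written fragment (not real BLAST output), without a DOCTYPE line.
        String xml = "<BlastOutput><BlastOutput_iterations><Iteration><Iteration_hits>"
                + "<Hit><Hit_hsps>"
                + "<Hsp><Hsp_midline>MK+L</Hsp_midline></Hsp>"
                + "<Hsp><Hsp_midline>A VG</Hsp_midline></Hsp>"
                + "</Hit_hsps></Hit>"
                + "</Iteration_hits></Iteration></BlastOutput_iterations></BlastOutput>";
        for (String midline : findMidlines(xml)) {
            System.out.println(midline);
        }
    }
}
```

A real BLAST XML file still carries the DOCTYPE line, so the DTD problem discussed in this thread has to be handled before this will run on one.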



	

	
		