[Biojava-l] SCF file wont load from URL

Andy Yates ady at sanger.ac.uk
Fri Sep 1 17:46:35 UTC 2006


The only thing I can suggest is to make sure that you can parse the file 
from the URL when it's on your local machine.
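
One way to run that check is to copy the remote bytes into a temporary file 
and then hand that file to the parser. This is only a rough sketch using the 
JDK; the class name FetchToTemp, the method name and the timeout values are 
made up for illustration and are not part of BioJava.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of BioJava.
final class FetchToTemp {
    /** Copies everything behind the URL into a temp file and returns its path. */
    static Path fetch(URL url) throws IOException {
        URLConnection conn = url.openConnection();
        // Arbitrary timeouts (milliseconds) so a stalled server fails
        // with an exception instead of blocking forever.
        conn.setConnectTimeout(10_000);
        conn.setReadTimeout(30_000);
        Path tmp = Files.createTempFile("trace", ".scf");
        try (InputStream in = conn.getInputStream()) {
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        }
        return tmp;
    }
}
```

If the file saved this way parses fine from disk, the trace itself is 
probably OK and the problem lies in how the stream is being read.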

I've knocked up some code which I think should work. It is very similar 
to yours in principle; the only major difference is that I've wrapped the 
InputStream in a BufferedInputStream, which if anything will improve 
performance.

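//-------
import java.io.BufferedInputStream;
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import org.biojava.bio.program.scf.SCF;

SCF scfFile = new SCF();
BufferedInputStream bis = null;
try {
    bis = new BufferedInputStream(new URL("http://yoururl").openStream());
    scfFile.load(bis);
}
catch(MalformedURLException e) {
    System.err.println("Malformed URL");
    e.printStackTrace();
}
catch(IOException e) {
    System.err.println("Reading problem");
    e.printStackTrace();
}
finally {
    // bis is still null if opening the URL failed, so guard the close
    if (bis != null) {
        try {
            bis.close();
        }
        catch(IOException e) {
            System.err.println("Arrgh");
            e.printStackTrace();
        }
    }
}
//-------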

I know that this URL definitely works if nothing else does.

http://trace.ensembl.org/tmp/ml1B-a1798c05.q1c.scf.gz

To parse this file you'll have to wrap it all up in a GZIPInputStream, 
such as:

BufferedInputStream bis = new BufferedInputStream(
    new GZIPInputStream(url.openStream()));
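
If you don't know in advance whether a trace is gzipped, you could peek at 
the first two bytes and only wrap the stream when they match the gzip magic 
number (0x1f 0x8b). A minimal sketch; MaybeGzip is a made-up name for 
illustration, not something from BioJava.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of BioJava.
final class MaybeGzip {
    /** Returns a decompressing stream if the input is gzipped, else the buffered input. */
    static InputStream open(InputStream raw) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);
        in.mark(2);              // remember the start so we can rewind
        int b0 = in.read();
        int b1 = in.read();
        in.reset();              // rewind to the first byte
        // gzip streams start with the bytes 0x1f 0x8b
        if (b0 == 0x1f && b1 == 0x8b) {
            return new BufferedInputStream(new GZIPInputStream(in));
        }
        return in;
    }
}
```

The stream returned by MaybeGzip.open(url.openStream()) can then be passed 
to the parser in place of the raw stream.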

Hope that helps. Otherwise it seems like there might be quite an insidious 
problem in the code somewhere. In defence of the libraries, my current 
group parses something like 30 traces per second at maximum input from a 
variety of sources, both file based and URL based, and we haven't 
encountered any problems. When there is a problem with the parsing, 
it is usually because the trace file is badly formed.

If you need to check this, try http://staden.sourceforge.net/ and the 
Staden io_lib, which comes with the scf_dump program, and trev, which is a 
trace viewer.
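
For a quick sanity check from Java before reaching for those tools, 
something like the sketch below reads the first few header fields. It 
assumes the header layout as I understand it from the Staden SCF format 
description: a 4-byte big-endian magic number spelling ".scf", then eight 
4-byte fields, then a 4-character version string. Please verify that 
against the io_lib documentation. ScfHeaderCheck is a made-up name, not a 
BioJava class.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of BioJava or io_lib.
final class ScfHeaderCheck {
    static final int SCF_MAGIC = 0x2E736366; // the ASCII bytes ".scf"

    /** Returns the 4-character version string, or throws if the magic number is wrong. */
    static String readVersion(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        int magic = in.readInt(); // DataInputStream reads big-endian
        if (magic != SCF_MAGIC) {
            throw new IOException(
                String.format("Not an SCF file: magic was 0x%08X", magic));
        }
        // Assumed layout: eight 4-byte fields sit between magic and version.
        // readFully keeps reading until the buffer is full, or throws
        // EOFException if the stream ends early.
        byte[] skipped = new byte[32];
        in.readFully(skipped);
        byte[] version = new byte[4];
        in.readFully(version);
        return new String(version, StandardCharsets.US_ASCII);
    }
}
```

An exception here suggests the bytes arriving from the URL are not a plain 
SCF file at all (for example, still gzipped or an HTML error page).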

Tell us how you get on

Andy

On Fri, 1 Sep 2006, K.R. Carter wrote:

> yes. here is the code snippet.
>
> SCF scfFile = new SCF();
>
> scfFile.load(new URL("
> http://www.conifergdb.org/software/wtm0.6/process/guest@muohio.edu_060713_052442/chromat_dir/COLD1_16_H12.b1_A029").openStream(),
> 0);
>
> I use this SCF class to open read files from the local machine and it works
> perfectly fine. The hangup occurs when I try and open one from the URL.
>
> I also put some System.out.println statements in the SCF class for debugging
> purposes.
>
> using v3 parser
> begin parsing...
> input stream not null
> parsing samples...
> reading samples into...
> reading samples into...
> reading samples into...
> reading samples into...
> parsing bases...
>
> It seems that the problem occurs once it reaches parsing bases. I placed
> these println statements at the beginning of the methods. I put this:
> System.out.println("file is parsed... ");
> at the end of the parse() method:
>
> public void parse() throws IOException,
>               UnsupportedChromatogramFormatException {
>           parsed = false;
>           // sort the sections of the file by ascending offset
>           Integer SAMPLES  = new Integer(0),
>                   BASES    = new Integer(1),
>                   COMMENTS = new Integer(2),
>                   PRIVATE  = new Integer(3);
>           TreeMap sectionOrder = new TreeMap();
>           sectionOrder.put(new Long(header.samples_offset),  SAMPLES);
>           sectionOrder.put(new Long(header.bases_offset),    BASES);
>           sectionOrder.put(new Long(header.comments_offset), COMMENTS);
>           sectionOrder.put(new Long(header.private_offset),  PRIVATE);
>
>           for (Iterator it = sectionOrder.keySet().iterator() ;
>           it.hasNext() ;) {
>               Integer sect = (Integer) sectionOrder.get(it.next());
>               if      (sect == SAMPLES)  parseSamples();
>               else if (sect == BASES)    parseBases();
>               else if (sect == COMMENTS) parseComments();
>               else if (sect == PRIVATE)  parsePrivate();
>           }
>           parsed = true;
>           System.out.println("file is parsed... "); // <-- I added this
>       }
>
> That statement ("file is parsed") never gets printed. I believe the hangup
> is at parseBases(). I'm unsure. I'm still trying to find out exactly where
> it is.
>
> Kiki
>
> On 9/1/06, Andy Yates <ady at sanger.ac.uk> wrote:
>> 
>> Is it possible to send the snippet of code that you're running at all?
>> 
>> Andy
>> 
>> On Fri, 1 Sep 2006, K.R. Carter wrote:
>> 
>> > Thanks Andy,
>> >
>> > I programmatically set the proxy however, it still does not solve the
>> > problem.
>> >
>> > On 8/31/06, Andy Yates <ady at sanger.ac.uk> wrote:
>> >>
>> >> That sounds like http proxy problems in my book.
>> >>
>> >> Try looking at this page: http://mindprod.com/jgloss/proxy.html
>> >>
>> >> The main thing to take home is try setting the system properties:
>> >>
>> >> proxySet=true
>> >> http.proxyHost=proxyHostName
>> >> http.proxyPort=proxyHostPort
>> >>
>> >> You can do this programmatically using the System.setProperty() method or
>> >> with -DpropertyName=propertyValue from the command line.
>> >>
>> >> Hope that helps,
>> >>
>> >> Andy Yates
>> >>
>> >> mark.schreiber at novartis.com wrote:
>> >> > Hi -
>> >> >
>> >> > This sounds very strange. Is there any stack trace? Could you possibly
>> >> > post the code that recreates the problem?
>> >> >
>> >> > - Mark
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> > "K.R. Carter" <k_stellar at msn.com>
>> >> > Sent by: biojava-l-bounces at lists.open-bio.org
>> >> > 08/31/2006 04:34 AM
>> >> > Please respond to kikia.reneese
>> >> >
>> >> >
>> >> >         To:     biojava-l at biojava.org
>> >> >         cc:     (bcc: Mark Schreiber/GP/Novartis)
>> >> >         Subject:        [Biojava-l] SCF file wont load from URL
>> >> >
>> >> >
>> >> > Hello,
>> >> >
>> >> > I am trying to load an SCF file by using the input stream from a URL and
>> >> > it will not load. Does anyone know what might be happening? My program
>> >> > doesn't give an error, it just completely freezes. I am using the latest
>> >> > (I think) version of the SCF class.
>> >> >
>> >> >
>> >> > /**
>> >> >  * A {@link org.biojava.bio.chromatogram.Chromatogram} as loaded from an
>> >> >  * SCF v2 or v3 file.  Also loads and exposes the SCF format's "private
>> >> >  * data" and "comments" sections.  The quality values from the SCF are
>> >> >  * stored as additional sequences on the base call alignment. The labels
>> >> >  * are the <code>PROB_</code>* constants in this class.
>> >> >  * The values are
>> >> >  * {@link org.biojava.bio.symbol.IntegerAlphabet.IntegerSymbol}
>> >> >  * objects in the range 0 to 255.
>> >> >  *
>> >> >  *
>> >> >  * @author Rhett Sutphin (<a href="http://genome.uiowa.edu/">UI CBCB</a>)
>> >> >  */
>> >> >
>> >> > any help would be greatly appreciated.
>> >> >
>> >> > Thanks!
>> >> > _______________________________________________
>> >> > Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
>> >> > http://lists.open-bio.org/mailman/listinfo/biojava-l
>> >> >
>> >> >
>> >> >
>> >> >
>> >>
>> >
>> 
>


