[Biojava-l] Parsing location from gbk files

Matthew Pocock matthew_pocock at yahoo.co.uk
Thu Sep 11 05:58:16 EDT 2003


Schreiber, Mark wrote:

>Hi -
> 
>This is a recurring problem as more files are being deposited that contain only references to other contigs but contain annotation and features.
> 
>A model where a DummySequence of appropriate length is instantiated if the contig files are not available may be useful. Possibly a GBS parser would be useful? Has anyone hacked out something like this?
> 
>- Mark
>  
>
It should be possible to create a dummy sequence of the right length and 
then populate the feature table with component features for the assembly 
info. Perhaps we could write a ComponentFeature that holds an LSID and 
resolves itself on demand via the LSID framework?
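
As a rough illustration of that idea, here is a minimal sketch in plain
Java. Nothing in it is BioJava API: DummySequence, ComponentFeature and
LsidResolver are hypothetical names, the resolver only stands in for
whatever the LSID framework would provide, and the LSID string is made
up. The bounds check imitates the one that rejects features on a
zero-length sequence.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the LSID framework: maps an LSID to residues.
interface LsidResolver {
    String resolve(String lsid);
}

// Hypothetical component feature: records where a contig sits in the
// assembly and fetches that contig's residues only when first asked.
final class ComponentFeature {
    private final String lsid;
    private final int start; // 1-based, inclusive
    private final int end;   // 1-based, inclusive
    private final LsidResolver resolver;
    private String residues; // null until resolved

    ComponentFeature(String lsid, int start, int end, LsidResolver resolver) {
        if (start < 1 || end < start) {
            throw new IllegalArgumentException("Bad location " + start + ".." + end);
        }
        this.lsid = lsid;
        this.start = start;
        this.end = end;
        this.resolver = resolver;
    }

    String getLsid() { return lsid; }
    int getStart() { return start; }
    int getEnd() { return end; }
    boolean isResolved() { return residues != null; }

    // Resolve lazily and cache the result.
    String getResidues() {
        if (residues == null) {
            residues = resolver.resolve(lsid);
        }
        return residues;
    }
}

// Placeholder sequence: knows its length but holds no residues of its own.
final class DummySequence {
    private final String name;
    private final int length;
    private final List<ComponentFeature> components = new ArrayList<ComponentFeature>();

    DummySequence(String name, int length) {
        if (length < 0) {
            throw new IllegalArgumentException("Negative length: " + length);
        }
        this.name = name;
        this.length = length;
    }

    String getName() { return name; }
    int length() { return length; }

    // Reject components that fall outside 1..length.
    void addComponent(ComponentFeature f) {
        if (f.getEnd() > length) {
            throw new IllegalArgumentException(
                "Location " + f.getStart() + ".." + f.getEnd() + " is outside 1.." + length);
        }
        components.add(f);
    }

    List<ComponentFeature> getComponents() {
        return Collections.unmodifiableList(components);
    }

    // Residue at a 1-based position: taken from a covering component if it
    // resolves, otherwise 'n' as a gap placeholder.
    char symbolAt(int pos) {
        if (pos < 1 || pos > length) {
            throw new IndexOutOfBoundsException("Position " + pos + " is outside 1.." + length);
        }
        for (ComponentFeature f : components) {
            if (pos >= f.getStart() && pos <= f.getEnd()) {
                String r = f.getResidues();
                int offset = pos - f.getStart();
                if (r != null && offset < r.length()) {
                    return r.charAt(offset);
                }
            }
        }
        return 'n';
    }
}

class DummySequenceDemo {
    public static void main(String[] args) {
        LsidResolver resolver = lsid -> "acgtacgtac"; // pretend lookup
        DummySequence seq = new DummySequence("assembly", 100);
        seq.addComponent(new ComponentFeature("urn:lsid:example.org:contig:c1", 11, 20, resolver));
        System.out.println(seq.symbolAt(5));  // not covered by any component
        System.out.println(seq.symbolAt(12)); // triggers resolution of the component
    }
}
```

Positions not covered by a resolvable component read as 'n'. A real
version would presumably implement BioJava's own Sequence and Feature
interfaces rather than these stand-alone classes.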

I don't have the time right now, but somebody should get keen.

Matthew

>	-----Original Message----- 
>	From: charles.girardot at libertysurf.fr [mailto:charles.girardot at libertysurf.fr] 
>	Sent: Thu 11/09/2003 6:13 a.m. 
>	To: biojava-l 
>	Cc: 
>	Subject: [Biojava-l] Parsing location from gbk files
>	
>	
>
>	Hi,
>	
>	I am using biojava to extract mRNA, CDS and gene boundaries from genbank annotation files (e.g., "hs_chr19.gbs" from ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/CHR_19/).
>	I've just switched from biojava 1.22 to the latest 1.3 release (biojava-1.30-jdk14.jar and bytecode-0.92.jar)
>	and I get lots of exceptions (see below; the piece of code generating them follows as well). I don't know if it's important, but ".gbs" files do not hold the contig sequences, only annotations.
>	
>	Note: I am using JDK 1.4.1.
>	
>	Thanks for your help.
>	
>	Charles Girardot
>	
>	
>	=======================================
>	java.lang.IllegalArgumentException: Location 59817 is outside 1..0
>	        at org.biojava.bio.seq.impl.SimpleFeature.<init>(SimpleFeature.java:306)
>	        at org.biojava.bio.seq.impl.SimpleStrandedFeature.<init>(SimpleStrandedFeature.java:74)
>	        at sun.reflect.GeneratedConstructorAccessor1.newInstance(Unknown Source)
>	        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>	        at java.lang.reflect.Constructor.newInstance(Constructor.java:274)
>	        at org.biojava.bio.seq.SimpleFeatureRealizer$TemplateImpl.realize(SimpleFeatureRealizer.java:138)
>	rethrown as org.biojava.bio.BioException: Couldn't realize feature
>	        at org.biojava.bio.seq.SimpleFeatureRealizer$TemplateImpl.realize(SimpleFeatureRealizer.java:144)
>	        at org.biojava.bio.seq.SimpleFeatureRealizer.realizeFeature(SimpleFeatureRealizer.java:94)
>	        at org.biojava.bio.seq.impl.SimpleSequence.realizeFeature(SimpleSequence.java:198)
>	        at org.biojava.bio.seq.impl.SimpleSequence.createFeature(SimpleSequence.java:204)
>	        at org.biojava.bio.seq.io.SequenceBuilderBase.makeSequence(SequenceBuilderBase.java:168)
>	        at org.biojava.bio.seq.io.SmartSequenceBuilder.makeSequence(SmartSequenceBuilder.java:87)
>	        at org.biojava.bio.seq.io.SequenceBuilderFilter.makeSequence(SequenceBuilderFilter.java:98)
>	        at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:101)
>	        at org.gc.test.GenbankParserTest.main(GenbankParserTest.java:34)
>	       
>	
>	
>	==========================================
>	
>	package org.gc.test;
>	
>	import java.io.BufferedReader;
>	import java.io.File;
>	import java.io.FileReader;
>	
>	import org.biojava.bio.seq.Sequence;
>	import org.biojava.bio.seq.SequenceIterator;
>	import org.biojava.bio.seq.io.SeqIOTools;
>	
>	/**
>	 * Reads a GenBank file and iterates over the sequences it contains.
>	 */
>	public class GenbankParserTest {
>	
>	public static void main(String[] args) {
>	        System.out.println("Main starts ...");
>	        try {
>	                File f = new File("D:\\Travail\\data\\Hs_chr19.gbs");
>	                BufferedReader bf = new BufferedReader(new FileReader(f));
>	                // Iterate over every GenBank record in the file.
>	                SequenceIterator it = SeqIOTools.readGenbank(bf);
>	                while (it.hasNext()) {
>	                        System.out.println("getting next seq...");
>	                        Sequence s = it.nextSequence();
>	                }
>	        } catch (Exception e) {
>	                // println(e.getStackTrace()) would only print the array reference.
>	                e.printStackTrace();
>	        }
>	        System.out.println("End of main");
>	}
>	}
>	
>	
>	
>	_______________________________________________
>	Biojava-l mailing list  -  Biojava-l at biojava.org
>	http://biojava.org/mailman/listinfo/biojava-l
>	
>
>
>  
>



