[Biojava-l] need help for SimpleSequenceBuilder class
Cox, Greg
gcox@netgenics.com
Mon, 23 Jul 2001 08:44:31 -0400
Okay, this has already been fixed for RefSeq files. The demo TestRefSeqPrt
has the code for parsing these files, and I've attached the relevant
snippet. "**" marks what should change. Let me know if you run into any
problems with RefSeq; you get to be tester #1.
Greg
    SequenceFormat gFormat = new GenbankFormat();
    BufferedReader gReader = new BufferedReader(
        new InputStreamReader(new FileInputStream(genbankFile)));
**  SequenceBuilderFactory sbFact =
**      new ProteinRefSeqProcessor.Factory(SimpleSequenceBuilder.FACTORY);
    Alphabet alpha = ProteinTools.getTAlphabet();
    SymbolParser rParser = alpha.getParser("token");
    SequenceIterator seqI =
        new StreamReader(gReader, gFormat, rParser, sbFact);
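
The snippet above only changes which factory wraps
SimpleSequenceBuilder.FACTORY. As a rough illustration of why swapping the
wrapper can make the stranded-feature error go away, here is a small
self-contained toy model. None of these types are BioJava classes:
StrandDemo, Template, Builder, SimpleBuilder and ProteinFilter are all
hypothetical names, and it is only an assumption that ProteinRefSeqProcessor
behaves in a similar way.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types exist in BioJava.
final class StrandDemo {
    enum Alphabet { DNA, PROTEIN }

    /** Hypothetical feature template; 'stranded' mimics a template carrying strand info. */
    static final class Template {
        final String type;
        final boolean stranded;
        Template(String type, boolean stranded) {
            this.type = type;
            this.stranded = stranded;
        }
    }

    /** Hypothetical builder: collects templates, then realizes them for an alphabet. */
    interface Builder {
        void addFeature(Template t);
        List<String> build(Alphabet alpha);
    }

    /** Plain builder: rejects stranded features on protein, like the reported exception. */
    static final class SimpleBuilder implements Builder {
        private final List<Template> templates = new ArrayList<>();

        @Override
        public void addFeature(Template t) {
            templates.add(t);
        }

        @Override
        public List<String> build(Alphabet alpha) {
            List<String> out = new ArrayList<>();
            for (Template t : templates) {
                if (t.stranded && alpha == Alphabet.PROTEIN) {
                    throw new IllegalStateException(
                        "Can not create a stranded feature within a sequence of type PROTEIN");
                }
                out.add(t.type + (t.stranded ? "(stranded)" : "(plain)"));
            }
            return out;
        }
    }

    /** Wrapper: drops strand info before delegating, so protein records can be built. */
    static final class ProteinFilter implements Builder {
        private final Builder delegate;

        ProteinFilter(Builder delegate) {
            this.delegate = delegate;
        }

        @Override
        public void addFeature(Template t) {
            delegate.addFeature(new Template(t.type, false));
        }

        @Override
        public List<String> build(Alphabet alpha) {
            return delegate.build(alpha);
        }
    }

    /** True if a stranded feature on a protein sequence fails without the wrapper. */
    static boolean failsUnwrapped() {
        Builder b = new SimpleBuilder();
        b.addFeature(new Template("CDS", true));
        try {
            b.build(Alphabet.PROTEIN);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    /** Builds the same feature through the wrapper; the strand info is removed first. */
    static List<String> buildWrapped() {
        Builder b = new ProteinFilter(new SimpleBuilder());
        b.addFeature(new Template("CDS", true));
        return b.build(Alphabet.PROTEIN);
    }

    public static void main(String[] args) {
        System.out.println(failsUnwrapped());
        System.out.println(buildWrapped());
    }
}
```

The point of the sketch is that the underlying builder stays unchanged; only
the wrapper placed in front of it decides what kind of feature template
reaches it.
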
> -----Original Message-----
> From: Cox, Greg [mailto:gcox@netgenics.com]
> Sent: Monday, July 23, 2001 8:06 AM
> To: 'xling@tularik.com'; td2@sanger.ac.uk
> Cc: biojava-l@biojava.org
> Subject: RE: [Biojava-l] need help for SimpleSequenceBuilder class
>
>
> Hi Bruce,
> I've done a lot with Genbank files, and the problem isn't actually in
> SimpleSequenceBuilder; that's just the symptom. The feature table renderer
> builds a stranded feature by default, and that's not acceptable for
> proteins. I'll look into your case and try to get a fix into CVS later
> today.
>
> Greg
>
> -----Original Message-----
> From: Bruce Ling [mailto:xling@tularik.com]
> Sent: Sunday, July 22, 2001 11:17 AM
> To: td2@sanger.ac.uk
> Cc: biojava-l@biojava.org
> Subject: [Biojava-l] need help for SimpleSequenceBuilder class
>
>
> Hi, Thomas,
>
> The docs list you as the author of the SimpleSequenceBuilder class, so I
> am asking for your help with the following problem.
>
> I am using the BioJava GenbankFormat class with the following code:
>
> {
>   SequenceFormat gFormat = new GenbankFormat();
>   SequenceBuilderFactory sbFact =
>       new GenbankProcessor.Factory(SimpleSequenceBuilder.FACTORY);
>   //Alphabet alpha = DNATools.getDNA();
>   //this following line does not work for protein, need more work to
>   //figure out the library
>   Alphabet alpha = ProteinTools.getAlphabet();
>   SymbolParser rParser = alpha.getParser("token");
>   seqI =
>       new StreamReader(gReader, gFormat, rParser, sbFact);
> }
>
> See the commented-out line: with a DNA Genbank file, such as the sample
> in the demos, this works fine. But if I use the above code with the
> PROTEIN alphabet to parse a protein record in Genbank format, such as:
> http://www.ncbi.nlm.nih.gov/htbin-post/Entrez/query?uid=NP_005154&form=6&db=p&Dopt=g
>
> it gives the exception shown at the end of the email.
>
> I have traced the problem to the SimpleSequenceBuilder class
> (TemplateWithChildren). By default it seems to assume a DNA Genbank
> record, which is why it tries to create a stranded feature, something a
> protein record does not have.
>
> public Sequence makeSequence() {
>   SymbolList symbols = slBuilder.makeSymbolList();
>   Sequence seq = new SimpleSequence(symbols, uri, name, annotation);
>   try {
>     for (Iterator i = rootFeatures.iterator(); i.hasNext(); ) {
>       TemplateWithChildren twc = (TemplateWithChildren) i.next();
>       Feature f = seq.createFeature(twc.template);
>       if (twc.children != null) {
>         makeChildFeatures(f, twc.children);
>       }
>     }
>   } catch (Exception ex) {
>     throw new BioError(ex, "Couldn't create feature");
>   }
>   return seq;
> }
>
> ==================================
> java Exceptions
> ==================================
> java.lang.reflect.InvocationTargetException:
> org.biojava.bio.symbol.IllegalAlphabetException: Can not create a stranded
> feature within a sequence of type PROTEIN
>   at org.biojava.bio.seq.impl.SimpleStrandedFeature.<init>(SimpleStrandedFeature.java:76)
>   at java.lang.reflect.Constructor.newInstance(Native Method)
>   at org.biojava.bio.seq.SimpleFeatureRealizer$TemplateImpl.realize(SimpleFeatureRealizer.java:136)
> rethrown as org.biojava.bio.BioException: Couldn't realize feature
>   at org.biojava.bio.seq.SimpleFeatureRealizer$TemplateImpl.realize(SimpleFeatureRealizer.java:138)
>   at org.biojava.bio.seq.SimpleFeatureRealizer.realizeFeature(SimpleFeatureRealizer.java:92)
>   at org.biojava.bio.seq.impl.SimpleSequence.realizeFeature(SimpleSequence.java:176)
>   at org.biojava.bio.seq.impl.SimpleSequence.createFeature(SimpleSequence.java:182)
>   at org.biojava.bio.seq.io.SimpleSequenceBuilder.makeSequence(SimpleSequenceBuilder.java:154)
> rethrown as org.biojava.bio.BioError: Couldn't create feature
>   at org.biojava.bio.seq.io.SimpleSequenceBuilder.makeSequence(SimpleSequenceBuilder.java:160)
>   at org.biojava.bio.seq.io.SequenceBuilderFilter.makeSequence(SequenceBuilderFilter.java:98)
>   at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:100)
>
> Thanks.
>
> Bruce Ling, Ph.D.
> Director, Bioinformatics
> Tularik, Inc -- http://www.tularik.com
> Email: bruce@tularik.com
> Phone: 650-825-7143
> fax: 1-435-804-4009
>
>
>
> _______________________________________________
> Biojava-l mailing list - Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l
>