[Biojava-l] need help for SimpleSequenceBuilder class
Bruce Ling
xling@tularik.com
Sun, 22 Jul 2001 08:16:37 -0700
Hi, Thomas,
The documentation lists you as the author of the SimpleSequenceBuilder class,
so I am asking for your help with the following problem.
I am using the BioJava GenbankFormat class with the following code:
{
    SequenceFormat gFormat = new GenbankFormat();
    SequenceBuilderFactory sbFact =
        new GenbankProcessor.Factory(SimpleSequenceBuilder.FACTORY);
    //Alphabet alpha = DNATools.getDNA();
    // the following line does not work for protein; need more work to
    // figure out the library
    Alphabet alpha = ProteinTools.getAlphabet();
    SymbolParser rParser = alpha.getParser("token");
    seqI =
        new StreamReader(gReader, gFormat, rParser, sbFact);
}
As the commented-out line shows, the code works fine with a DNA GenBank file
such as the sample in the demos. But when I switch the code above to the
PROTEIN alphabet and parse a protein record in GenBank format, such as:
http://www.ncbi.nlm.nih.gov/htbin-post/Entrez/query?uid=NP_005154&form=6&db=p&Dopt=g
it throws the exception shown at the end of this email.
I have traced the problem to the SimpleSequenceBuilder class, where it handles
TemplateWithChildren. It seems to assume by default that the input is a DNA
GenBank record, which is why it tries to create a stranded feature, something
a protein record does not have.
public Sequence makeSequence() {
    SymbolList symbols = slBuilder.makeSymbolList();
    Sequence seq = new SimpleSequence(symbols, uri, name, annotation);
    try {
        for (Iterator i = rootFeatures.iterator(); i.hasNext(); ) {
            TemplateWithChildren twc = (TemplateWithChildren) i.next();
            Feature f = seq.createFeature(twc.template);
            if (twc.children != null) {
                makeChildFeatures(f, twc.children);
            }
        }
    } catch (Exception ex) {
        throw new BioError(ex, "Couldn't create feature");
    }
    return seq;
}
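To make my guess concrete, here is a minimal, self-contained sketch of what I think is happening and of one possible guard. It does not use the real BioJava classes: Kind, Template, realize, adapt and build are all hypothetical stand-ins that I made up for illustration, and the real fix may well belong somewhere else in the library.

```java
import java.util.ArrayList;
import java.util.List;

public class StrandedFeatureSketch {
    // Hypothetical stand-in for the alphabet of the parsed sequence.
    enum Kind { DNA, PROTEIN }

    // Hypothetical stand-in for a feature template; 'stranded' marks
    // templates that carry strand information.
    static final class Template {
        final String type;
        final boolean stranded;
        Template(String type, boolean stranded) {
            this.type = type;
            this.stranded = stranded;
        }
    }

    // Mimics the reported failure: a stranded template cannot be
    // realized on a protein sequence.
    static String realize(Kind kind, Template t) {
        if (t.stranded && kind == Kind.PROTEIN) {
            throw new IllegalArgumentException(
                "Can not create a stranded feature within a sequence of type PROTEIN");
        }
        return (t.stranded ? "stranded:" : "plain:") + t.type;
    }

    // Possible guard (an assumption, not the library's behaviour):
    // drop the strand when the sequence is not DNA.
    static Template adapt(Kind kind, Template t) {
        return (t.stranded && kind != Kind.DNA) ? new Template(t.type, false) : t;
    }

    // Realizes every template after passing it through the guard.
    static List<String> build(Kind kind, List<Template> templates) {
        List<String> out = new ArrayList<>();
        for (Template t : templates) {
            out.add(realize(kind, adapt(kind, t)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Template> templates = new ArrayList<>();
        templates.add(new Template("source", true));
        templates.add(new Template("Protein", true));
        System.out.println(build(Kind.DNA, templates));
        System.out.println(build(Kind.PROTEIN, templates));
    }
}
```

Without the adapt step, the PROTEIN call fails in the same way as my real code; with it, the features are created as plain (unstranded) ones.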
==================================
java Exceptions
==================================
java.lang.reflect.InvocationTargetException: org.biojava.bio.symbol.IllegalAlphabetException: Can not create a stranded feature within a sequence of type PROTEIN
    at org.biojava.bio.seq.impl.SimpleStrandedFeature.<init>(SimpleStrandedFeature.java:76)
    at java.lang.reflect.Constructor.newInstance(Native Method)
    at org.biojava.bio.seq.SimpleFeatureRealizer$TemplateImpl.realize(SimpleFeatureRealizer.java:136)
rethrown as org.biojava.bio.BioException: Couldn't realize feature
    at org.biojava.bio.seq.SimpleFeatureRealizer$TemplateImpl.realize(SimpleFeatureRealizer.java:138)
    at org.biojava.bio.seq.SimpleFeatureRealizer.realizeFeature(SimpleFeatureRealizer.java:92)
    at org.biojava.bio.seq.impl.SimpleSequence.realizeFeature(SimpleSequence.java:176)
    at org.biojava.bio.seq.impl.SimpleSequence.createFeature(SimpleSequence.java:182)
    at org.biojava.bio.seq.io.SimpleSequenceBuilder.makeSequence(SimpleSequenceBuilder.java:154)
rethrown as org.biojava.bio.BioError: Couldn't create feature
    at org.biojava.bio.seq.io.SimpleSequenceBuilder.makeSequence(SimpleSequenceBuilder.java:160)
    at org.biojava.bio.seq.io.SequenceBuilderFilter.makeSequence(SequenceBuilderFilter.java:98)
    at org.biojava.bio.seq.io.StreamReader.nextSequence(StreamReader.java:100)
Thanks.
Bruce Ling, Ph.D.
Director, Bioinformatics
Tularik, Inc -- http://www.tularik.com
Email: bruce@tularik.com
Phone: 650-825-7143
fax: 1-435-804-4009