[Biojava-l] error using the proteomic package
Robert Stones
r.stones at csl.gov.uk
Wed Jun 25 16:40:58 EDT 2003
I keep getting the error below from the proteomics package when performing
simulated digests on proteins, and masses are not being generated for some
peptides.
Has anybody got example code for setting proteases and working out
masses?
org.biojava.bio.symbol.IllegalSymbolException: No mass Set for Symbol
[GLU SER GLY PHE LEU PRO ALA VAL ASN SEC HIS MET THR TYR ARG ILE LYS GLN
TRP CYS ASP]
    at org.biojava.bio.proteomics.MassCalc.getVMasses(MassCalc.java:365)
My code:

try
{
    seq = si.nextSequence();
    Annotation anno = seq.getAnnotation();
    it = seq.features();
}//try
catch (BioException bex)
{
    System.out.println(bex);
}
while ( it.hasNext() )
{
    Feature f = (Feature) it.next();
    if (f.getType().equals("Peptide"))
    {
        try
        {
            MassCalc massCalc = new MassCalc(SymbolPropertyTable.MONO_MASS, true);
            double[] masses = massCalc.getVariableMasses(f.getSymbols());
            double mass = masses[0];
        }//try
        catch(Exception is)
        {
            is.printStackTrace();
        }
    }
}
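
For what it is worth, the bracketed list in the exception looks like the name BioJava gives to an ambiguity symbol (for example an X in the sequence), which would have no single mass. That reading is an assumption and is not confirmed by the trace. Below is a minimal, BioJava-free sketch of the idea: sum monoisotopic residue masses plus one water, and skip any peptide containing a residue without a defined mass. The names PeptideMassSketch and monoMass are hypothetical, and the table holds standard monoisotopic residue masses rather than values read from SymbolPropertyTable.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalDouble;

public class PeptideMassSketch {
    // Standard monoisotopic residue masses in daltons, keyed by one-letter code.
    // Assumption: these are textbook values, not read from BioJava.
    private static final Map<Character, Double> MONO = new HashMap<>();
    // Mass of one water, added once for the free N- and C-termini.
    private static final double WATER = 18.01056;

    static {
        MONO.put('G', 57.02146);  MONO.put('A', 71.03711);
        MONO.put('S', 87.03203);  MONO.put('P', 97.05276);
        MONO.put('V', 99.06841);  MONO.put('T', 101.04768);
        MONO.put('C', 103.00919); MONO.put('L', 113.08406);
        MONO.put('I', 113.08406); MONO.put('N', 114.04293);
        MONO.put('D', 115.02694); MONO.put('Q', 128.05858);
        MONO.put('K', 128.09496); MONO.put('E', 129.04259);
        MONO.put('M', 131.04049); MONO.put('H', 137.05891);
        MONO.put('F', 147.06841); MONO.put('R', 156.10111);
        MONO.put('Y', 163.06333); MONO.put('W', 186.07931);
    }

    /**
     * Returns the monoisotopic mass of the neutral peptide, or an empty
     * result if any residue (such as the ambiguity code X) has no mass.
     */
    public static OptionalDouble monoMass(String peptide) {
        double sum = WATER;
        for (char c : peptide.toUpperCase().toCharArray()) {
            Double residue = MONO.get(c);
            if (residue == null) {
                return OptionalDouble.empty(); // no mass set for this symbol
            }
            sum += residue;
        }
        return OptionalDouble.of(sum);
    }

    public static void main(String[] args) {
        for (String peptide : new String[] {"GAK", "GXK"}) {
            OptionalDouble mass = monoMass(peptide);
            System.out.println(peptide + " -> "
                + (mass.isPresent() ? String.valueOf(mass.getAsDouble())
                                    : "skipped (residue without a mass)"));
        }
    }
}
```

Returning an empty result instead of throwing lets a digest loop carry on with the remaining peptides and report the skipped ones afterwards.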
biojava-l-request at biojava.org wrote:
>
> Today's Topics:
>
> 1. Re: Remove features from a sequence (Matthew Pocock)
> 2. TranslatedRegion (Matthew Muller)
> 3. RE: Remove features from a sequence (Schreiber, Mark)
> 4. Re: SAX parser demo (David Huen)
> 5. Re: SAX parser demo (Russell Smithies)
> 6. RE: SAX parser demo (Schreiber, Mark)
> 7. Re: SAX parser demo (David Huen)
> 8. Re: TranslatedRegion (Thomas Down)
> 9. Re: SAX parser demo (David Huen)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 24 Jun 2003 19:16:38 +0100
> From: Matthew Pocock <matthew_pocock at yahoo.co.uk>
> Subject: Re: [Biojava-l] Remove features from a sequence
> To: Thomas Down <thomas at derkholm.net>
> Cc: "Schreiber, Mark" <mark.schreiber at agresearch.co.nz>,
> biojava-l at biojava.org, Keith James <kdj at sanger.ac.uk>
> Message-ID: <3EF89586.5060900 at yahoo.co.uk>
> Content-Type: text/plain; charset=us-ascii; format=flowed
>
> Any way we slice this, we're going to end up implementing an iterator,
> right? The code is going to be messy - doing all the change notification
> and stuff - we can't call the normal remove features method, as that
> will barf the iterator, which leaves us with writing the notification
> code again inside the iterator remove(). Code duplication - twice the
> code, twice the bugs?
>
> So, on balance, I guess we should throw NotImplementedException on
> remove(). Unless somebody has a bright idea?
>
> Well, yet again, we've been reminded that all interfaces should throw
> exceptions for all methods that can fail. Write, read out loud, repeat.
>
> Matthew
>
> Thomas Down wrote:
> > Once upon a time, Keith James wrote:
> >
> >>It may also be worth noting that you /can/ do this:
> >>
> >>while (seqI.hasNext())
> >>{
> >> Sequence seq = seqI.nextSequence();
> >>
> >> for (Iterator i = seq.features(); i.hasNext();)
> >> {
> >> i.next();
> >> i.remove();
> >> }
> >>}
> >>
> >>which is a way to avoid ConcurrentModificationException, but also a
> >>way to avoid informing any listeners to Sequence that all its Features
> >>have been stripped - and is likely to be bad.
> >
> >
> > Ugh, that's actually pretty nasty, to the extent that I would
> > call it a bug. My initial thought was to fix it by firing the
> > appropriate events, but this doesn't work since the Iterator.remove()
> > method can't throw a ChangeVetoException. So the options are:
> >
> > - Hack it (throw something like IllegalStateException)
> >
> > - Add ChangeVetoRuntimeException (seems like a Bad Thing to
> > me since we've always said that change vetoes are checked
> > exceptions up until now).
> >
> > - Forbid Iterator.remove(). Easiest to code, but seems like
> > a shame.
> >
> > None of these options seems terribly appetizing to me, but I
> > think we should do something.
> >
> > Any preferences?
> >
> > Thomas.
> > _______________________________________________
> > Biojava-l mailing list - Biojava-l at biojava.org
> > http://biojava.org/mailman/listinfo/biojava-l
> >
>
> --
> BioJava Consulting LTD - Support and training for BioJava
> http://www.biojava.co.uk
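
The "forbid Iterator.remove()" option discussed above can be sketched without BioJava as a wrapper that delegates hasNext() and next() and rejects remove(). ReadOnlyIterator is a hypothetical name, and this only illustrates the idea; it is not the BioJava implementation. It throws the JDK's UnsupportedOperationException, which is the exception the Iterator contract documents for an unsupported remove(), rather than the NotImplementedException mentioned in the message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ReadOnlyIterator<T> implements Iterator<T> {
    private final Iterator<T> delegate;

    public ReadOnlyIterator(Iterator<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean hasNext() {
        return delegate.hasNext();
    }

    @Override
    public T next() {
        return delegate.next();
    }

    // Removal is refused so that the owning object stays the only place
    // where features are removed and listeners are notified.
    @Override
    public void remove() {
        throw new UnsupportedOperationException(
            "remove through the owning object so listeners are notified");
    }

    public static void main(String[] args) {
        List<String> features = new ArrayList<>(Arrays.asList("CDS", "exon"));
        Iterator<String> it = new ReadOnlyIterator<>(features.iterator());
        it.next();
        try {
            it.remove();
        } catch (UnsupportedOperationException e) {
            System.out.println("remove() rejected: " + e.getMessage());
        }
        System.out.println("features left: " + features.size());
    }
}
```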
>
> ------------------------------
>
> Message: 2
> Date: Tue, 24 Jun 2003 13:24:34 -0700
> From: "Matthew Muller" <mmuller at nuvelo.com>
> Subject: [Biojava-l] TranslatedRegion
> To: <biojava-l at biojava.org>
> Message-ID:
> <FF879B918856DE4FBC578B2AC5B9C7ECA674F5 at EVS01.almanor.sbh.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> I was using org.biojava.bio.seq.genomic.TranslatedRegion to bind a protein sequence to its mRNA sequence. I see that in BioJava 1.3, the whole package has been dropped.
>
> I want an object model that links mRNA, protein translations, and Features of both the mRNA and Protein.
>
> I'm guessing that the capability is still there but the classes are too abstract for me to understand. I suspect I need to use a FramedFeature. FramedFeature implements the concept of a Translated Region but has no place for the Protein Sequence itself.
>
> Thanks!
>
> Matthew Muller
> Nuvelo
> mullermw at nuvelo.com
> tel: 408/215-4503
> fax: 408/524-8129
>
> ------------------------------
>
> Message: 3
> Date: Wed, 25 Jun 2003 09:48:50 +1200
> From: "Schreiber, Mark" <mark.schreiber at agresearch.co.nz>
> Subject: RE: [Biojava-l] Remove features from a sequence
> To: "Matthew Pocock" <matthew_pocock at yahoo.co.uk>
> Cc: biojava-l at biojava.org
> Message-ID:
> <AF026AF0FF4B054590228FD1F1DE5165016E79EF at inbox.agresearch.co.nz>
> Content-Type: text/plain; charset="us-ascii"
>
> > -----Original Message-----
> > From: Matthew Pocock [mailto:matthew_pocock at yahoo.co.uk]
> > Sent: Wednesday, 25 June 2003 6:17 a.m.
> > To: Thomas Down
> > Cc: Keith James; biojava-l at biojava.org; Schreiber, Mark
> > Subject: Re: [Biojava-l] Remove features from a sequence
> >
> >
> > Any way we slice this, we're going to end up implementing an
> > iterator,
> > right? The code is going to be messy - doing all the change
> > notification
> > and stuff - we can't call the normal remove features method, as that
> > will barf the iterator, which leaves us with writing the notification
> > code again inside the iterator remove(). Code duplication - twice the
> > code, twice the bugs?
> >
> > So, on balance, I guess we should throw NotImplementedException on
> > remove(). Unless somebody has a bright idea?
>
> I think this is the best way to do it. You could even get the method to
> throw a concurrent modification exception though that may be confusing.
> We should also document it somewhere with a big "don't do this!!!"
>
> >
> > Well, yet again, we've been reminded that all interfaces should throw
> > exceptions for all methods that can fail. Write, read out
> > loud, repeat.
>
> Amen and Amen
>
> - Mark
>
> =======================================================================
> Attention: The information contained in this message and/or attachments
> from AgResearch Limited is intended only for the persons or entities
> to which it is addressed and may contain confidential and/or privileged
> material. Any review, retransmission, dissemination or other use of, or
> taking of any action in reliance upon, this information by persons or
> entities other than the intended recipients is prohibited by AgResearch
> Limited. If you have received this message in error, please notify the
> sender immediately.
> =======================================================================
>
> ------------------------------
>
> Message: 4
> Date: Wed, 25 Jun 2003 03:28:28 +0100
> From: David Huen <david.huen at ntlworld.com>
> Subject: Re: [Biojava-l] SAX parser demo
> To: "Russell Smithies" <russell.smithies at xtra.co.nz>,
> "Biojava-L at Biojava. Org" <biojava-l at biojava.org>
> Message-ID: <200306250328.29585.david.huen at ntlworld.com>
> Content-Type: text/plain; charset="windows-1252"
>
> Hi,
> OK, I have uploaded a demo to CVS. It is at biojava-live/demos/blastxml.
> It's just a plain ripoff of Mark Schreiber's demo in Biojava In Anger
> ported to use the BlastXML parser. You will need to do a "cvs update -d"
> to create the new directories for the demos and for the DTD directory.
>
> I have added a facade to the BlastXML parsing framework. The facade is
> called BlastXMLParserFacade and is used identically to the way the existing
> BlastLikeSAXParser is used with blast text output. I think this will make
> it easier for users all round: that both have the same interface. You can
> look in that class to see how the BJ parsing framework is actually set up.
>
> I won't have more time to work on this for a bit, but bug reports are
> welcome for eventual fixes. As previously mentioned, running multiple
> sequence queries against a database with NCBI blast concatenates all the
> Blast XML outputs into one file that is not well-formed XML (it contains
> multiple <?xml ...?> and <!DOCTYPE ...> declarations, for example).
> Parsing those requires a hack I have previously described but it is ugly,
> ugly, ugly. The latest NCBI version may have fixed this problem, but I
> haven't looked.
>
> Best wishes,
> David Huen
> P.S. It is really really bedtime, guys.....
> P.P.S There is an ugly entity resolver hack I will need to clean up later
> too.
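
One possible way to cope with concatenated output is to cut the text at each XML declaration and parse the pieces separately. The sketch below is not the hack referred to above, which is not shown in this message. It is a plain-Java illustration that assumes every document starts with an "<?xml" declaration; ConcatenatedXmlSplitter and split are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class ConcatenatedXmlSplitter {
    // Assumption: each concatenated document begins with this declaration
    // and the marker does not occur anywhere else in the text.
    private static final String DECL = "<?xml";

    /** Splits the text into one string per XML document. */
    public static List<String> split(String text) {
        List<String> docs = new ArrayList<>();
        int start = text.indexOf(DECL);
        while (start >= 0) {
            // The next declaration, if any, marks the end of this document.
            int next = text.indexOf(DECL, start + DECL.length());
            String doc = (next >= 0) ? text.substring(start, next)
                                     : text.substring(start);
            docs.add(doc.trim());
            start = next;
        }
        return docs;
    }

    public static void main(String[] args) {
        String joined = "<?xml version=\"1.0\"?><BlastOutput>1</BlastOutput>\n"
                      + "<?xml version=\"1.0\"?><BlastOutput>2</BlastOutput>\n";
        for (String doc : split(joined)) {
            System.out.println(doc);
        }
    }
}
```

Each returned string can then be handed to an XML parser on its own.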
>
> ------------------------------
>
> Message: 5
> Date: Wed, 25 Jun 2003 16:46:05 +1200
> From: "Russell Smithies" <russell.smithies at xtra.co.nz>
> Subject: Re: [Biojava-l] SAX parser demo
> To: <smh1008 at cus.cam.ac.uk>, "Biojava-L at Biojava. Org"
> <biojava-l at biojava.org>
> Message-ID: <001901c33ad4$b08a3070$503c56d2 at lex>
> Content-Type: text/plain; charset="Windows-1252"
>
> Looks good, but it doesn't do what I need, and I don't think it was ever
> going to :-(
>
> The blast XML data has loads of info in it (I guess that's the reason for the
> format), but I want to be able to get at individual tags, not just hits. For
> example, some of the stats data (Statistics_entropy, Statistics_eff-space
> etc.) or other hit data (Hsp_align-len, Hsp_pattern-from etc.) might be
> useful, instead of just hitID and e-value.
> I guess I'll have to implement some new bits (from
> SimpleSeqSimilaritySearchSubHit?) but I'm not exactly sure where.
>
> any ideas?
>
> thanx
> Russell
>
> ------------------------------
>
> Message: 6
> Date: Wed, 25 Jun 2003 19:06:46 +1200
> From: "Schreiber, Mark" <mark.schreiber at agresearch.co.nz>
> Subject: RE: [Biojava-l] SAX parser demo
> To: "Russell Smithies" <russell.smithies at xtra.co.nz>,
> <smh1008 at cus.cam.ac.uk>, "Biojava-L at Biojava. Org"
> <biojava-l at biojava.org>
> Message-ID:
> <AF026AF0FF4B054590228FD1F1DE5165011BA572 at inbox.agresearch.co.nz>
> Content-Type: text/plain; charset="utf-8"
>
> Hi -
>
> Depends how much you want to bind it to biojava. If you don't need biojava objects just make a SAX parser to listen for the bits you want. If you do want to bind it to biojava objects I would suggest modifying the parser to put the info into an Annotation object.
>
> - Mark
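
The first suggestion, a plain SAX parser that listens only for the elements of interest, could be sketched with the JDK's own SAX classes as below. TagCollector and collect are hypothetical names, and the sample XML is a made-up fragment rather than real BLAST output; only the element name Hsp_align-len is taken from this thread.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class TagCollector extends DefaultHandler {
    private final String wanted;
    private final List<String> values = new ArrayList<>();
    private StringBuilder current; // non-null only while inside a wanted element

    public TagCollector(String wanted) {
        this.wanted = wanted;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (wanted.equals(qName)) {
            current = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text may arrive in several chunks, so it is accumulated here.
        if (current != null) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current != null && wanted.equals(qName)) {
            values.add(current.toString().trim());
            current = null;
        }
    }

    /** Parses the XML text and returns the text of every element named tag. */
    public static List<String> collect(String xml, String tag) throws Exception {
        TagCollector handler = new TagCollector(tag);
        SAXParserFactory.newInstance().newSAXParser().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.values;
    }

    public static void main(String[] args) throws Exception {
        // Made-up fragment for illustration only.
        String xml = "<Hit><Hsp><Hsp_align-len>42</Hsp_align-len></Hsp>"
                   + "<Hsp><Hsp_align-len>17</Hsp_align-len></Hsp></Hit>";
        System.out.println(collect(xml, "Hsp_align-len"));
    }
}
```

The same handler could be pointed at any other element name, such as the Statistics_ fields mentioned earlier, without touching BioJava objects.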
>
>
> ------------------------------
>
> Message: 7
> Date: Wed, 25 Jun 2003 08:35:00 +0100 (BST)
> From: David Huen <smh1008 at cus.cam.ac.uk>
> Subject: Re: [Biojava-l] SAX parser demo
> To: Russell Smithies <russell.smithies at xtra.co.nz>
> Cc: "Biojava-L at Biojava. Org" <biojava-l at biojava.org>
> Message-ID:
> <Pine.SOL.3.96.1030625081838.25392A-100000 at virgo.cus.cam.ac.uk>
> Content-Type: TEXT/PLAIN; charset=US-ASCII
>
> On Wed, 25 Jun 2003, Russell Smithies wrote:
>
> > Looks good but doesn't do what I need but I don't think it was ever going to
> > :-(
> >
> > The blast XML data has loads of info in it (I guess thats the reason for the
> > format) but I want to be able to get at individual tags, not just hits. For
> > example, some of the stats data (Statistics_entropy, Statistics_eff-space
> > etc.) or other hit data (Hsp_align-len, Hsp_pattern-from etc.) instead of
> > just hitID and e-value might be useful?
> > I guess I'll have to implement some new bits (from
> > SimpleSeqSimilaritySearchSubHit?) but not exactly sure where.
> >
> Ah, OK. I have picked up most but not all the fields.
>
> Hsp_align-len is picked up and placed in an alignmentSize attribute.
>
> The others are not but it should not be difficult to parse and stuff them
> into the SAX output stream. If a suitable fit with the
> BlastLikeDataSetCollection.dtd can be achieved it should be possible to
> map it over readily. If not, we will have to extend that appropriately
> without breakage. However, not all the data can be mapped
> to the SeqSimilarity stuff so you may have to place a listener to handle
> those yourself.
>
> I don't see Hsp_pattern-from in my XML output. Do you have an output
> file with it? This parser was written by reverse engineering the
> semantics from the output ;-). I seem to recall that the semantics of
> orientation was weird.
>
> Regards,
> David
>
> ------------------------------
>
> Message: 8
> Date: Wed, 25 Jun 2003 10:15:55 +0100
> From: Thomas Down <thomas at derkholm.net>
> Subject: Re: [Biojava-l] TranslatedRegion
> To: Matthew Muller <mmuller at nuvelo.com>
> Cc: biojava-l at biojava.org
> Message-ID: <20030625091555.GA14097 at firechild>
> Content-Type: text/plain; charset=us-ascii
>
> Once upon a time, Matthew Muller wrote:
> > I was using org.biojava.bio.seq.genomic.TranslatedRegion to bind a protein sequence to its mRNA sequence. I see that in BioJava 1.3, the whole package has been dropped.
> >
> > I want an object model that links mRNA, protein translations, and Features of both the mRNA and Protein.
> >
> > I'm guessing that the capability is still there but the classes are too abstract for me to understand. I suspect I need to use a FramedFeature. FramedFeature implements the concept of a Translated Region but has no place for the Protein Sequence itself.
>
> Ah, sorry about this. We got the impression that nobody was
> using this package, and it wasn't being well-maintained. See:
>
> http://biojava.org/pipermail/biojava-l/2002-November/003292.html
>
> (there were no replies).
>
> The `normal' thing to do if you want the protein sequence is
> something like:
>
> SymbolList prot = RNATools.translate(cdsFeature.getSymbols());
>
> But I assume you're talking about associating protein sequences
> which aren't always going to be 100% identical to the translation
> of the RNA sequence. In this case, the best thing to do might
> be to put the protein sequence as a property in the Annotation
> object of the feature.
>
> Or am I missing something here?
>
> Does anyone else miss the seq.genomic package? Should we be
> reinstating it?
>
> Thomas.
>
> ------------------------------
>
> Message: 9
> Date: Wed, 25 Jun 2003 11:03:18 +0100
> From: David Huen <smh1008 at cus.cam.ac.uk>
> Subject: Re: [Biojava-l] SAX parser demo
> To: "Russell Smithies" <russell.smithies at xtra.co.nz>,
> biojava-l at biojava.org
> Message-ID: <200306251103.18286.smh1008 at cus.cam.ac.uk>
> Content-Type: text/plain; charset="iso-8859-1"
>
> On Wednesday 25 Jun 2003 9:24 am, Russell Smithies wrote:
> > Hi David,
> > Heres the output from a small -m7 blast.
> > Ignore the low-scoring hits as I've been toying with a blast db made of
> > only binding sites :-)
>
> OK, I've had a look thru' but I don't know what Hsp_pattern-from and
> Hsp_pattern-to do, they seem to be always zero in your output ;-)
>
> As for the statistics at the end of the run, the current DTD has not got an
> element for them. The Summary element seems to have stuff for a HitSummary
> but not for a run summary. See dtd/BlastLikeDataSetCollection.dtd for
> details of what we have. We could define an additional element to pick up
> that stuff into but for that it'll be better to make a proposal to the
> mail-list and get some feedback on that.
>
> >
> > How do I create a listener to parse tags into the SAX output stream? I
> > think most of the useful ones are in the .dtd but can't find where to
> > implement them and haven't found the source file where the other
> > listeners are.
> >
>
> The blast xml parser stuff is in src/org/biojava/bio/program/sax/blastxml.
> All that stuff is written to the StAX API. Basically, each element handler
> delegates child elements to other handlers all of which call a listener
> written to an API for seq similarity searches. The
> BlastLikeDataSetCollection stuff is in src/org/biojava/bio/program/ssbind.
>
> Regards,
> David
>
> ------------------------------
>
> End of Biojava-l Digest, Vol 6, Issue 26
> ****************************************