[Biojava-l] Getting a part of a sequence

Richard Holland holland at eaglegenomics.com
Tue Oct 14 15:23:10 UTC 2008


Something's broken! At least from your stack trace I can see exactly what's
going on. The set of locations is being loaded for the feature, but
Hibernate is not calling the setMin()/setMax() methods in each location
before inserting them into the set.

When they get added to the set of locations for the feature, they therefore
get added with null for min and max. At any point when these locations are
used, for instance when they are merged by the feature location setter, or
anywhere else, you'll get NullPointerExceptions.
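
To make the failure mode concrete, here is a minimal standalone sketch of it. The Loc class is a hypothetical stand-in I made up for illustration, not the BioJava SimpleRichLocation, and the overlap test is only assumed to resemble what the merge step does. It shows that a location whose setters were never called blows up as soon as it is compared with another one.

```java
import java.util.ArrayList;
import java.util.List;

public class NullBoundsDemo {

    /** Hypothetical stand-in for a persisted location; NOT the BioJava class. */
    static final class Loc {
        private Integer min; // stays null until setMin() is called
        private Integer max; // stays null until setMax() is called

        void setMin(Integer min) { this.min = min; }
        void setMax(Integer max) { this.max = max; }

        // Unboxing a null Integer throws NullPointerException.
        int getMin() { return min; }
        int getMax() { return max; }

        boolean overlaps(Loc other) {
            return getMin() <= other.getMax() && other.getMin() <= getMax();
        }
    }

    /** One populated location plus one whose setters were never called. */
    static List<Loc> sample() {
        Loc populated = new Loc();
        populated.setMin(10);
        populated.setMax(20);
        Loc unpopulated = new Loc();

        List<Loc> locs = new ArrayList<Loc>();
        locs.add(populated);
        locs.add(unpopulated);
        return locs;
    }

    /** Returns true if a pairwise overlap check fails with an NPE. */
    static boolean mergeFails(List<Loc> locs) {
        try {
            for (int i = 0; i < locs.size(); i++) {
                for (int j = i + 1; j < locs.size(); j++) {
                    locs.get(i).overlaps(locs.get(j));
                }
            }
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("merge fails: " + mergeFails(sample()));
    }
}
```

A guard that rejects locations with unset bounds before they are added to the set would turn this into an earlier, clearer error, but it would not explain why the setters are skipped in the first place.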

This is despite the fact that the HBM XML files are explicitly telling it
_not_ to lazy-load them. Also this only happens when loading Features, and
not when loading Sequence objects.

I honestly don't know why this happens!

What I suggest is that you create a temporary database with only one record
in it, and run your test program against that to see what happens. If it
still breaks, raise a bug on Bugzilla and post the GenBank dump of the
database to Bugzilla along with your program code and the full stack trace.
Someone with a bit more Hibernate knowledge than me might then be able to
help out.

cheers,
Richard


2008/10/14 Gabrielle Doan <gabrielle_doan at gmx.net>

> Hi Richard,
> I have checked out the latest source and tried my code again. It still
> didn't work and I received the following new errors:
>
> <message>
> Exception in thread "main" java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
>        at org.biojavax.bio.db.biosql.BioSQLRichSequenceDB.processFeatureFilter(BioSQLRichSequenceDB.java:143)
>        at org.biojavax.bio.db.biosql.BioSQLRichSequenceDB.filter(BioSQLRichSequenceDB.java:151)
>        at org.sequence_viewer.db.HBioSQLDB.filterFeature(HBioSQLDB.java:612)
>        at org.sequence_viewer.db.AbfragenTest.main(AbfragenTest.java:56)
> Caused by: java.lang.reflect.InvocationTargetException
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.biojavax.bio.db.biosql.BioSQLRichSequenceDB.processFeatureFilter(BioSQLRichSequenceDB.java:138)
>        ... 3 more
> Caused by: org.hibernate.PropertyAccessException: Exception occurred inside setter of org.biojavax.bio.seq.SimpleRichFeature.locationSet
>        at org.hibernate.property.BasicPropertyAccessor$BasicSetter.set(BasicPropertyAccessor.java:65)
>        at org.hibernate.tuple.entity.AbstractEntityTuplizer.setPropertyValues(AbstractEntityTuplizer.java:337)
>        at org.hibernate.tuple.entity.PojoEntityTuplizer.setPropertyValues(PojoEntityTuplizer.java:200)
>        at org.hibernate.persister.entity.AbstractEntityPersister.setPropertyValues(AbstractEntityPersister.java:3571)
>        at org.hibernate.engine.TwoPhaseLoad.initializeEntity(TwoPhaseLoad.java:133)
>        at org.hibernate.loader.Loader.initializeEntitiesAndCollections(Loader.java:854)
>        at org.hibernate.loader.Loader.doQuery(Loader.java:729)
>        at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:236)
>        at org.hibernate.loader.Loader.doList(Loader.java:2213)
>        at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2104)
>        at org.hibernate.loader.Loader.list(Loader.java:2099)
>        at org.hibernate.loader.criteria.CriteriaLoader.list(CriteriaLoader.java:94)
>        at org.hibernate.impl.SessionImpl.list(SessionImpl.java:1569)
>        at org.hibernate.impl.CriteriaImpl.list(CriteriaImpl.java:283)
>        ... 8 more
> Caused by: java.lang.reflect.InvocationTargetException
>        at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.hibernate.property.BasicPropertyAccessor$BasicSetter.set(BasicPropertyAccessor.java:42)
>        ... 21 more
> Caused by: java.lang.NullPointerException
>        at org.biojavax.bio.seq.PositionResolver$AverageResolver.getMin(PositionResolver.java:103)
>        at org.biojavax.bio.seq.SimpleRichLocation.getMin(SimpleRichLocation.java:323)
>        at org.biojavax.bio.seq.SimpleRichLocation.overlaps(SimpleRichLocation.java:451)
>        at org.biojavax.bio.seq.SimpleRichLocation.union(SimpleRichLocation.java:469)
>        at org.biojavax.bio.seq.RichLocation$Tools.merge(RichLocation.java:363)
>        at org.biojavax.bio.seq.SimpleRichFeature.setLocationSet(SimpleRichFeature.java:181)
>        ... 25 more
> <\message>
>
> I think <code> BioSQLFeatureFilter.OverlapsRichLocation(rl) <\code> causes
> the problem I have. Can you help me to solve this problem?
>
> I'm grateful for any hints.
> cheers,
>
> Gabrielle
>
>
>
> Richard Holland schrieb:
>
>> This looks like a bug in BJX. I have just committed a fix that I think will
>> fix it to the head of Subversion. Can you check out the latest source,
>> compile it, and try your program again?
>>
>> cheers,
>> Richard
>>
>> 2008/10/9 Gabrielle Doan <gabrielle_doan at gmx.net>
>>
>>  Hi Richard,
>>>
>>> Thanks a lot for your mail. I have successfully retrieved the subsequence
>>> of a sequence as a String. Now I am trying to get the features for a
>>> particular range with the following code:
>>>
>>> <code>
>>>       public FeatureHolder filterFeature(String name, int startpos, int endpos) {
>>>               // Location covering the requested range.
>>>               RichLocation rl = new SimpleRichLocation(
>>>                               new SimplePosition(startpos),
>>>                               new SimplePosition(endpos), 0);
>>>               // Restrict to one sequence, then to features overlapping the range.
>>>               BioSQLFeatureFilter filter = new BioSQLFeatureFilter.And(
>>>                               new BioSQLFeatureFilter.BySequenceName(name),
>>>                               new BioSQLFeatureFilter.OverlapsRichLocation(rl));
>>>               return filter(filter);
>>>       }
>>> <\code>
>>>
>>> Unfortunately I received these errors:
>>> <message>
>>> Exception in thread "main" java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
>>>       at org.biojavax.bio.db.biosql.BioSQLRichSequenceDB.processFeatureFilter(BioSQLRichSequenceDB.java:143)
>>>       at org.biojavax.bio.db.biosql.BioSQLRichSequenceDB.filter(BioSQLRichSequenceDB.java:151)
>>>       at org.sequence_viewer.db.HBioSQLDB.filterFeature(HBioSQLDB.java:599)
>>>       at org.sequence_viewer.db.AbfragenTest.main(AbfragenTest.java:56)
>>> Caused by: java.lang.reflect.InvocationTargetException
>>>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>       at java.lang.reflect.Method.invoke(Method.java:597)
>>>       at org.biojavax.bio.db.biosql.BioSQLRichSequenceDB.processFeatureFilter(BioSQLRichSequenceDB.java:138)
>>>       ... 3 more
>>> Caused by: org.hibernate.PropertyAccessException: Exception occurred inside setter of org.biojavax.bio.seq.SimpleRichFeature.locationSet
>>>       at org.hibernate.property.BasicPropertyAccessor$BasicSetter.set(BasicPropertyAccessor.java:65)
>>>       at org.hibernate.tuple.entity.AbstractEntityTuplizer.setPropertyValues(AbstractEntityTuplizer.java:337)
>>>       at org.hibernate.tuple.entity.PojoEntityTuplizer.setPropertyValues(PojoEntityTuplizer.java:200)
>>>       at org.hibernate.persister.entity.AbstractEntityPersister.setPropertyValues(AbstractEntityPersister.java:3571)
>>>       at org.hibernate.engine.TwoPhaseLoad.initializeEntity(TwoPhaseLoad.java:133)
>>>       at org.hibernate.loader.Loader.initializeEntitiesAndCollections(Loader.java:854)
>>>       at org.hibernate.loader.Loader.doQuery(Loader.java:729)
>>>       at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:236)
>>>       at org.hibernate.loader.Loader.doList(Loader.java:2213)
>>>       at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2104)
>>>       at org.hibernate.loader.Loader.list(Loader.java:2099)
>>>       at org.hibernate.loader.criteria.CriteriaLoader.list(CriteriaLoader.java:94)
>>>       at org.hibernate.impl.SessionImpl.list(SessionImpl.java:1569)
>>>       at org.hibernate.impl.CriteriaImpl.list(CriteriaImpl.java:283)
>>>       ... 8 more
>>> Caused by: java.lang.reflect.InvocationTargetException
>>>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>       at java.lang.reflect.Method.invoke(Method.java:597)
>>>       at org.hibernate.property.BasicPropertyAccessor$BasicSetter.set(BasicPropertyAccessor.java:42)
>>>       ... 21 more
>>> Caused by: java.lang.NullPointerException
>>>       at org.biojavax.bio.seq.PositionResolver$AverageResolver.getMin(PositionResolver.java:103)
>>>       at org.biojavax.bio.seq.SimpleRichLocation.getMin(SimpleRichLocation.java:323)
>>>       at org.biojavax.bio.seq.SimpleRichLocation.overlaps(SimpleRichLocation.java:451)
>>>       at org.biojavax.bio.seq.SimpleRichLocation.union(SimpleRichLocation.java:469)
>>>       at org.biojavax.bio.seq.RichLocation$Tools.merge(RichLocation.java:363)
>>>       at org.biojavax.bio.seq.SimpleRichFeature.setLocationSet(SimpleRichFeature.java:181)
>>>       ... 26 more
>>> <\message>
>>>
>>> Why do I get these errors?
>>> BioSQLFeatureFilter.BySequenceName(name) needs a seqName as parameter. How
>>> can I find out the sequence name? Is it the value "name" in the table
>>> "Bioentry"? As the built-in subSequence method takes a long time, I intend
>>> to get the subsequence as a String myself and add the features to it. What
>>> do you think about this?
>>>
>>> I'm grateful for any hints.
>>> cheers,
>>>
>>> Gabrielle
>>>
>>>
>>>
>>> Richard Holland schrieb:
>>>
>>>  Hello.
>>>
>>>> Your code is pretty good already - but you're right, it will load the
>>>> whole chromosome into memory before you can chop out the interesting
>>>> bit you actually need.
>>>>
>>>> As you observed, by using ThinRichSequence in your query it will load
>>>> only the initial shell of a sequence object to start with, but the
>>>> moment you try and sub-sequence it, it will immediately load the whole
>>>> sequence data into memory in order to perform the operation.
>>>>
>>>> If you only want the sequence data, as a string, you can do this by
>>>> specifying the sequence attribute in the query and bypassing the
>>>> sequence object entirely:
>>>>
>>>>  select rs.stringSequence from Sequence as rs where rs.description
>>>> like '%hromosome :num%'
>>>>
>>>> This will return a String instead of a RichSequence object. You can
>>>> use HQL operators to perform substrings etc. on the string inside the
>>>> query itself - see
>>>> http://docs.huihoo.com/hibernate/hibernate-reference-3.2.1/queryhql.html
>>>> , particularly section 14.9.
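
As a rough sketch of the substring idea: the class below only builds the HQL text and checks the coordinate arithmetic in plain Java, so it runs without Hibernate. The entity and property names come from the query above. The parameter names (start, len, desc), the helper methods, and the assumption that HQL substring() takes a 1-based start plus a length (as in SQL) are mine, so please verify them against your Hibernate version and database dialect.

```java
public class SliceQuery {

    /** HQL text; entity and property names are taken from the query above. */
    static final String HQL =
        "select substring(rs.stringSequence, :start, :len) "
      + "from Sequence as rs where rs.description like :desc";

    /** Length argument for a 1-based, inclusive [start, end] range. */
    static int sliceLength(int start, int end) {
        if (start < 1 || end < start) {
            throw new IllegalArgumentException("need 1 <= start <= end");
        }
        return end - start + 1;
    }

    /** The same slice done in plain Java, to check the coordinate arithmetic. */
    static String sliceInMemory(String seq, int start, int end) {
        int from = start - 1; // Java strings are 0-based
        return seq.substring(from, from + sliceLength(start, end));
    }

    public static void main(String[] args) {
        System.out.println(HQL);
        System.out.println(sliceInMemory("ACGTACGT", 3, 5)); // prints GTA
    }
}
```

The string would then go to session.createQuery(...) with the three parameters bound, so that only the requested slice travels back from the database.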
>>>>
>>>> If you only want the features, you can do this by using the
>>>> BioSQLFeatureFilter technique. In particular you will want the
>>>> BySequenceName filter, the And filter, and the OverlapsRichLocation
>>>> filter. You construct a filter then pass it to the filter() method in
>>>> BioSQLRichSequenceDB. The database will return to you all the
>>>> RichFeature objects that match your criteria. Note that it searches
>>>> the whole database so you really must use a BySequenceName filter at
>>>> the very least in order to make the results useful!
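
To show what that combination of filters selects, here is a small in-memory sketch. It is not the BioSQLFeatureFilter API, which turns the filters into a database query; the Feat class, the helper names and the sample data are all hypothetical and exist only to illustrate the logic of "same sequence name AND overlapping range".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class FeatureFilterSketch {

    /** Hypothetical, simplified feature record; not a BioJava type. */
    static final class Feat {
        final String seqName;
        final int min;
        final int max;

        Feat(String seqName, int min, int max) {
            this.seqName = seqName;
            this.min = min;
            this.max = max;
        }
    }

    /** Keeps features that belong to the named sequence. */
    static Predicate<Feat> bySequenceName(String name) {
        return f -> f.seqName.equals(name);
    }

    /** Keeps features whose [min, max] range intersects the given range. */
    static Predicate<Feat> overlaps(int min, int max) {
        return f -> f.min <= max && min <= f.max;
    }

    static List<Feat> filter(List<Feat> all, Predicate<Feat> p) {
        List<Feat> out = new ArrayList<Feat>();
        for (Feat f : all) {
            if (p.test(f)) {
                out.add(f);
            }
        }
        return out;
    }

    /** Made-up sample data. */
    static List<Feat> sample() {
        return Arrays.asList(
            new Feat("chr1", 50, 120),   // overlaps 100..200
            new Feat("chr1", 150, 160),  // inside 100..200
            new Feat("chr1", 300, 400),  // outside 100..200
            new Feat("chr2", 100, 200)); // right range, wrong sequence
    }

    public static void main(String[] args) {
        Predicate<Feat> both = bySequenceName("chr1").and(overlaps(100, 200));
        System.out.println(filter(sample(), both).size()); // prints 2
    }
}
```

Dropping the name predicate lets the chr2 feature through as well, which is the "searches the whole database" effect described above.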
>>>>
>>>> However, you can't use HQL to construct a complete slice of a sequence
>>>> directly in the database before returning it to the program for use as
>>>> a ready-made RichSequence object. This would require Hibernate to know
>>>> what a BioJava sub-sequence object is and how it behaves in relation
>>>> to an 'unsliced' one, which is beyond the scope of its job as a
>>>> persistence framework.
>>>>
>>>> cheers,
>>>> Richard
>>>>
>>>>
>>>>
>>>> 2008/10/7 Gabrielle Doan <gabrielle_doan at gmx.net>:
>>>>
>>>>  Hi all,
>>>>> I have a BioSQL database which contains all human chromosomes. My
>>>>> intention is to get the information about a particular gene. How can I
>>>>> get a part of a particular chromosome with all associated features? At
>>>>> the moment I use the following code to create my new sequence:
>>>>>
>>>>> <code>
>>>>> RichSequence subSeq = RichSequence.Tools.subSequence(parent,
>>>>>      position[0], position[1], ns, geneName, parent.getAccession(),
>>>>>      parent.getIdentifier(), parent.getVersion() + 1,
>>>>>      (Double) (parent.getVersion() + 1.0));
>>>>> <\code>
>>>>>
>>>>> Here is how I get the parent sequence:
>>>>> <code>
>>>>>      public static RichSequence getChromosome(String chrNo) {
>>>>>              Transaction tx = session.beginTransaction();
>>>>>              RichSequence ret = null;
>>>>>
>>>>>              String query;
>>>>>
>>>>>              try {
>>>>>                      // Build the HQL by substituting the :num placeholder.
>>>>>                      if (chrNo.equals("MT")) {
>>>>>                              query = "from BioEntry as be where be.description like '%:num%'";
>>>>>                              query = query.replaceAll(":num", "mitochondrion");
>>>>>                      } else {
>>>>>                              query = "from BioEntry as be where be.description like '%hromosome :num%'";
>>>>>                              query = query.replaceAll(":num", chrNo);
>>>>>                      }
>>>>>
>>>>>                      Query q = session.createQuery(query);
>>>>>
>>>>>                      // Take the first matching entry.
>>>>>                      ret = (RichSequence) q.list().get(0);
>>>>>                      tx.commit();
>>>>>              } catch (Exception e) {
>>>>>                      tx.rollback();
>>>>>                      e.printStackTrace();
>>>>>              }
>>>>>              return ret;
>>>>>      }
>>>>> <\code>
>>>>>
>>>>> I always have to load the whole chromosome to get a part of it, so it
>>>>> takes a very long time and I get a lot of unused information (a waste of
>>>>> memory). I also tried to use <code>ThinRichSequence<\code> instead of
>>>>> <code>RichSequence<\code>, but I didn't notice any difference.
>>>>> Can you give me a hint how to accelerate the code?
>>>>> I am grateful for any hints.
>>>>>
>>>>> cheers,
>>>>> Gabrielle


-- 
Richard Holland, BSc MBCS
Finance Director, Eagle Genomics Ltd
M: +44 7500 438846 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/


