[Biojava-l] Getting a part of a sequence

Gabrielle Doan gabrielle_doan at gmx.net
Tue Oct 7 14:26:44 UTC 2008


Hi all,
I have a BioSQL database which contains all human chromosomes. My 
intention is to get the information about a particular gene. How can I 
get a part of a particular chromosome with all associated features? At 
the moment I use the following code to create my new sequence:

<code>
RichSequence subSeq = RichSequence.Tools.subSequence(parent,
	position[0], position[1], ns, geneName, parent.getAccession(),
	parent.getIdentifier(), parent.getVersion() + 1,
	(Double) (parent.getVersion() + 1.0));
<\code>
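
To make the goal concrete, the desired result can be sketched in plain Java, without any BioJava or Hibernate types. This is only an illustration under assumptions: coordinates are taken to be 1-based and inclusive, features that overlap the range are clipped to it and renumbered from 1, and the names `SubRangeSketch`, `Feature`, `subSequence` and `subFeatures` are hypothetical. It does not show how BioJava's `RichSequence.Tools.subSequence` works internally.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; SubRangeSketch and Feature are hypothetical
// names and are not part of BioJava or BioSQL.
class SubRangeSketch {

    // A feature with 1-based, inclusive coordinates on the parent sequence.
    static final class Feature {
        final String name;
        final int start;
        final int end;

        Feature(String name, int start, int end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }
    }

    // Returns the residues in [start, end] (1-based, inclusive).
    static String subSequence(String parent, int start, int end) {
        return parent.substring(start - 1, end);
    }

    // Keeps features that overlap [start, end], clips them to the range and
    // shifts their coordinates so that position 1 is the range start.
    static List<Feature> subFeatures(List<Feature> features, int start, int end) {
        List<Feature> result = new ArrayList<>();
        for (Feature f : features) {
            if (f.end < start || f.start > end) {
                continue; // no overlap with the requested range
            }
            int clippedStart = Math.max(f.start, start) - start + 1;
            int clippedEnd = Math.min(f.end, end) - start + 1;
            result.add(new Feature(f.name, clippedStart, clippedEnd));
        }
        return result;
    }

    public static void main(String[] args) {
        String parent = "ACGTACGTAC";
        List<Feature> features = new ArrayList<>();
        features.add(new Feature("geneA", 2, 5));
        features.add(new Feature("geneB", 8, 10));

        // Extract positions 4..7 together with the features that overlap them.
        System.out.println(subSequence(parent, 4, 7));
        for (Feature f : subFeatures(features, 4, 7)) {
            System.out.println(f.name + " " + f.start + ".." + f.end);
        }
    }
}
```

The point of the sketch is that only the requested range and the features touching it are needed, so ideally nothing outside that range would have to be loaded from the database.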

Here is how I get the parent sequence:
<code>
	public static RichSequence getChromosome(String chrNo) {
		Transaction tx = session.beginTransaction();
		RichSequence ret = null;

		String query;

		try {
			if (chrNo.equals("MT")) {
				query = "from BioEntry as be where be.description like '%:num%'";
				query = query.replaceAll(":num", "mitochondrion");
			} else {
				query = "from BioEntry as be where be.description like '%hromosome :num%'";
				query = query.replaceAll(":num", chrNo);
			}

			Query q = session.createQuery(query);

			ret = (RichSequence) q.list().get(0);
			tx.commit();
		} catch (Exception e) {
			tx.rollback();
			e.printStackTrace();
		}
		return ret;
	}
<\code>

I always have to load the whole chromosome to get a part of it, so it 
takes a very long time and I load a lot of unused information (a waste of 
memory). I also tried to use <code>ThinRichSequence<\code> instead of 
<code>RichSequence<\code>, but I didn't notice any difference.
Can you give me a hint on how to speed up the code?
I am grateful for any hints.

cheers,
Gabrielle


