[Biojava-l] locating genes in a genomic sequence

Schreiber, Mark mark.schreiber@agresearch.co.nz
Wed, 8 Jan 2003 09:28:37 +1300


Hi -

If you already have the SimpleGene features constructed, these will
contain a Location object. However, I think you are asking how to find
a subsequence in your genomic sequence and locate the gene that way.
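
If the gene locations are already known, the intergenic regions are the
gaps between them. A minimal sketch in plain Java, assuming each gene has
been reduced to a {start, end} pair in 1-based inclusive coordinates; the
class and method names are hypothetical and nothing here is BioJava API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IntergenicSketch {

    // genes: {start, end} pairs, 1-based and inclusive (assumed convention).
    // seqLength: total length of the genomic sequence.
    // Returns the regions not covered by any gene, in ascending order.
    public static List<int[]> gaps(int[][] genes, int seqLength) {
        int[][] sorted = genes.clone();
        Arrays.sort(sorted, (a, b) -> Integer.compare(a[0], b[0]));

        List<int[]> out = new ArrayList<>();
        int covered = 0; // highest position covered by a gene so far
        for (int[] g : sorted) {
            if (g[0] > covered + 1) {
                out.add(new int[] {covered + 1, g[0] - 1});
            }
            covered = Math.max(covered, g[1]); // handles overlapping genes
        }
        if (covered < seqLength) {
            out.add(new int[] {covered + 1, seqLength});
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] genes = {{10, 20}, {15, 30}, {50, 60}};
        for (int[] gap : gaps(genes, 100)) {
            System.out.println(gap[0] + ".." + gap[1]);
        }
    }
}
```

Sorting by start and tracking the highest covered position means that
overlapping or nested genes do not produce spurious gaps.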

To rapidly find exact matches you can use the biojava
KnuthMorrisPrattSearch object from the org.biojava.bio.search package.
It contains a main method that demonstrates its use. This is a very
efficient algorithm for finding exact matches.
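
For illustration, here is a minimal sketch of the Knuth-Morris-Pratt idea
in plain Java, working on char[] arrays. It is not the BioJava
KnuthMorrisPrattSearch class; the class and method names below are
hypothetical, and it reports 0-based offsets, whereas BioJava locations
are 1-based:

```java
import java.util.ArrayList;
import java.util.List;

public class KmpSketch {

    // fail[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i].
    public static int[] failureTable(char[] pattern) {
        int[] fail = new int[pattern.length];
        int k = 0;
        for (int i = 1; i < pattern.length; i++) {
            while (k > 0 && pattern[i] != pattern[k]) {
                k = fail[k - 1];
            }
            if (pattern[i] == pattern[k]) {
                k++;
            }
            fail[i] = k;
        }
        return fail;
    }

    // Returns the 0-based start offset of every exact match of pattern
    // in text, including overlapping matches.
    public static List<Integer> findAll(char[] text, char[] pattern) {
        List<Integer> hits = new ArrayList<>();
        if (pattern.length == 0 || pattern.length > text.length) {
            return hits;
        }
        int[] fail = failureTable(pattern);
        int k = 0; // number of pattern characters currently matched
        for (int i = 0; i < text.length; i++) {
            while (k > 0 && text[i] != pattern[k]) {
                k = fail[k - 1]; // fall back without re-reading the text
            }
            if (text[i] == pattern[k]) {
                k++;
            }
            if (k == pattern.length) {
                hits.add(i - k + 1);
                k = fail[k - 1]; // keep going to find further matches
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        char[] genome = "ACGTACGTTACG".toCharArray();
        char[] gene = "ACG".toCharArray();
        System.out.println(findAll(genome, gene)); // prints [0, 4, 9]
    }
}
```

The text is scanned once from left to right and never re-read, so the
search takes time proportional to the genome length plus the pattern
length.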

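Note: a Java String can hold up to about 2^31 - 1 characters, so a genome
larger than 64kb can still be dumped as a String (the 64kb limit applies
only to String constants compiled into a class file). For a very large
genome, memory is the real constraint. You could also dump it as a
char[].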

- Mark


> -----Original Message-----
> From: Karin Lagesen [mailto:karin.lagesen@labmed.uio.no] 
> Sent: Tuesday, 7 January 2003 10:53 p.m.
> To: biojava-l@biojava.org
> Subject: [Biojava-l] locating genes in a genomic sequence
> 
> 
> Hi!
> 
> I am trying to build a small program for finding intergenic 
> areas. This I am planning to do by locating all mRNAs in a 
> genome and outputting all the areas in between. Biojava seems 
> to be able to help with most of my tasks. However, I have a 
> few questions. From what I have understood I can have the 
> genomic sequence as a SimpleSequence with all of the genes as 
> SimpleGenes attached via a FeatureHolder to the genomic 
> sequence. However, I have not figured out a smart way of 
> finding the location of each of the genes in the genomic 
> sequence. Since genomic sequences can be large, I hoped to 
> avoid having to dump the sequence in a string and use 
> indexOf(gene sequence) to find the position. Is there 
> something I am missing here, or have I just misunderstood all of this?
> 
> Thank you in advance for your help. 
> 
> Karin
> -- 
> Karin Lagesen, PhD student
> karin.lagesen@labmed.uio.no 