[Biojava-l] Parsing circular sequences
Keith James
kdj@sanger.ac.uk
11 Nov 2002 18:40:22 +0000
>>>>> "Greg" == Cox, Greg <gcox@cle.lionbioscience.com> writes:
Greg> I'm taking a look at a circular GenBank sequence of length n
Greg> with a location n^1 on it. I think the size of the sequence,
Greg> and whether it is circular, have to be passed down to
Greg> EmblLikeLocationParser, which will check each location and
Greg> convert x..y on the sequence to a CircularLocation around
Greg> RangedLocation x..y+n.
Greg> This approach involves changing a lot of method
Greg> signatures, since there are a lot of layers between the
Greg> formatter and the location parser. That's something I'd
Greg> like to avoid, but I've convinced myself it has to be done
Greg> this way since, at the very least, the location parser needs the
Greg> length of the circular sequence. I'd like a sanity check if
Greg> someone has a better idea of how to work through this.
I'm wondering whether, for circular sequences, you could create some
sort of proxy location for features that include the origin, to be
resolved later?
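A minimal sketch of what such a proxy might look like, assuming the
parser simply records the raw coordinates and something further up the
stack calls a resolve step once the length is known. The class
DeferredLocation and its methods are hypothetical and are not part of
BioJava; the result is returned as a plain {min, max} pair rather than
a real CircularLocation.

```java
/** Hypothetical placeholder for a location parsed before the sequence length is known. */
final class DeferredLocation {
    final int start; // 1-based coordinate as written in the feature table
    final int end;   // 1-based, inclusive; end < start means the feature spans the origin

    DeferredLocation(int start, int end) {
        this.start = start;
        this.end = end;
    }

    /** True if the feature runs through the origin, e.g. 90..10 or n^1. */
    boolean wrapsOrigin() {
        return end < start;
    }

    /**
     * Resolve once the sequence length n is known. A wrapping x..y is
     * unrolled to x..y+n, as in Greg's description; anything else is
     * returned unchanged. The result is {min, max}.
     */
    int[] resolve(int seqLength) {
        if (seqLength < 1 || start < 1 || end < 1
                || start > seqLength || end > seqLength) {
            throw new IllegalArgumentException(
                "Coordinates " + start + ", " + end
                + " do not fit a sequence of length " + seqLength);
        }
        return wrapsOrigin()
            ? new int[] {start, end + seqLength}
            : new int[] {start, end};
    }
}
```

With this sketch, the parser needs no extra arguments: it builds
DeferredLocation objects, and whoever knows the length calls resolve()
afterwards. For a sequence of length 100, the location 100^1 would be
held as (100, 1) and resolve to 100..101.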
Given that in a GenBank stream the actual sequence comes last, I assume
you're trusting the length in the LOCUS line? I haven't checked how
reliable this is, but I wonder...
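If the LOCUS length is used, one option is to treat it as provisional
and compare it with the real sequence length once the sequence has been
read. The sketch below assumes the length appears as a number followed
by "bp" and that circular entries carry the word "circular" on the same
line. The class LocusLine and its methods are hypothetical, not BioJava
API, and the sample line in the usage note is made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that pulls the declared length and topology from a LOCUS line. */
final class LocusLine {
    // Match "<digits> bp" rather than relying on fixed column positions.
    private static final Pattern LENGTH = Pattern.compile("\\b(\\d+)\\s+bp\\b");
    private static final Pattern CIRCULAR = Pattern.compile("\\bcircular\\b");

    /** Length as declared in the LOCUS line; not yet checked against the sequence. */
    static int declaredLength(String locusLine) {
        Matcher m = LENGTH.matcher(locusLine);
        if (!m.find()) {
            throw new IllegalArgumentException("No length found in: " + locusLine);
        }
        return Integer.parseInt(m.group(1));
    }

    /** True if the line marks the entry as circular. */
    static boolean isCircular(String locusLine) {
        return CIRCULAR.matcher(locusLine).find();
    }

    /** Sanity check to run after the sequence itself has been read. */
    static boolean lengthMatches(String locusLine, int actualLength) {
        return declaredLength(locusLine) == actualLength;
    }
}
```

For an illustrative line such as
"LOCUS       EXAMPLE1     100 bp    DNA     circular BCT 01-JAN-2000",
declaredLength() gives 100 and isCircular() gives true; lengthMatches()
can then flag entries whose LOCUS line disagrees with the sequence.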
Keith
--
- Keith James <kdj@sanger.ac.uk> bioinformatics programming support -
- Pathogen Sequencing Unit, The Wellcome Trust Sanger Institute, UK -