[Bioperl-l] Changes to Bio::SeqI broke Bio::Graphics
Matthew Pocock
matthew_pocock@yahoo.co.uk
Tue, 12 Nov 2002 21:01:50 +0000
Lincoln Stein wrote:
> This allows an
> entry to be a feature in a larger virtual sequence, such as a genome
> assembly. I don't see why we persist in thinking in this flat-file EMBL
> entry way.
>
> Lincoln
>
In BioJava we have a feature interface, ComponentFeature, with two
important methods: getComponentLocation() and getComponentSequence().
A ComponentFeature marks where in an assembled sequence you glue in bits
of other sequences. The ComponentFeature's own location is where it sits
in the assembly, and getComponentLocation() is the region of
getComponentSequence() to insert there. This has many advantages over
making sequences themselves into features. Firstly, and most importantly,
it allows us
to project a single sequence (or multiple portions of a single sequence)
into multiple assemblies (or different places in the same assembly). For
example, we can take the same clones and project them into multiple
versions of the human golden path. Translocations can trivially be
represented by building one assembled sequence each for the normal and
the translocated chromosome, with both using the same underlying
sequences, which saves memory and keeps the annotations consistent. It
also has many benefits from the object-modelling point of view, such as
letting us implement things via lazy proxies (portions of an assembly
don't need to be loaded until you try to fetch data from them).
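
To make the idea concrete, here is a minimal Java sketch. It is not the
BioJava API: Range, SimpleSequence, Assembly and this cut-down
ComponentFeature class are hypothetical stand-ins written for this
example, and only the method names getComponentLocation() and
getComponentSequence() come from the description above. It shows one
clone being projected into two different assemblies without the clone
knowing anything about either of them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AssemblySketch {

    /** Closed, 1-based interval (hypothetical helper, not a BioJava type). */
    static final class Range {
        final int start;
        final int end;

        Range(int start, int end) {
            if (start < 1 || end < start) {
                throw new IllegalArgumentException("bad range " + start + ".." + end);
            }
            this.start = start;
            this.end = end;
        }

        int length() {
            return end - start + 1;
        }
    }

    /** A named string of residues; it holds no assembly coordinates at all. */
    static final class SimpleSequence {
        final String name;
        final String residues;

        SimpleSequence(String name, String residues) {
            this.name = name;
            this.residues = residues;
        }

        String subStr(Range r) {
            return residues.substring(r.start - 1, r.end);
        }
    }

    /** Simplified stand-in: where to glue which region of which sequence. */
    static final class ComponentFeature {
        private final Range location;              // position in the assembly
        private final SimpleSequence component;    // the underlying sequence
        private final Range componentLocation;     // region of it to insert

        ComponentFeature(Range location, SimpleSequence component, Range componentLocation) {
            if (location.length() != componentLocation.length()) {
                throw new IllegalArgumentException("location lengths differ");
            }
            this.location = location;
            this.component = component;
            this.componentLocation = componentLocation;
        }

        Range getLocation() { return location; }
        SimpleSequence getComponentSequence() { return component; }
        Range getComponentLocation() { return componentLocation; }
    }

    /** An assembled sequence built purely from component features. */
    static final class Assembly {
        private final int length;
        private final List<ComponentFeature> components = new ArrayList<>();

        Assembly(int length) {
            this.length = length;
        }

        void addComponent(ComponentFeature f) {
            if (f.getLocation().end > length) {
                throw new IllegalArgumentException("component runs past the assembly end");
            }
            components.add(f);
        }

        /** Residues are pulled from the components on demand; gaps become 'n'. */
        String residues() {
            char[] out = new char[length];
            Arrays.fill(out, 'n');
            for (ComponentFeature f : components) {
                String piece = f.getComponentSequence().subStr(f.getComponentLocation());
                piece.getChars(0, piece.length(), out, f.getLocation().start - 1);
            }
            return new String(out);
        }
    }

    public static void main(String[] args) {
        SimpleSequence clone = new SimpleSequence("cloneA", "ACGTACGTAC");
        Range region = new Range(3, 8);   // the part of the clone that is used

        Assembly v1 = new Assembly(12);
        v1.addComponent(new ComponentFeature(new Range(1, 6), clone, region));

        Assembly v2 = new Assembly(12);   // same clone, different placement
        v2.addComponent(new ComponentFeature(new Range(5, 10), clone, region));

        System.out.println(v1.residues());   // GTACGTnnnnnn
        System.out.println(v2.residues());   // nnnnGTACGTnn
    }
}
```

Because the placement lives in the feature rather than in the clone, the
same SimpleSequence object is shared by both assemblies. In this sketch
residues() only touches the components when it is called, which is
roughly where a lazy proxy would hook in.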

I would caution against rolling the location of a sequence in a larger
assembly into the sequence itself. At the very least, you are going to
need to publish a start and stop coordinate within the sequence that
gets projected. In our experience, this whole line of modelling causes
problems rather than providing solutions.

Matthew