[Bioperl-l] RNA folding
Chris Fields
cjfields at uiuc.edu
Wed Feb 7 22:15:44 UTC 2007
On Feb 7, 2007, at 12:56 PM, Caroline Johnston wrote:
> Thanks Chris.
>
> Storing the interaction data as a hash according to an ontology and using
> an extended bracket notation as the string representation seems to make
> sense, but I'm still unsure how this is supposed to be attached to the
> Seq objects. You reckon it should be an AnnotationI?
As long as it describes the object as a whole and there is a
reasonable way of representing the data as text, I think you can
attach anything as annotation. A recent example is the addition of
trees as annotation. Annotation can also be used to describe
alignments (such as the structure consensus string in Rfam
alignments), or be added to SeqFeatures. The class being annotated
just needs to implement AnnotatableI.
> I'm not sure I understand the distinction between annotations and
> features. From the docs I got the impression that Features were like
> annotation on bits of sequences and had a reference to the sequence to
> which they belong, whereas annotations don't. If that's the case though,
> why would RNA structure be an annotation rather than a feature? If not,
> what is the distinction between them? Are the positional Annotation
> subclasses you're developing intended to replace features? Have I got the
> wrong end of the stick entirely?
>
> Cheers,
> Cass
The key distinction between seqfeatures and annotations is that
annotations are normally associated with the entire sequence record,
while seqfeatures normally describe a part of the sequence (and thus
have a location on the sequence). There are a few exceptions, but in
general that's the case. The HOWTO gives a bit more background:
http://www.bioperl.org/wiki/HOWTO:Feature-Annotation
Whether to use annotations or seqfeatures in a case like this may
depend entirely on one's point of view. For instance, one
implementation I had considered was adding an interface to Bio::Seq
which would allow Seq objects to also have Bio::Structure objects,
since my view is that any sequence could (optionally) have a
structure associated with it. However, I reasoned that a sequence
could actually have multiple structures (RNA, ssDNA, and protein can
have several alternative folds or different folding pathways, for
instance). Instead of splitting up each structure into individual
seqfeatures (where each would have to be tagged with the
relevant structure and score info), I could have one class encompass
all of that data in a reasonable way. Hence I used Annotation.
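To make the "one class encompasses all of that data" idea concrete, here is a minimal, library-neutral sketch in Python. It is not BioPerl code and does not reflect the actual design: `Fold`, `StructureAnnotation`, `ToyRecord`, the `secondary_structure` key, and the scores are all hypothetical names and made-up values chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Fold:
    """One candidate structure in dot-bracket notation, with an optional score."""
    brackets: str
    score: Optional[float] = None  # meaning of the score is left to the caller


@dataclass
class StructureAnnotation:
    """Hypothetical record-level annotation bundling alternative folds."""
    folds: List[Fold] = field(default_factory=list)

    def add_fold(self, brackets: str, score: Optional[float] = None) -> None:
        self.folds.append(Fold(brackets, score))

    def as_text(self) -> str:
        # One line per fold, so the whole annotation has a plain-text form.
        lines = []
        for fold in self.folds:
            suffix = "" if fold.score is None else f" ({fold.score})"
            lines.append(fold.brackets + suffix)
        return "\n".join(lines)


@dataclass
class ToyRecord:
    """Toy stand-in for a sequence object that carries annotations."""
    seq: str
    annotations: Dict[str, StructureAnnotation] = field(default_factory=dict)


record = ToyRecord("GGGAAAUCCC")
structure = StructureAnnotation()
structure.add_fold("(((....)))", score=-1.5)  # illustrative score, not computed
structure.add_fold("((......))", score=-0.5)  # illustrative score, not computed
record.annotations["secondary_structure"] = structure
print(structure.as_text())
```

The point of the sketch is only that several alternative folds, each with its own score, hang off the record as a single annotation rather than as many separately tagged features.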
BTW, this isn't meant to replace features in any way. It would be
primarily used to describe (1) a sequence as a whole, such as a tRNA
sequence, (2) a seqfeature, such as a tRNA, rRNA, riboswitch, etc., in
a genome sequence, or (3) a conserved structure in an alignment, such
as Rfam Stockholm output.
I'll add that the option of splitting the data into seqfeatures isn't
ruled out. It would be a matter of using a helper method, maybe in
SeqUtils or directly in Annotation::Meta or whatever I end up calling
it. I plan on adding something along those lines at some point.
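As a rough illustration of what such a helper might do, the following Python sketch splits a plain dot-bracket string into one located feature per helix. It is an assumption-laden example, not the planned BioPerl method: the function names and the `start5`/`end5`/`start3`/`end3` keys are hypothetical, and it handles only `(`, `)`, and `.`, not the extended bracket notation discussed earlier in the thread.

```python
from typing import Dict, List, Tuple


def base_pairs(brackets: str) -> List[Tuple[int, int]]:
    """Return 1-based (open, close) positions for a plain dot-bracket string."""
    stack: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for pos, char in enumerate(brackets, start=1):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif char != ".":
            raise ValueError(f"unsupported character {char!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)  # ordered by opening position


def split_into_helices(brackets: str) -> List[Dict[str, int]]:
    """Group directly stacked pairs into helices, one located feature each."""
    helices: List[List[Tuple[int, int]]] = []
    for i, j in base_pairs(brackets):
        if helices and helices[-1][-1] == (i - 1, j + 1):
            helices[-1].append((i, j))  # pair stacks on the previous one
        else:
            helices.append([(i, j)])  # pair starts a new helix
    return [
        {
            "start5": h[0][0], "end5": h[-1][0],  # 5' strand of the helix
            "start3": h[-1][1], "end3": h[0][1],  # 3' strand of the helix
        }
        for h in helices
    ]


for feature in split_into_helices("((..))..((...))"):
    print(feature)
```

Each resulting dictionary carries the coordinates a located feature would need, which is the extra per-feature bookkeeping that keeping the structure as a single annotation avoids.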
chris