[Bioperl-l] How to get sequence of joins
Keith James
kdj@sanger.ac.uk
20 Oct 2000 11:44:38 +0100
>>>>> "Henrik" == Henrik Seidel <Henrik.Seidel@schering.de> writes:
Henrik> Hi all, I could not figure out a simple way to get the
Henrik> sequence of a CDS which looks like
Henrik> CDS complement(join(1..10,20..30)
Henrik> The sequence member of the top level CDS_span object would
Henrik> be 1..30 and not the join. Do I have to cycle through all
Henrik> the subfeatures of the top level CDS_span and concatenate
Henrik> the sequences of all subfeatures which are a CDS_span as
Henrik> well, or is there an easier way?
I think that you have to do as you suggest here i.e. take the
PrimarySeq of each sub-feature, get the sequence string from that and
then concatenate those.
Henrik> Even if I concatenate them, how do I know that I have to
Henrik> take the complement? In the top level CDS_span I did not
Henrik> find any field containing or any member function returning
Henrik> an indication that this is supposed to be the
Henrik> complement. Is there a function returning the correct
Henrik> sequence (i.e., after joining and taking the complement)
Henrik> of a top level CDS_span?
There is none at the moment. You'll need to check the strand, both for
consistency - all one strand, or the other - unless you are expecting
some of those unusual features with occur on neither/both, and then
reverse the subfeature order for reverse strand joins before
concatenating.
When/if I get round to merging my modules I intend to move these
functions across (unless someone else gets there first).
Henrik> Or is there a function returning the complete original
Henrik> entry as it was in the source file (i.e.,
Henrik> "complement(join(1..10,20..30)")?
This I don't know.
cheers,
--
-= Keith James - kdj@sanger.ac.uk - http://www.sanger.ac.uk/Users/kdj =-
The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA