[Biojava-l] Stranded non-contiguous feature loses its strand
in a SubSequence
Matthew Pocock
matthew_pocock@yahoo.co.uk
Wed, 22 May 2002 14:10:43 +0100
Hi Stein Aerts,
When you make a subsequence, the code takes that as a request to tell
you only about the information in that region. Features that are wholly
contained within it are projected in with all their properties intact.
Features that overlap the region but are not contained in it are
replaced by place-holder features of type RemoteFeature. A RemoteFeature
publishes only very basic information (the fields in the top-level
Feature interface) and provides a method, getRemoteFeature(), that
returns the feature it is a slice of (e.g. the original CDS, SOURCE or
whatever). Strand is not one of those top-level fields, so the
place-holder for your joined CDS does not report one.
Because Java is statically typed, it is not possible to make a remote
feature implement the exact interface of the feature it stands in for.
It would be possible via code generation, and that is on the list of
things to think about for a future release of BioJava.
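
As a workaround, you can follow the place-holder back to the feature it
was sliced from and read the strand there. The sketch below is only an
illustration: Feature, StrandedFeature, RemoteFeature and Strand are
simplified stand-ins declared inside the example to mirror the
description above, not the real BioJava interfaces, and SimpleStranded,
Slice and strandOf() are hypothetical names made up for this example.

```java
public class RemoteStrandSketch {
    // Simplified stand-ins (assumptions), NOT the real BioJava types.
    enum Strand { POSITIVE, NEGATIVE, UNKNOWN }

    interface Feature { String getType(); }

    interface StrandedFeature extends Feature { Strand getStrand(); }

    // Place-holder for a feature that only partly overlaps the subsequence.
    interface RemoteFeature extends Feature { Feature getRemoteFeature(); }

    // Hypothetical stranded feature, e.g. the original joined CDS.
    static final class SimpleStranded implements StrandedFeature {
        private final String type;
        private final Strand strand;
        SimpleStranded(String type, Strand strand) {
            this.type = type;
            this.strand = strand;
        }
        public String getType() { return type; }
        public Strand getStrand() { return strand; }
    }

    // Hypothetical place-holder: exposes only the basic Feature fields.
    static final class Slice implements RemoteFeature {
        private final Feature original;
        Slice(Feature original) { this.original = original; }
        public String getType() { return original.getType(); }
        public Feature getRemoteFeature() { return original; }
    }

    // Returns the strand of f, looking through a place-holder if needed.
    static Strand strandOf(Feature f) {
        if (f instanceof StrandedFeature) {
            return ((StrandedFeature) f).getStrand();
        }
        if (f instanceof RemoteFeature) {
            Feature original = ((RemoteFeature) f).getRemoteFeature();
            if (original instanceof StrandedFeature) {
                return ((StrandedFeature) original).getStrand();
            }
        }
        return Strand.UNKNOWN;
    }

    public static void main(String[] args) {
        Feature cds = new SimpleStranded("CDS", Strand.NEGATIVE);
        Feature slice = new Slice(cds);
        // The place-holder itself is not a StrandedFeature.
        System.out.println(slice instanceof StrandedFeature); // prints "false"
        // The strand is recovered from the original feature.
        System.out.println(strandOf(slice)); // prints "NEGATIVE"
    }
}
```

The static-typing limitation shows up in the first check: the
place-holder's class is fixed at compile time and declares only the
basic interface, so it cannot be treated as a StrandedFeature even
though the feature it stands in for is one.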
Matthew
Stein Aerts wrote:
> Does anyone know why, if you make a subsequence, some of the features that
> are stranded in the original sequence are not stranded anymore in the
> subsequence?
> Would there be a workaround?
>
> Example: (the EMBL file and the code I used are attached)
>
> features on the original sequence:
>
> ENSG00000114251: prediction|-|31979, 33725 {([31979,32271]),
> ([33699,33725])}|
> ENSG00000114251: exon|-|[36988,37230]|
> ENSG00000114251: CDS|-|37181, 38568 {([37181,37238]), ([38459,38568])}|
> ENSG00000114251: exon|-|[38459,39310]|
> ENSG00000114251: exon|-|[11752,11941]|
> ENSG00000114251: prediction|+|[68957,69037]|
> ENSG00000114251: prediction|-|[7506,7964]|
> ENSG00000114251: prediction|-|[11751,12043]|
> ENSG00000114251: exon|-|[77834,77918]|
> ENSG00000114251: exon|-|[7510,7965]|
> ENSG00000114251: source|+|[1,82918]|
> ENSG00000114251: exon|-|[32080,32272]|
> ENSG00000114251: prediction|-|[13468,13565]|
> ENSG00000114251: prediction|-|36987, 59254 {([36987,37237]),
> ([38458,38591]), ([42674,42788]), ([43206,43321]), ([45592,45884]),
> ([46589,46685]), ([59243,59254])}|
> ENSG00000114251: prediction|+|73976, 74614 {([73976,74134]),
> ([74468,74614])}|
> ENSG00000114251: exon|-|[36988,37238]|
> ENSG00000114251: exon|-|[5001,7965]|
> ENSG00000114251: CDS|-|7510, 77918 {([7510,7965]), ([11752,11941]),
> ([32080,32272]), ([36988,37230]), ([77834,77918])}|
> ENSG00000114251: exon|-|[11752,12044]|
> ENSG00000114251: prediction|+|16957, 20027 {([16957,17042]),
> ([19913,20027])}|
>
>
> features after making a SubSequence of (77718,79918)
>
> ENSG00000114251: exon|-|[117,201]|
> ENSG00000114251: source|[1,2201]|
> ENSG00000114251: CDS|[117,201]|
>
> Now the last CDS is not stranded anymore. Could the reason be that this CDS
> has a joined location in the original sequence? Because the exon still has
> its strand.
>
> Thanks & bye,
> Stein Aerts.
>
>
>
> Ir Stein Aerts
> Bioinformatics Research
> KULeuven, ESAT-SCD
> Kasteelpark Arenberg 10
> 3001 Heverlee, Belgium
> Tel +32-16-32.17.91
> Fax +32-16-32.19.70
> http://www.esat.kuleuven.ac.be/~dna/BioI/