[Biojava-l] GFF & feature creation
Peter Rice
pmr@sanger.ac.uk
Tue, 22 Feb 2000 09:04:12 GMT
Matthew,
We are going through feature handling for EMBOSS too. Internally,
we are keeping something similar to GFF, but it has raised some issues.
Proteins we will treat the same as DNA, but ignore the strand and
frame fields in GFF.
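
To illustrate treating proteins like DNA while ignoring strand and frame, here is a minimal Python sketch that writes a GFF-style line and puts "." in any field that is not supplied. This is not EMBOSS or BioJava code. The function name gff_line and the sequence, source and feature names in the usage lines are made up for illustration.

```python
def gff_line(seqname, source, feature, start, end,
             score=None, strand=None, frame=None, attributes=""):
    """Build one tab-separated GFF-style record (hypothetical helper).

    Fields that are not supplied are written as ".", which is how GFF
    marks a value as undefined. For a protein feature the caller simply
    leaves strand and frame unset.
    """
    fields = [
        seqname,
        source,
        feature,
        str(start),
        str(end),
        "." if score is None else str(score),
        "." if strand is None else strand,
        # Compare against None so that a legitimate frame of 0 is kept.
        "." if frame is None else str(frame),
    ]
    if attributes:
        fields.append(attributes)
    return "\t".join(fields)


# Illustrative values only: a DNA feature with strand and frame,
# and a protein feature with both left undefined.
dna = gff_line("SEQ1", "example", "CDS", 100, 250, strand="+", frame=0)
protein = gff_line("PROT1", "example", "domain", 10, 120)
```

The same record layout then serves both sequence types, and a reader only has to skip the two "." columns for proteins.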
Joins across sequences are a problem. For example, in the following EMBL
entry all except one of the exons (and flanking sequence) are in
separate entries.
ID AB001103 standard; DNA; HUM; 1329 BP.
XX
AC AB001103;
XX
SV AB001103.1
XX
DT 21-AUG-1998 (Rel. 56, Created)
DT 20-JAN-1999 (Rel. 58, Last updated, Version 3)
XX
DE Homo sapiens gene for H-cadherin, exon 14 and complete cds.
XX
KW H-cadherin.
XX
OS Homo sapiens (human)
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia; Eutheria;
OC Primates; Catarrhini; Hominidae; Homo.
XX
RN [1]
RP 1-1329
RA Horii A.;
RT ;
RL Submitted (18-FEB-1997) to the EMBL/GenBank/DDBJ databases.
RL Akira Horii, Tohoku University School of Medicine, Department of Molecular
RL Pathology; 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8575, Japan
RL (E-mail:horii@mail.cc.tohoku.ac.jp, Tel:81-22-717-8042, Fax:81-22-717-8047)
XX
RN [2]
RA Sato M., Mori Y., Sakurada A., Fujimura S., Horii A.;
RT "The H-cadherin (CDH13) gene is inactivated in human lung cancer";
RL Hum. Genet. 103:96-101(1998).
XX
DR SWISS-PROT; P55290; CADD_HUMAN.
XX
CC Sequence updated (14-Aug-1998)
XX
FH   Key             Location/Qualifiers
FH
FT   CDS             join(AB001090.1:1669..1713,AB001091.1:85..196,
FT                   AB001092.1:40..248,AB001093.1:96..212,AB001094.1:71..223,
FT                   AB001095.1:87..231,AB001096.1:33..211,AB001097.1:35..175,
FT                   AB001098.1:213..395,AB001099.1:56..309,AB001100.1:54..196,
FT                   AB001101.1:171..404,AB001102.1:160..378,210..217)
FT                   /codon_start=1
FT                   /db_xref="SWISS-PROT:P55290"
FT                   /product="H-cadherin"
FT                   /protein_id="BAA32411.1"
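
To make the problem concrete, here is a rough Python sketch that splits the CDS location above into its parts and records which entry each part lives in. It is only an illustration, not how EMBOSS or BioJava handle locations. The names Span and parse_join are hypothetical, and the parser assumes plain start..end ranges only (no complement(), single positions, or < and > markers).

```python
import re
from typing import List, NamedTuple, Optional


class Span(NamedTuple):
    entry: Optional[str]  # accession.version of another entry, None if local
    start: int
    end: int


# One join element: an optional "ACCESSION.VERSION:" prefix, then start..end.
_ELEMENT = re.compile(
    r"^(?:(?P<entry>[A-Za-z0-9_]+(?:\.\d+)?):)?(?P<start>\d+)\.\.(?P<end>\d+)$"
)


def parse_join(location: str) -> List[Span]:
    """Parse a simple EMBL join(...) location into a list of spans."""
    # Locations wrap over several FT lines, so drop all whitespace first.
    text = "".join(location.split())
    if text.startswith("join(") and text.endswith(")"):
        text = text[len("join("):-1]
    spans = []
    for part in text.split(","):
        match = _ELEMENT.match(part)
        if match is None:
            raise ValueError(f"unsupported location element: {part!r}")
        spans.append(
            Span(match.group("entry"),
                 int(match.group("start")),
                 int(match.group("end")))
        )
    return spans


cds = """join(AB001090.1:1669..1713,AB001091.1:85..196,
AB001092.1:40..248,AB001093.1:96..212,AB001094.1:71..223,
AB001095.1:87..231,AB001096.1:33..211,AB001097.1:35..175,
AB001098.1:213..395,AB001099.1:56..309,AB001100.1:54..196,
AB001101.1:171..404,AB001102.1:160..378,210..217)"""

spans = parse_join(cds)
remote = [s for s in spans if s.entry is not None]
local = [s for s in spans if s.entry is None]
```

Here only the final element (210..217) refers to AB001103 itself. Every other element needs a lookup in a different entry before the CDS can be assembled, which a per-sequence format like GFF has no obvious way to express.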
--
----------------------------------------------------------------------
Peter Rice | Informatics Division, The Sanger Centre,
E-mail: pmr@sanger.ac.uk | Wellcome Trust Genome Campus,
Tel: (44) 1223 494967 | Hinxton, Cambridge, CB10 1SA, England
Fax: (44) 1223 494919 | URL: http://www.sanger.ac.uk/Users/pmr/