[EMBOSS] FW: Forthcoming change in the EMBL flatfile format
Rodrigo Lopez
rls at ebi.ac.uk
Wed Apr 26 15:46:51 UTC 2006
> -----Original Message-----
> From: owner-seq-dbg at ebi.ac.uk
> [mailto:owner-seq-dbg at ebi.ac.uk] On Behalf Of Carola Kanz
> Sent: 26 April 2006 16:29
> To: seq-dbg at ebi.ac.uk
> Subject: Forthcoming change in the EMBL flatfile format
>
>
> Dear all,
>
> if you are working with the EMBL flatfile format and you are
> not yet aware of the format change we are going to introduce
> with the next release, please have a look at the following
> announcement.
> Carola
>
>
> -------------------------------------------------------------------------
>
> Dear colleagues,
>
> We would like to announce the following important change in
> the EMBL database in June this year.
>
> At the time of release 87 (available from JUN-2006) the
> format of the EMBL flat file will undergo a change: the ID
> line will have a different structure (see below) and the SV
> line will be removed.
>
> The changes affecting the ID line structure are:
>
> * All tokens will be separated by a semicolon.
> * The entry name will not be displayed, in its place there will be
>   the primary accession number.
> * The sequence version will be indicated.
> * The topology will be a separate token and will be indicated for
>   both circular and linear molecules.
> * Both the data class and the taxonomic divisions will be displayed.
>
> This is an example of the new ID line:
>
> ID   CD789012; SV 4; linear; genomic DNA; HTG; MAM; 500 BP.
>      (1)       (2)   (3)     (4)          (5)  (6)  (7)
>
>
> The tokens represent:
>
> 1. Primary accession number.
> 2. 'SV' + sequence version number.
> 3. Topology: 'circular' or 'linear'.
> 4. Molecule type.
> 5. Data class (ANN, CON, PAT, EST, GSS, HTC, HTG, MGA, WGS, TPA,
>    STS, STD; "normal" entries will have STD for standard).
> 6. Taxonomic division (HUM, MUS, ROD, PRO, MAM, VRT, FUN, PLN, ENV,
>    INV, SYN, UNC, VRL, PHG).
> 7. Sequence length + 'BP.'.
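
For anyone updating a parser, the seven tokens above could be read roughly as follows. This is only a minimal sketch based on the example line in the announcement, not code from EMBL or EMBOSS: the names EmblId and parse_new_id_line are hypothetical, and the checks assume the tokens look exactly as listed (a literal 'SV' prefix, 'circular' or 'linear', and a trailing 'BP.').

```python
from dataclasses import dataclass


@dataclass
class EmblId:
    """Hypothetical container for the seven tokens of a new-style ID line."""
    accession: str
    version: int
    topology: str
    molecule_type: str
    data_class: str
    division: str
    length: int


def parse_new_id_line(line: str) -> EmblId:
    """Split a new-style ID line on semicolons and validate each token."""
    if not line.startswith("ID"):
        raise ValueError("not an ID line")
    # Drop the 'ID' line code, then split the rest into its tokens.
    fields = [f.strip() for f in line[2:].strip().split(";")]
    if len(fields) != 7:
        raise ValueError(f"expected 7 tokens, found {len(fields)}")
    accession, sv, topology, molecule, data_class, division, length = fields

    sv_parts = sv.split()
    if len(sv_parts) != 2 or sv_parts[0] != "SV":
        raise ValueError(f"bad sequence version token: {sv!r}")
    if topology not in ("circular", "linear"):
        raise ValueError(f"bad topology token: {topology!r}")
    length_parts = length.split()
    if len(length_parts) != 2 or length_parts[1] != "BP.":
        raise ValueError(f"bad length token: {length!r}")

    return EmblId(
        accession=accession,
        version=int(sv_parts[1]),
        topology=topology,
        molecule_type=molecule,
        data_class=data_class,
        division=division,
        length=int(length_parts[0]),
    )


entry = parse_new_id_line(
    "ID   CD789012; SV 4; linear; genomic DNA; HTG; MAM; 500 BP."
)
print(entry.accession, entry.version, entry.length)
```

Splitting on the semicolon rather than on whitespace keeps a multi-word molecule type such as 'genomic DNA' in one piece. Old-style ID lines would fail the token count here, so a real parser would probably need to fall back to the previous layout.
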
>
> The entry name will not be displayed any more in the ID line.
> Since EMBL release 3 (Dec 1983) the stable identifier of an
> entry has been the primary accession number.
>
> A mapping file (entryname to accession number) will be
> provided with the next release for those entries where the
> entryname doesn't coincide with the accession number.
>
> To give users a test dataset, one file with new-style ID
> lines called new_id_line.test.gz was provided together with
> the March release of the EMBL database:
> ftp://ftp.ebi.ac.uk/pub/databases/embl/release/new_id_line.test.gz
>
> Feedback from users is sought; please use the "Contact us"
> link at the bottom of the EBI home page and specify "EMBL" in
> the feedback form.
>
> Note: this information was first made available on our
> "Forthcoming changes" page
> ( http://www.ebi.ac.uk/embl/Documentation/forthcomingchanges.html#0606 )
> and in the EMBL database release notes.
>