[Open-bio-l] [EMBOSS] Common Sample Data Collection, was: SCF files (Staden)

Hamish McWilliam hamish.mcwilliam at bioinfo-user.org.uk
Thu Dec 15 18:01:40 UTC 2011


Hi Chris,

> That might be the best source to pull from.  Does it archive old file examples (such as older SwissProt/GenBank/EMBL)?

EDAM itself does not store entry data, and at the moment it does not
describe the changes to formats over time, although I'm sure this
could be added along with links to sample entries in the various data
archives.

If you only need a few sample entries, see the appropriate database archive:

- EMBL-Bank Sequence Version Archive (EMBL-SVA):
http://www.ebi.ac.uk/cgi-bin/sva/sva.pl
E.g. http://www.ebi.ac.uk/cgi-bin/sva/sva.pl/?query=V00077&search=Go
- UniProtKB Sequence/Annotation Version Archive (UniSave):
http://www.ebi.ac.uk/uniprot/unisave/
E.g. http://www.ebi.ac.uk/uniprot/unisave/?query=P00002&search=Go
- NCBI Entrez Revision History.
E.g. http://www.ncbi.nlm.nih.gov/nuccore/V00077?report=girevhist
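
If you want to look up more than a handful of accessions, the example
URLs above follow a simple pattern that can be generated. A minimal
Python sketch, assuming the query parameters shown in the examples are
all that is needed (the function and constant names are made up for
illustration, and it only builds the URLs, it does not fetch anything):

```python
from urllib.parse import quote, urlencode

# Base URLs copied from the examples in this message. They are assumed
# to be stable here; the services may move or change over time.
SVA_BASE = "http://www.ebi.ac.uk/cgi-bin/sva/sva.pl/"
UNISAVE_BASE = "http://www.ebi.ac.uk/uniprot/unisave/"
NCBI_NUCCORE_BASE = "http://www.ncbi.nlm.nih.gov/nuccore/"


def sva_url(accession):
    """EMBL-SVA version history URL for a nucleotide accession."""
    return SVA_BASE + "?" + urlencode({"query": accession, "search": "Go"})


def unisave_url(accession):
    """UniSave version history URL for a UniProtKB accession."""
    return UNISAVE_BASE + "?" + urlencode({"query": accession, "search": "Go"})


def ncbi_revhist_url(accession):
    """NCBI Entrez revision history URL for a nucleotide accession."""
    return (NCBI_NUCCORE_BASE + quote(accession)
            + "?" + urlencode({"report": "girevhist"}))


if __name__ == "__main__":
    # Accessions taken from the examples above.
    print(sva_url("V00077"))
    print(unisave_url("P00002"))
    print(ncbi_revhist_url("V00077"))
```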

If you need more entries...

For Swiss-PROT and UniProtKB, old versions of the data are available on
the FTP sites, for example from EMBL-EBI:
- ftp://ftp.ebi.ac.uk/pub/databases/uniprot/previous_releases/
- ftp://ftp.ebi.ac.uk/pub/databases/swissprot/sw_old_releases/

For GenBank, Don Gilbert collected various old releases a while back
(http://www.bio.net/bionet/mm/genbankb/2006-October/000251.html).
These are available via the BioMirrors (http://www.bio-mirror.net/).
NCBI may also be able to provide old releases on request.

For EMBL-Bank, old releases can be made available on request. Contact
ENA (http://www.ebi.ac.uk/ena/about/contact) for more information.

All the best,

Hamish

>
> chris
>
> On Nov 30, 2011, at 8:49 AM, Peter Cock wrote:
>
>> I just checked with Jon and he was happy to forward this back to
>> the list, and also added a couple of URLs that I'd asked about:
>>
>> http://bioportal.bioontology.org/ontologies/44600
>> http://www.ebi.ac.uk/ontology-lookup/browse.do?ontName=EDAM
>>
>> Peter
>>
>> On Wed, Nov 30, 2011 at 11:14 AM, Jon Ison <jison at ebi.ac.uk> wrote:
>>> Hi Peter (and Peter)
>>>
>>> Just a quick note to say that all (well, nearly all) common bioinformatics data formats are
>>> catalogued in the EDAM ontology:
>>>
>>> http://sourceforge.net/projects/edamontology/files
>>> http://edamontology.sourceforge.net/
>>>
>>> OK - there's bound to be some we've missed :)
>>>
>>> Anyhow, I thought it might help to structure any effort to document data formats (an effort which
>>> I wholeheartedly approve of by the way).  One thing I'd like to add to the EDAM "format"
>>> definitions is a link to the format specification, or failing that, an example.
>>>
>>> Cheers both
>>>
>>> Jon



