Problem when building database index
Peter Rice
pmr at ebi.ac.uk
Fri Feb 21 09:24:17 UTC 2003
Frankie Cheung wrote:
> Can anyone help me? I don't know how to build database index in EMBOSS
> for the following databanks as I can't find any related information
> from the administration manual:
Some of these are not sequence databases - but I do plan to add more
database types in the near future (I have already started to extend the
emboss.default syntax for them).
> - PDB
The domainatrix EMBASSY package uses cleaned PDB files. Do you want the
sequences or the structures?
> - PFAM
> - PRODOM
Alignment databases are high on my list of things to do. One (small)
problem is how to name the individual sequences in an alignment, for
example in a PFAM entry.
> - OGLYC
> - BLOCKS
> - TAXONOMY
> - ENZYME
How would you use these in EMBOSS? Or do you just want to use them with
entret? (entret is only for sequences, but we can make a general version.)
> - dbEST
> - dbSTS
> - dbGSS
These are available as FASTA format files (so you can use dbifasta) - or
you can index the huge flatfile versions with SRS and use the SRSFASTA
access method (which asks SRS to write the sequence in FASTA format, and
then reads it into EMBOSS).
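
To illustrate what indexing a FASTA file means in general, here is a minimal Python sketch that maps each entry ID to the byte offset of its header line and then fetches one entry by seeking to it. It is only a conceptual illustration: it does not reproduce the index files that dbifasta writes, and the function names `index_fasta` and `fetch` and the sample IDs are made up for this example.

```python
import io


def index_fasta(handle):
    """Map each entry ID (first word after '>') to its header's byte offset."""
    index = {}
    offset = handle.tell()
    line = handle.readline()
    while line:
        if line.startswith(b">"):
            fields = line[1:].split()
            if fields:
                index[fields[0].decode("ascii")] = offset
        offset = handle.tell()
        line = handle.readline()
    return index


def fetch(handle, index, entry_id):
    """Seek to an indexed entry and return its full FASTA record as text."""
    handle.seek(index[entry_id])
    lines = [handle.readline()]
    line = handle.readline()
    while line and not line.startswith(b">"):
        lines.append(line)
        line = handle.readline()
    return b"".join(lines).decode("ascii")


# Hypothetical two-entry file held in memory (binary mode keeps offsets exact).
demo = io.BytesIO(b">EST1 first\nACGT\nGGCC\n>EST2 second\nTTAA\n")
idx = index_fasta(demo)
record = fetch(demo, idx, "EST2")
print(idx)
print(record, end="")
```

The point of the offset table is that a single entry can be read without scanning the whole file, which is what matters for files the size of dbEST.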
> - dbSNP (XML format now: would EMBOSS consider to allow build XML db
> index in next version ?)
Yes, we will consider XML format databases - but how would you use dbSNP
entries in EMBOSS?
> - UNIGENE
The UNIGENE clusters are available as "almost" FASTA files (they have
headers for each cluster). You can index in SRS and use the SRSFASTA
access method. I am looking at skipping the headers and allowing
dbifasta to index these files directly - but there is a choice of
clusters (see PFAM above) or single sequences as "entries". In SRS the
UNIGENE data can be indexed in both ways.
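
The choice between clusters and single sequences as "entries" can be sketched in Python as below. This is a rough illustration under stated assumptions, not how dbifasta or SRS handles the files: it assumes each cluster header is a line starting with `#`, which may not match the real UNIGENE layout, and the names `parse_clustered_fasta` and `group_by_cluster` and the sample data are hypothetical.

```python
def parse_clustered_fasta(lines, header_prefix="#"):
    """Yield (cluster, seq_id, sequence) for every sequence in the file.

    Assumption: a line starting with header_prefix opens a new cluster,
    and the FASTA records that follow belong to that cluster.
    """
    cluster = None
    seq_id = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        is_header = line.startswith(header_prefix)
        is_fasta_id = line.startswith(">")
        if is_header or is_fasta_id:
            # Any new header or ID line closes the sequence being read.
            if seq_id is not None:
                yield cluster, seq_id, "".join(chunks)
                seq_id, chunks = None, []
            if is_fasta_id:
                fields = line[1:].split()
                seq_id = fields[0] if fields else ""
            else:
                cluster = line[len(header_prefix):].strip()
        elif seq_id is not None:
            chunks.append(line)
    if seq_id is not None:
        yield cluster, seq_id, "".join(chunks)


def group_by_cluster(records):
    """Collect sequence IDs per cluster: the 'cluster as entry' view."""
    clusters = {}
    for cluster, seq_id, _sequence in records:
        clusters.setdefault(cluster, []).append(seq_id)
    return clusters


# Hypothetical sample in the assumed layout.
sample = """#Cluster1
>seqA some description
ACGT
>seqB
GG
CC
#Cluster2
>seqC
TTTT
""".splitlines()

singles = list(parse_clustered_fasta(sample))  # single sequences as entries
clusters = group_by_cluster(singles)           # clusters as entries
print(singles)
print(clusters)
```

Keeping the cluster name alongside each sequence means both views can be built from one pass over the file.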
> - LocusLink
How would you use this in EMBOSS?
> - OMIM
> - InterPro
Ah, XML again! In progress.
Does anyone else have requests for databases under EMBOSS to help set
priorities for this work?
Hope this helps,
Peter