[Bioperl-l] Insanity of Swissprot parsing, take 2.

Bryan Taylor bryan_w_taylor@yahoo.com
Thu, 22 Mar 2001 20:39:53 -0800 (PST)


--- Ewan Birney <birney@ebi.ac.uk> wrote:

> Latest issue. DR lines in swissprot have an optional count number for
> domains. (of course, DR lines without this information do not have
> anything here).
> Eg:
> 
> DR   Pfam; PF00076; rrm; 2.

To be precise, it is the PROSITE and PFAM cross-references that use this
format; see section 3.11.6 of the SwissProt user manual. For these, the format is

DR   PROSITE | PFAM; ACCESSION_NUMBER; ENTRY_NAME; STATUS.

'STATUS' is one of the following: {n, FALSE_NEG, PARTIAL, UNKNOWN_n}
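
As a rough illustration of how a parser might treat that fourth field, here is a minimal Python sketch. It is not Bioperl code: the function name `parse_domain_status` and the `(kind, count)` return layout are assumptions made up for this example, and it only covers the four STATUS forms listed above.

```python
import re

def parse_domain_status(status):
    """Classify the STATUS field of a PROSITE/PFAM DR line.

    Hypothetical helper: returns a (kind, count) tuple, where count is
    None for the forms that carry no number.
    """
    # The field is the last one on the line, so it may still carry the
    # terminating period.
    status = status.strip().rstrip(".")
    if status in ("FALSE_NEG", "PARTIAL"):
        return (status, None)
    m = re.fullmatch(r"UNKNOWN_([0-9]+)", status)
    if m:
        return ("UNKNOWN", int(m.group(1)))
    if re.fullmatch(r"[0-9]+", status):
        return ("COUNT", int(status))
    raise ValueError("unrecognised DR status: %r" % status)
```

For the line Ewan quotes, the field `2.` would come back as `("COUNT", 2)`.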

Other DR databases use the fourth field as well. For EMBL/GenBank/DDBJ the
documented format is:

DR   EMBL; ACCESSION_NUMBER; PROTEIN_ID; STATUS_IDENTIFIER.

Where STATUS_IDENTIFIER is in { "-", JOINED, ALT_INIT, NOT_ANNOTATED_CDS,
ALT_SEQ, ALT_FRAME, ALT_TERM }
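
A sketch of the same idea for whole DR lines, again in Python rather than Bioperl. The names `split_dr_line`, `embl_status` and `EMBL_STATUS` are hypothetical, and the code assumes the fields are separated by semicolons and the line ends with a single period, as in the formats above.

```python
# Status identifiers documented for EMBL/GenBank/DDBJ cross-references.
EMBL_STATUS = {
    "-", "JOINED", "ALT_INIT", "NOT_ANNOTATED_CDS",
    "ALT_SEQ", "ALT_FRAME", "ALT_TERM",
}

def split_dr_line(line):
    """Split a DR line into its semicolon-separated fields."""
    if not line.startswith("DR"):
        raise ValueError("not a DR line: %r" % line)
    body = line[2:].strip()
    # Drop only the terminating period; periods inside a field are kept.
    if body.endswith("."):
        body = body[:-1]
    return [field.strip() for field in body.split(";")]

def embl_status(line):
    """Return the STATUS_IDENTIFIER of an EMBL DR line, validating it."""
    fields = split_dr_line(line)
    if len(fields) != 4 or fields[0].upper() != "EMBL":
        raise ValueError("not a four-field EMBL DR line: %r" % line)
    status = fields[3]
    if status not in EMBL_STATUS:
        raise ValueError("unknown EMBL status: %r" % status)
    return status
```

Splitting first and interpreting the fourth field per database afterwards keeps the parser from having to guess whether a trailing field is a count or a status keyword.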

