[BioSQL-l] GO dbxrefs in swissprot
Andreas Henschel
henschel at mpi-cbg.de
Fri Jul 2 06:16:06 EDT 2004
Hi Hilmar,
Thanks for your reply. Could the problem be caused by my patched
bioperl 1.2.1?
Hilmar Lapp wrote:
> When you say the GO dbxrefs did not appear, how do you mean? Are you
> referring to dbxrefs present in the source file but absent as
> association rows in bioentry_dbxref?
>
Yes!
> If you have a swissprot entry that has GO dbxrefs in the source file
> but fails to have those associated in bioentry_dbxref, check whether
> the Bio::Seq object that's coming from the parser has them as
> annotation. It would sound strange if some entries get the
> associations whereas others don't.
>
Ok, here is what I did: I modified load_seqdatabase.pl to print out the
annotations and ran it on two small flatfiles, both of which contain GO
annotations (according to the flatfile and the swissprot website).
For the first file, the parser reported no GO annotation, whereas for the
second it did:
$prompt> perl load_seqdatabase.pl --host dbserver --dbuser ah \
    --dbname bioseqdb --namespace swissprot --format swiss \
    --lookup --remove --testonly P53396.dat
Annotation dblink stringified value Direct database link to X64330 in
database EMBL
Annotation dblink stringified value Direct database link to U18197 in
database EMBL
Annotation dblink stringified value Direct database link to BC006195 in
database EMBL
Annotation dblink stringified value Direct database link to S21173 in
database PIR
Annotation dblink stringified value Direct database link to P07459 in
database HSSP
Annotation dblink stringified value Direct database link to HGNC:115 in
database Genew
Annotation dblink stringified value Direct database link to P53396 in
database GK
Annotation dblink stringified value Direct database link to 108728 in
database MIM
Annotation dblink stringified value Direct database link to IPR002020 in
database InterPro
Annotation dblink stringified value Direct database link to IPR003781 in
database InterPro
Annotation dblink stringified value Direct database link to IPR005811 in
database InterPro
Annotation dblink stringified value Direct database link to IPR005810 in
database InterPro
Annotation dblink stringified value Direct database link to IPR005809 in
database InterPro
Annotation dblink stringified value Direct database link to PF02629 in
database Pfam
Annotation dblink stringified value Direct database link to PF00549 in
database Pfam
Annotation dblink stringified value Direct database link to PS01216 in
database PROSITE
Annotation dblink stringified value Direct database link to PS00399 in
database PROSITE
Annotation dblink stringified value Direct database link to PS01217 in
database PROSITE
$prompt> perl load_seqdatabase.pl --host dbserver --dbuser ah \
    --dbname bioseqdb --namespace swissprot --format swiss \
    --lookup --remove --testonly Q15777.dat
Loading Q15777.dat ...
Annotation dblink stringified value Direct database link to U57911 in
database EMBL
Annotation dblink stringified value Direct database link to BC031582 in
database EMBL
Annotation dblink stringified value Direct database link to HGNC:1180 in
database Genew
Annotation dblink stringified value Direct database link to 600911 in
database MIM
Annotation dblink stringified value Direct database link to GO:0007399
in database GO
Annotation dblink stringified value Direct database link to IPR004843 in
database InterPro
Annotation dblink stringified value Direct database link to PF00149 in
database Pfam
The corresponding DR entries in the two flat files are:
P53396.dat:
DR EMBL; X64330; CAA45614.1; -.
DR EMBL; U18197; AAB60340.1; -.
DR EMBL; BC006195; AAH06195.1; -.
DR PIR; S21173; S21173.
DR HSSP; P07459; 1JKJ.
DR Genew; HGNC:115; ACLY.
DR GK; P53396; -.
DR MIM; 108728; -.
DR GO; GO:0009346; C:citrate lyase complex; TAS.
DR GO; GO:0003878; F:ATP citrate synthase activity; TAS.
DR GO; GO:0006200; P:ATP catabolism; TAS.
DR GO; GO:0006101; P:citrate metabolism; TAS.
DR GO; GO:0015936; P:coenzyme A metabolism; TAS.
DR InterPro; IPR002020; Citrate_synth.
DR InterPro; IPR003781; CoA_binding.
DR InterPro; IPR005811; CoA_ligase.
DR InterPro; IPR005810; CoA_lig_alpha.
DR InterPro; IPR005809; CoA_lig_beta.
DR Pfam; PF02629; CoA_binding; 1.
DR Pfam; PF00549; Ligase_CoA; 1.
DR PROSITE; PS01216; SUCCINYL_COA_LIG_1; 1.
DR PROSITE; PS00399; SUCCINYL_COA_LIG_2; 1.
DR PROSITE; PS01217; SUCCINYL_COA_LIG_3; 1.
Q15777.dat:
DR EMBL; U57911; AAC50564.1; -.
DR EMBL; BC031582; AAH31582.1; -.
DR Genew; HGNC:1180; C11orf8.
DR MIM; 600911; -.
DR GO; GO:0007399; P:neurogenesis; TAS.
DR InterPro; IPR004843; M-ppestrase.
DR Pfam; PF00149; Metallophos; 1.
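
The comparison above was done by eye. A small script could do the same
cross-check automatically. The following is only a minimal sketch in Python,
not part of load_seqdatabase.pl or bioperl. It assumes the printed lines have
exactly the "Direct database link to X in database Y" form shown above, and
the function names (flatfile_xrefs, reported_xrefs, missing_xrefs) are made up
for illustration. The sample data is a three-line excerpt of P53396.dat and
the matching parser output.

```python
import re

# DR lines look like "DR GO; GO:0009346; C:citrate lyase complex; TAS."
DR_LINE = re.compile(r"^DR\s+([^;\n]+);\s*([^;\n]+);", re.MULTILINE)
# Printed annotations may be wrapped by the mail client, so allow any
# whitespace (including newlines) between the words.
DBLINK = re.compile(r"Direct database link to (\S+)\s+in\s+database\s+(\S+)")


def flatfile_xrefs(text):
    """Return the set of (database, accession) pairs found in DR lines."""
    return {(db.strip(), acc.strip()) for db, acc in DR_LINE.findall(text)}


def reported_xrefs(output):
    """Return the (database, accession) pairs printed by the loader."""
    return {(db, acc) for acc, db in DBLINK.findall(output)}


def missing_xrefs(flatfile_text, parser_output):
    """Return DR entries of the flat file that the parser did not report."""
    return sorted(flatfile_xrefs(flatfile_text) - reported_xrefs(parser_output))


# Excerpt of P53396.dat (hypothetical reduced sample, three DR lines only).
sample_flatfile = """\
DR MIM; 108728; -.
DR GO; GO:0009346; C:citrate lyase complex; TAS.
DR InterPro; IPR002020; Citrate_synth.
"""

# The corresponding lines of the loader output, wrapped as in this mail.
sample_output = """\
Annotation dblink stringified value Direct database link to 108728 in
database MIM
Annotation dblink stringified value Direct database link to IPR002020 in
database InterPro
"""

print(missing_xrefs(sample_flatfile, sample_output))
# prints [('GO', 'GO:0009346')]
```

Run against the full files, a non-empty result would list exactly those DR
entries that were lost between the flat file and the annotation collection.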
Cheers
Andreas