[Bioperl-l] getting proteins matching GO
Nathan (Nat) Goodman
natg at shore.net
Mon Nov 8 10:29:13 EST 2004
Hi Pedro
I'll send the data in a separate message so as not to bombard the mailing
list with a large attachment. If any other readers want the data, let me
know, and I will be happy to forward.
The data will come as a gzipped tar file containing two files created from
the gene_ontology.obo file downloaded from the EBI GO site on Oct 26, 2004.
go_defs.txt
go_descendants.txt
--------------------
go_defs.txt contains one row per GO term. The fields are
go_id -- e.g., GO:0007267
namespace -- one of process, function, component
definition -- long description of the term
name -- short description of the term, e.g. cell-cell signaling
--------------------
go_descendants.txt contains one row for each term and each 'descendant' of
the term, where a descendant is a term which is subordinate to the given
one. Note that the term itself is NOT considered a descendant for this
purpose. The fields are
go_id -- e.g., GO:0007267
namespace -- one of process, function, component
descendant -- the go_id of one descendant
--------------------
These are dumps from a relational database and the format is optimized for
that purpose. To process the data in Perl, I would convert
go_descendants.txt into a form with one line per go_id, with all descendants
scrunched onto the same line separated by comma or space.
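
A minimal sketch of that conversion, written in Python rather than Perl so it
is easy to check; the same logic maps onto a Perl hash of arrays. It assumes
the fields in go_descendants.txt are tab-separated, which is a guess about the
dump format. The helper names and the sample rows are made up for illustration
and do not reflect real GO relationships.

```python
from collections import defaultdict


def group_descendants(lines, sep="\t"):
    """Map each go_id to the list of its descendants.

    Each input row is expected to hold: go_id, namespace, descendant.
    The tab separator is an assumption about the dump format.
    """
    grouped = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        fields = line.split(sep)
        go_id, descendant = fields[0], fields[2]
        grouped[go_id].append(descendant)
    return grouped


def format_grouped(grouped, joiner=","):
    """Return one output line per go_id: go_id, a tab, then the joined descendants."""
    return [f"{go_id}\t{joiner.join(descs)}" for go_id, descs in grouped.items()]


# Placeholder rows for illustration only; not real ontology relationships.
sample_rows = [
    "GO:0007267\tprocess\tGO:0000001\n",
    "GO:0007267\tprocess\tGO:0000002\n",
    "GO:0000003\tprocess\tGO:0000004\n",
]

grouped = group_descendants(sample_rows)
for out_line in format_grouped(grouped):
    print(out_line)
```

With the real file, the rows would come from something like
open("go_descendants.txt") in place of sample_rows. A term with no
descendants has no rows in the dump, so it will not appear in the output.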
Best,
Nat
> -----Original Message-----
> From: Pedro Antonio Reche [mailto:reche at research.dfci.harvard.edu]
> Sent: Monday, November 08, 2004 5:24 AM
> To: natg at shore.net
> Cc: Bioperl
> Subject: Re: [Bioperl-l] getting proteins matching GO
>
> Dear Nathan, thanks a lot for your help. As you mention I
> wish to collect all proteins subordinate to a given term.
> I am interested in retrieving the proteins for several terms
> (all related to the immune system), which I have not yet
> defined entirely. Therefore, I guess that it will be
> easier if you could send me the file you indicated. I have
> just retrieved the gene_association.goa_uniprot.gz file.
> Thanks for the tip.
> I am looking forward to hearing from you.
> Best,