[Bioperl-l] getting proteins matching GO

Nathan (Nat) Goodman natg at shore.net
Mon Nov 8 10:29:13 EST 2004


Hi Pedro

I'll send the data in a separate message so as not to bombard the mailing
list with a large attachment.  If any other readers want the data, let me
know, and I will be happy to forward it.

The data will come as a gzipped tar file containing two files created from
the gene_ontology.obo file downloaded from the EBI GO site on Oct 26, 2004.

go_defs.txt
go_descendants.txt
--------------------
go_defs.txt contains one row per GO term. The fields are

go_id -- e.g., GO:0007267
namespace -- one of process, function, component
definition -- long description of the term
name -- short description of the term, e.g. cell-cell signaling
--------------------
go_descendants.txt contains one row for each term and each 'descendant' of
the term, where a descendant is a term that is subordinate to the given
one.  Note that the term itself is NOT considered a descendant for this
purpose.  The fields are

go_id -- e.g., GO:0007267
namespace -- one of process, function, component
descendant -- the go_id of one descendant
--------------------

These are dumps from a relational database, and the format is optimized for
that purpose.  To process the data in Perl, I would convert
go_descendants.txt into a form with one line per go_id, with all descendants
scrunched onto the same line, separated by commas or spaces.
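
A minimal sketch of that conversion follows.  It assumes the fields are
tab-separated (the delimiter is not stated above), and the subroutine name
group_descendants is made up for illustration.  It drops the namespace
column and writes each go_id, a tab, and then its descendants joined by
commas.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read rows of go_descendants.txt from an open
# filehandle and return one string per go_id.
# ASSUMPTION: fields are tab-separated, in the order
#   go_id, namespace, descendant
sub group_descendants {
    my ($in) = @_;
    my (%desc, @order);
    while (my $line = <$in>) {
        $line =~ s/\r?\n\z//;                # strip the line ending
        next unless $line =~ /\S/;           # skip blank lines
        my ($go_id, $namespace, $descendant) = split /\t/, $line;
        next unless defined $descendant;     # skip malformed rows
        # remember the order in which go_ids first appear
        push @order, $go_id unless exists $desc{$go_id};
        push @{ $desc{$go_id} }, $descendant;
    }
    # one line per go_id: the id, a tab, then comma-separated descendants
    return map { join("\t", $_, join(',', @{ $desc{$_} })) } @order;
}

# Run as: perl group_descendants.pl go_descendants.txt > grouped.txt
if (@ARGV) {
    my $file = shift @ARGV;
    open my $in, '<', $file or die "cannot open $file: $!\n";
    print "$_\n" for group_descendants($in);
    close $in;
}
```

Reading the grouped file back is then a single split on the tab followed
by a split on commas, which gives a hash from each go_id to the list of
its descendants.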

Best,
Nat

> -----Original Message-----
> From: Pedro Antonio Reche [mailto:reche at research.dfci.harvard.edu] 
> Sent: Monday, November 08, 2004 5:24 AM
> To: natg at shore.net
> Cc: Bioperl
> Subject: Re: [Bioperl-l] getting proteins matching GO
> 
> Dear Nathan, thanks a lot for your help.  As you mention, I 
> wish to collect all proteins subordinate to a given term. 
> There are several terms for which I am interested in retrieving 
> the proteins (all related to the immune system), and I have 
> not defined them entirely. Therefore, I guess that it will be 
> easier if you could send me the file you indicated. I have 
> just retrieved the gene_association.goa_uniprot.gz file.
> Thanks for the tip.
> I am looking forward to hearing from you.
> Best,



