[Bioperl-l] COG software?

Rick Westerman westerman@purdue.edu
Mon, 21 Jan 2002 14:52:12 -0500


Jason Stajich wrote:
>You can download the COG proteins from ncbi in the dir /pub/COG/COGs.  So
>you can get all the protein sequences that make up the COG - just
>blastx/fastxy your unfinished genomic sequence against these.

      The "just" part is a bit of a simplification. For performance reasons
the COGs need to be bundled into one db and my sequences into one
db/dataset.  After that comes the problem of working out, in an automated
manner, which of my sequence(s) matched which COG(s), how many times, and
in which functional groups.
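
To make the parsing step concrete, here is a minimal sketch in Python. It
assumes the search was run with 12-column tab-separated output (query id in
column 1, subject id in column 2, e-value in column 11), and that two lookup
tables are already in memory: protein id to COG, and COG to functional
category letters. The function name, the table names and the e-value cutoff
are all hypothetical and not taken from any existing tool.

```python
from collections import Counter, defaultdict


def tally_cog_hits(blast_lines, protein_to_cog, cog_to_function, max_evalue=1e-5):
    """Count hits per query per COG, and total hits per functional category.

    blast_lines     -- iterable of tab-separated hit lines (12 columns assumed)
    protein_to_cog  -- dict: subject protein id -> COG id (assumed to exist)
    cog_to_function -- dict: COG id -> string of category letters (assumed)
    """
    hits_per_query = defaultdict(Counter)
    function_totals = Counter()

    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete tabular hit line
        query, subject = fields[0], fields[1]
        evalue = float(fields[10])
        if evalue > max_evalue:
            continue  # hit too weak to count
        cog = protein_to_cog.get(subject)
        if cog is None:
            continue  # subject protein is not assigned to any COG
        hits_per_query[query][cog] += 1
        # A COG may carry more than one category letter; count each one.
        for letter in cog_to_function.get(cog, ""):
            function_totals[letter] += 1

    return hits_per_query, function_totals
```

The first returned value answers "which sequence matched which COG, and how
many times"; the second gives the totals per functional group. Keeping only
the best hit per query, rather than counting every hit under the cutoff,
would be a reasonable alternative design.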

      One can imagine a program with a nice set of command line parameters
to filter and sort the results by clade, function, COG description, etc.
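
As an illustration only, such a command line could be sketched in Python
with argparse. Every option name below is hypothetical, rows are assumed to
be (COG id, category letters, hit count) tuples, and filtering by clade is
left out because it would need a genome-to-clade table that is not described
here.

```python
import argparse


def build_parser():
    """Build a hypothetical command-line interface for a COG hit summary."""
    parser = argparse.ArgumentParser(
        description="Summarize COG hits (illustrative interface only).")
    parser.add_argument("hits_file", help="tabular search output to summarize")
    parser.add_argument("--max-evalue", type=float, default=1e-5,
                        help="ignore hits with a larger e-value")
    parser.add_argument("--function", action="append", default=[],
                        help="keep only this category letter (repeatable)")
    parser.add_argument("--sort-by", choices=["cog", "function", "count"],
                        default="count", help="column to sort on")
    parser.add_argument("--limit", type=int, default=None,
                        help="show at most this many rows")
    return parser


def select_rows(rows, functions=(), sort_by="count", limit=None):
    """Filter, sort and truncate (cog, function_letters, count) rows."""
    if functions:
        wanted = set(functions)
        # Keep a row if any of its category letters was requested.
        rows = [r for r in rows if wanted & set(r[1])]
    if sort_by == "count":
        rows = sorted(rows, key=lambda r: r[2], reverse=True)  # largest first
    elif sort_by == "function":
        rows = sorted(rows, key=lambda r: r[1])
    else:
        rows = sorted(rows, key=lambda r: r[0])
    return rows[:limit] if limit is not None else rows
```

A call such as build_parser().parse_args(["hits.tsv", "--function", "H",
"--limit", "10"]) yields the values to pass on to select_rows().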

      Jason did give some nice code on how to bundle the COGs into a db,
along with other advice, which I appreciate.  If I end up 'rolling my own'
(i.e., writing a program; I am about half done) I will certainly distribute
it to the group.



-- Rick

Rick Westerman
westerman@purdue.edu

Phone: (765) 494-0505                         FAX: (765) 496-7255
S049 WSLR bldg. Purdue Univ. W. Lafayette, IN 47907-1153

Bioinformatics specialist at the Genomics Initiative.
Part time system manager of Biochemistry department.

http://www.biochem.purdue.edu/~westerm