[Bioperl-l] COG software?
Rick Westerman
westerman@purdue.edu
Mon, 21 Jan 2002 14:52:12 -0500
Jason Stajich wrote:
>You can download the COG proteins from ncbi in the dir /pub/COG/COGs. So
>you can get all the protein sequences that make up the COG - just
>blastx/fastxy your unfinished genomic sequence against these.
The "just" part is a bit of a simplification. For performance reasons
the COGs need to be bundled into one db and my sequences into one
db/dataset. After that comes the problem of parsing which of my
sequences matched which COGs, how many times, and in what functional
groups, all in an automated manner.
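As a rough illustration of the parsing step, here is a minimal Python
sketch. It assumes the search results have been saved as 12-column
tab-separated hits (query id, subject id, ..., e-value, bit score), and
that two lookup tables already exist: protein_to_cog (COG protein id to
COG id) and cog_to_function (COG id to a string of one-letter
functional category codes). Those table names, the function name, and
the e-value cutoff are assumptions made for the example, not part of
any COG distribution or of Jason's code.

```python
from collections import Counter, defaultdict


def tally_cog_hits(blast_lines, protein_to_cog, cog_to_function,
                   max_evalue=1e-5):
    """Tally hits per query per COG, and total hits per functional group.

    blast_lines: iterable of 12-column tab-separated hit lines.
    protein_to_cog: hypothetical dict, subject protein id -> COG id.
    cog_to_function: hypothetical dict, COG id -> category letters.
    """
    hits_per_query = defaultdict(Counter)   # query id -> Counter of COG ids
    hits_per_function = Counter()           # category letter -> hit count

    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue                        # skip blanks and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue                        # not a complete hit line
        query_id, subject_id = fields[0], fields[1]
        evalue = float(fields[10])
        if evalue > max_evalue:
            continue                        # too weak to count
        cog = protein_to_cog.get(subject_id)
        if cog is None:
            continue                        # subject is not assigned to a COG
        hits_per_query[query_id][cog] += 1
        # A COG may carry several category letters; count each one.
        for letter in cog_to_function.get(cog, ""):
            hits_per_function[letter] += 1

    return hits_per_query, hits_per_function
```

Called with an open results file as blast_lines, this returns one
Counter per query sequence plus an overall Counter keyed by functional
category, which is enough to answer "which sequence matched which COG
how many times". Note that it counts every hit line, so several HSPs
against the same protein are counted separately.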
One can imagine a program with a nice set of command line parameters
to limit and sort the results by clade, function, COG description, etc.
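To make that idea concrete, here is a small Python sketch of such a
command line front end using the standard argparse module. The option
names (--function, --describe, --min-hits, --sort-by) and the row
layout (dicts with "cog", "function", "description" and "hits" keys)
are invented for illustration only. Limiting by clade is left out
because it depends on how the species data would be stored.

```python
import argparse


def build_parser():
    """Build a parser with hypothetical limit/sort options."""
    parser = argparse.ArgumentParser(
        description="Filter and sort a COG hit summary (illustrative sketch).")
    parser.add_argument("--function",
                        help="keep only COGs in this functional category")
    parser.add_argument("--describe",
                        help="keep only COGs whose description contains this text")
    parser.add_argument("--min-hits", type=int, default=1,
                        help="keep only COGs with at least this many hits")
    parser.add_argument("--sort-by", choices=["cog", "function", "hits"],
                        default="hits", help="column to sort on")
    return parser


def select_rows(rows, args):
    """Apply the limits in args to rows, then sort the survivors."""
    kept = []
    for row in rows:
        if row["hits"] < args.min_hits:
            continue
        if args.function and args.function not in row["function"]:
            continue
        if args.describe and \
                args.describe.lower() not in row["description"].lower():
            continue
        kept.append(row)
    # Most hits first when sorting by hits; alphabetical otherwise.
    return sorted(kept, key=lambda r: r[args.sort_by],
                  reverse=(args.sort_by == "hits"))
```

A script would call build_parser().parse_args() and pass the result,
together with the summary rows, to select_rows() before printing.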
Jason did give some nice code on how to bundle the COGs into a db, and
other advice, which I appreciate. If I end up 'rolling my own' (i.e.,
writing a program -- I am about half done) I will certainly distribute
it to the group.
-- Rick
Rick Westerman
westerman@purdue.edu
Phone: (765) 494-0505 FAX: (765) 496-7255
S049 WSLR bldg. Purdue Univ. W. Lafayette, IN 47907-1153
Bioinformatics specialist at the Genomics Initiative.
Part time system manager of Biochemistry department.
http://www.biochem.purdue.edu/~westerm