[Bioperl-l] Newbie Questions: bioperl, bioperl-db, and GO
Sean Davis
sdavis2 at mail.nih.gov
Thu Apr 14 02:55:56 EDT 2005
On Apr 14, 2005, at 12:26 AM, jjmail at mac.com wrote:
> Question 1:
>
> I am brand new to bioperl and the related projects so please forgive
> my ignorance on this. I have a large list of protein names and I would
> like to use bioperl to get the corresponding Gene Ontology (GO)
> information for each protein.
>
> So far I have installed bioperl, BioSQL, and bioperl-db and uploaded
> the taxonomy and GO information into BioSQL. I am having a really hard
> time figuring out how to get the GO information out of the database.
> If anyone knows the right doc to read or has a simple example program
> that I could see that would be really useful.
>
I see that Hilmar took a stab at answering your question on the details
of GO and BioSQL.
> Question 2:
>
> I have collected protein expression data for various states and I
> would like to cluster the data based on GO information for a start and
> then if possible use bioperl's ability to analyze mRNA array data to
> analyze the protein data. Does this seem reasonable? Where should I
> start looking to figure out how to do this?
>
This may reflect a bit of my own bias, but if you are looking at
expression data (as in arrays, etc.), then I think the better tool to
spend time with is Bioconductor. It is a collection of packages written
for the R programming language (which you would need to install). Using
Bioconductor, you can use the annotation-building package (AnnBuilder)
to make an annotation package for all of the genes in your experiment.
The annotation package you create contains the GO information, biological
pathways, chromosome locations, etc. Then you can use any one of
dozens of normalization and analysis or clustering methods to cluster
based on whatever you like, including some GO-based clustering.
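
To make the idea of GO-based clustering a little more concrete, here is a
minimal sketch in plain Python. It is not Bioconductor or AnnBuilder code, and
it is only one of many ways to do this: it groups proteins whose sets of GO
terms overlap, using Jaccard similarity and a simple single-linkage merge. The
protein names, the protein-to-GO pairings, the 0.5 threshold, and the function
names are all made up for illustration.

```python
from itertools import combinations

# Hypothetical protein -> GO term sets. The pairings are invented for this
# example; the GO IDs are used here only as opaque labels.
annotations = {
    "protA": {"GO:0006915", "GO:0008219"},
    "protB": {"GO:0006915", "GO:0008219", "GO:0012501"},
    "protC": {"GO:0006412"},
    "protD": {"GO:0006412", "GO:0043043"},
}


def jaccard(a, b):
    """Jaccard similarity of two sets: shared terms / all terms (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_go(annotations, threshold=0.5):
    """Group proteins whose GO-term sets have Jaccard similarity >= threshold.

    Single linkage: if A is similar to B and B to C, all three share a group.
    """
    parent = {name: name for name in annotations}

    def find(x):
        # Follow parent links to the group representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for p, q in combinations(annotations, 2):
        if jaccard(annotations[p], annotations[q]) >= threshold:
            parent[find(p)] = find(q)

    clusters = {}
    for name in annotations:
        clusters.setdefault(find(name), []).append(name)
    # Sort members and groups so the result is deterministic.
    return sorted(sorted(members) for members in clusters.values())


clusters = cluster_by_go(annotations)
print(clusters)  # [['protA', 'protB'], ['protC', 'protD']]
```

With real data you would replace the hard-coded dictionary with annotations
pulled from your own source, and you would probably want a similarity measure
that takes the GO hierarchy into account rather than plain set overlap.
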
Perl is just not the most natural tool for doing high-level, vectorized
math. Bioconductor is built specifically for exploring array data and
other high-throughput data.

Check out the site (http://www.bioconductor.org). There is also a
mailing list for Bioconductor.
Sean