[Biojava-dev] Biojava-Mapreduce paradigm...

Prasanna Bala balkiprasanna1984 at gmail.com
Wed Mar 27 19:48:34 UTC 2013


Hi All...

I am Prasanna. I have been working in the area of machine learning for more
than 5 years, and I would like some clarification about contributing to the
BioJava community. I have more than 2 years of experience in MapReduce
programming, which I have used to apply various machine learning algorithms
with distributed computing. I have also used Mahout for various projects,
ranging from stochastic gradient descent to Bayesian networks. I have a lot
of interesting ideas that I would like to contribute to the bioinformatics
community. I am very much interested in implementing MapReduce algorithms
for bioinformatics, which should be very useful for people pursuing research
in big data analytics. I want to develop and release libraries that, like
Mahout, use Hadoop for distributed computing. This can reduce the complexity
of running code on Amazon Elastic MapReduce for large-scale datasets.

Biomedical text mining (rule-based and ML algorithms such as CRF, maximum
entropy models, and HMM):
1) Named entity recognition.
2) Information extraction.
3) Building networks and pathways from literature.

I have many more ideas for using ML algorithms in the bioinformatics
domain. I would like to know whether we can contribute our ideas using this
architecture.

Cheers,
Prasanna.


