[Biojava-l] Questions

David Huen smh1008@cus.cam.ac.uk
Sat, 16 Mar 2002 18:38:27 +0000 (GMT)


On Sat, 16 Mar 2002, Andrey Zinovyev wrote:

Hi,
> I have several questions:
> 
> 1) I've read in the API about the possibilities for organizing databases and
> accessing standard ones. I didn't understand exactly whether
> there is a way to automatically extract sequences from, say, GenBank. BTW,

If you mean "Does BioJava have a way of accessing some public repository
like EMBL and NCBI online to retrieve such a sequence?", then I don't
think so.  

The closest thing we have to this is our support for DAS which allows us
to download sequence and annotations from people offering DAS services.
But it does rely on the availability of a DAS server for the sequences
you are after.  Fortunately, such services exist for many genomes although
I am uncertain whether all of them are ready for (or want) high volume
traffic.

If you are asking whether we can read and process files in such formats,
then I think we can safely answer yes.
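
To give a feel for what reading such a flat file involves, here is a
minimal stand-alone sketch in plain Java.  It is NOT BioJava's parser or
API: the class GenBankSketch and its parse method are hypothetical names
made up for illustration, and it assumes a simplified GenBank-style record
(a LOCUS line, an ORIGIN block of numbered sequence lines, and a "//"
terminator), ignoring features and all other annotation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of BioJava.
class GenBankSketch {

    // Maps each LOCUS name to its sequence, in file order.
    static Map<String, String> parse(String text) {
        Map<String, String> records = new LinkedHashMap<>();
        String name = null;
        StringBuilder seq = new StringBuilder();
        boolean inOrigin = false;

        for (String line : text.split("\\R")) {
            if (line.startsWith("LOCUS")) {
                // The second whitespace-separated token is the locus name.
                String[] tokens = line.trim().split("\\s+");
                name = tokens.length > 1 ? tokens[1] : "";
                seq.setLength(0);
                inOrigin = false;
            } else if (line.startsWith("ORIGIN")) {
                inOrigin = true;
            } else if (line.startsWith("//")) {
                // End of record: store what was collected.
                if (name != null) {
                    records.put(name, seq.toString());
                }
                name = null;
                inOrigin = false;
            } else if (inOrigin) {
                // Drop position numbers and spaces, keep only residue letters.
                seq.append(line.replaceAll("[^A-Za-z]", ""));
            }
        }
        return records;
    }

    public static void main(String[] args) {
        String demo = "LOCUS       DEMO01   12 bp    DNA\n"
                    + "DEFINITION  made-up record.\n"
                    + "ORIGIN\n"
                    + "        1 acgtacgtac gt\n"
                    + "//\n";
        parse(demo).forEach((id, s) -> System.out.println(id + "\t" + s));
    }
}
```

A real parser also has to cope with feature tables, multi-line qualifiers
and malformed records, which is where a library earns its keep.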

> is there a kind of querying language for GenBank or EMBL?

> 
> 2) Has anybody implemented methods such as principal component analysis
> (singular value decomposition) or self-organizing maps?

There was something for Support Vector Machines and I think there is
someone doing Java work on PCA although I cannot be certain that the work
will result in the committing of a new package (it's one thing to write
code for one's own use, entirely another to commit it for public use!).

> 
> 3) Are there any Java packages in the BioJava framework for the analysis of
> microarray data?
> 
Not as of now.  There is some nascent interest though.  On my part, I
might be tempted to proceed if I knew that ArrayExpress would emerge with
an overly restrictive licence.  We try not to reinvent the wheel  (would
this statement be an invitation to a troll? :-) ).

Does anyone know what licence ArrayExpress will adopt?

Regards,
David Huen