[Biojava-l] Creating an alignment object

Nathan S. Haigh n.haigh at sheffield.ac.uk
Mon May 15 09:24:27 UTC 2006


That's right, clustalw can output in several formats including fasta. It
would be nice to have Biojava able to read and write the clustalw format as
it is a widely used format. How easy is it to write something like this?
Maybe when I start to learn more about Java I could have a go at doing this.
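
For a sense of scale, a bare-bones reader for the CLUSTAL .aln layout fits in a few dozen lines of plain Java. The sketch below is not BioJava code: the class name ClustalAlnSketch and its parse method are made up for illustration. It assumes the usual layout (a first line starting with "CLUSTAL", blocks of "name residues [count]" lines, and conservation lines that begin with whitespace) and returns plain gapped strings rather than Alignment objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of BioJava.
final class ClustalAlnSketch {

    // Parses the text of a .aln file into name -> gapped sequence,
    // keeping the order in which the names first appear.
    static Map<String, String> parse(String text) {
        String[] lines = text.split("\\r?\\n");
        if (lines.length == 0 || !lines[0].startsWith("CLUSTAL")) {
            throw new IllegalArgumentException("missing CLUSTAL header line");
        }
        Map<String, StringBuilder> rows = new LinkedHashMap<>();
        for (int i = 1; i < lines.length; i++) {
            String line = lines[i];
            // Skip blank separators and conservation lines (assumed to be indented).
            if (line.trim().isEmpty() || Character.isWhitespace(line.charAt(0))) {
                continue;
            }
            // fields[0] = name, fields[1] = residues; an optional trailing count is ignored.
            String[] fields = line.trim().split("\\s+");
            if (fields.length < 2) {
                throw new IllegalArgumentException("malformed line: " + line);
            }
            rows.computeIfAbsent(fields[0], k -> new StringBuilder()).append(fields[1]);
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, StringBuilder> e : rows.entrySet()) {
            result.put(e.getKey(), e.getValue().toString());
        }
        return result;
    }
}
```

Writing the format back out would be the reverse: print the header, then the rows in fixed-width blocks. A proper BioJava reader would still have to turn these strings into symbols and gaps.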

Nath

> -----Original Message-----
> From: mark.schreiber at novartis.com [mailto:mark.schreiber at novartis.com]
> Sent: 15 May 2006 10:16
> To: Richard Holland
> Cc: biojava-l at lists.open-bio.org; n.haigh at sheffield.ac.uk
> Subject: Re: [Biojava-l] Creating an alignment object
> 
> I think ClustalW can output alignments as fasta alignment format which
> biojava definitely can read.
> 
> - Mark
> 
> 
> 
> 
> 
> Richard Holland <richard.holland at ebi.ac.uk>
> Sent by: biojava-l-bounces at lists.open-bio.org
> 05/12/2006 04:34 PM
> 
> 
>         To:     n.haigh at sheffield.ac.uk
>         cc:     biojava-l at lists.open-bio.org, (bcc: Mark
> Schreiber/GP/Novartis)
>         Subject:        Re: [Biojava-l] Creating an alignment object
> 
> 
> Sorry for the delay in replying - I had to leave work a bit early
> yesterday.
> 
> > Nope, I don't need to generate an alignment, I already have an alignment in
> > a file created by third party software (clustalw).
> 
> There is nothing that I know of in BioJava that reads ClustalW files
> directly into Alignment objects. (If someone else knows different,
> please correct me). There are certainly methods in BioJava which read
> the alignments from ClustalW into a set of String objects, each one
> representing a member sequence (see SequenceAlignmentSAXParser), but I
> don't know of anything more detailed than that.
> 
> The third-party package called Strap which I mentioned yesterday happily
> reads/writes many of the major alignment formats, and has wrappers for
> running ClustalW and other aligners programmatically and reading back in
> the results, so it is definitely worth a look. You can use a lot of its
> functions without having to run the GUI, including reading/writing
> various alignment formats.
> 
> >
> > In fact, the app I'd
> > eventually like to have written in Java would include some sort of wrapper
> > for clustalw in order to construct the alignments from a set of unaligned
> > sequences, but algorithms implemented in Biojava would also be a welcome
> > addition to the app.
> 
> If you want to wrap clustalw, the simplest way would be to create
> Sequence objects in BioJava, write them out to Fasta using the BioJava
> sequence IO tools, use the Java 'system' command (or one of the
> alternatives to it) to run ClustalW. However you still then have the
> problem of reading the output back in again.
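
A rough sketch of the wrapping steps described above, in plain Java. Java has no literal 'system' call; Runtime.exec and ProcessBuilder are the usual ways to start an external program, and ProcessBuilder is used here. Everything named ClustalwRunnerSketch is hypothetical, the FASTA is written by hand rather than with the BioJava sequence IO tools, and the ClustalW flags (-INFILE, -ALIGN, -OUTPUT=FASTA, -OUTFILE) are as I remember them from the ClustalW command-line help, so check them against your installed version.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of BioJava.
final class ClustalwRunnerSketch {

    // Renders name -> sequence pairs as FASTA text, one record per entry.
    static String toFasta(Map<String, String> sequences) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : sequences.entrySet()) {
            sb.append('>').append(e.getKey()).append('\n')
              .append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    // Builds the argument list; the flag names are an assumption to verify locally.
    static List<String> buildCommand(String executable, Path input, Path output) {
        return Arrays.asList(
                executable,
                "-INFILE=" + input,
                "-ALIGN",
                "-OUTPUT=FASTA",
                "-OUTFILE=" + output);
    }

    // Writes the input file, runs the aligner and returns its exit code.
    static int align(String executable, Map<String, String> sequences,
                     Path input, Path output)
            throws IOException, InterruptedException {
        Files.write(input, toFasta(sequences).getBytes(StandardCharsets.UTF_8));
        Process process = new ProcessBuilder(buildCommand(executable, input, output))
                .inheritIO() // show ClustalW's own messages on our console
                .start();
        return process.waitFor();
    }
}
```

Asking ClustalW for FASTA output sidesteps the read-back problem to some extent, since the aligned FASTA can be read like any other FASTA file.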
> 
> The classes in org.biojava.bio.alignment that I mentioned yesterday
> implement several useful alignment algorithms which you can use as an
> alternative to ClustalW.
> 
> > But first things first.
> > If I didn't have any sequences or an alignment in any files, what is the
> > easiest way to get an alignment object in Java to have a play around with?
> 
> Make an instance of FlexibleAlignment from org.biojava.bio.alignment,
> and use its methods to add sequences to it. It doesn't do any aligning
> itself - it is just a placeholder to contain sequences and information
> about how they align. You have to use its methods to add and remove
> sequences from the alignment, to add/remove gaps and deletions, and get
> things like consensus sequences etc.
> 
> Technically I suppose you could use FlexibleAlignment in conjunction
> with SequenceAlignmentSAXParser to read alignment members as strings,
> construct sequences based on them, and add them to the alignment object,
> but I haven't tried this myself. It'd probably require some extra
> processing to convert the dashes (gaps) in the inputted strings into
> proper gaps in the alignment.
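
The "extra processing" for dashes could look something like the sketch below: split each gapped string into the ungapped residues plus a list of gap runs, which is roughly the information needed to place a sequence into an alignment. This is plain Java with made-up names (GapSplitSketch, ungapped, gapRuns). It assumes '-' is the only gap character and does not touch FlexibleAlignment itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of BioJava.
final class GapSplitSketch {

    // Returns the row with every '-' removed.
    static String ungapped(String row) {
        return row.replace("-", "");
    }

    // Returns each run of '-' as {start, length}, where start is the
    // 0-based column in the gapped row.
    static List<int[]> gapRuns(String row) {
        List<int[]> runs = new ArrayList<>();
        int i = 0;
        while (i < row.length()) {
            if (row.charAt(i) == '-') {
                int start = i;
                while (i < row.length() && row.charAt(i) == '-') {
                    i++;
                }
                runs.add(new int[] {start, i - start});
            } else {
                i++;
            }
        }
        return runs;
    }
}
```

For the row "AC--GT-A" this gives the residues "ACGTA" and two runs, one of length 2 starting at column 2 and one of length 1 starting at column 6.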
> 
> > Is there a way to just "magically" create a default alignment of say 5
> > sequences with 20 positions?
> 
> You'd have to manually create 5 sequences yourself and add them to a
> FlexibleAlignment as described above.
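
For something to play with before touching any files, the 5 sequences can simply be generated. The sketch below is plain Java with a made-up class name (ToyAlignmentSketch). It produces equal-length random DNA strings, which amount to a trivial gap-free alignment. Turning them into BioJava sequences and adding them to a FlexibleAlignment is left out, since that part depends on the BioJava API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical helper, not part of BioJava.
final class ToyAlignmentSketch {

    // Builds 'rows' random DNA strings of 'columns' characters each,
    // named seq1, seq2, ... A fixed seed makes the result repeatable.
    static Map<String, String> randomDna(int rows, int columns, long seed) {
        final String alphabet = "ACGT";
        Random rng = new Random(seed);
        Map<String, String> alignment = new LinkedHashMap<>();
        for (int r = 1; r <= rows; r++) {
            StringBuilder sb = new StringBuilder(columns);
            for (int c = 0; c < columns; c++) {
                sb.append(alphabet.charAt(rng.nextInt(alphabet.length())));
            }
            alignment.put("seq" + r, sb.toString());
        }
        return alignment;
    }
}
```

Calling ToyAlignmentSketch.randomDna(5, 20, 42L) gives the 5 by 20 case asked about.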
> 
> cheers,
> Richard
> 
> --
> Richard Holland (BioMart Team)
> EMBL-EBI
> Wellcome Trust Genome Campus
> Hinxton
> Cambridge CB10 1SD
> UNITED KINGDOM
> Tel: +44-(0)1223-494416
> 
> _______________________________________________
> Biojava-l mailing list  -  Biojava-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biojava-l
> 
> 
