[Biojava-l] Alignment objects

Nathan S. Haigh n.haigh at sheffield.ac.uk
Thu Aug 10 08:31:04 UTC 2006


Richard Holland wrote:
> You could change this:
>
> sym.getName().contains("[]")
>
> to this:
>
> AlphabetManager.getGapSymbol().equals(sym)
>
> Frequency calculations can be done quite quickly using DistributionTools:
>
>     Distribution[] dists = DistributionTools.distOverAlignment(algn, true);
>     // true says to include gaps in the statistics
>     // The dists array will have the same number of entries as there
>     // are columns in the alignment.
>     for (int i = 0; i < dists.length; i++) {
>         // i = 0 = first column in alignment
>         Distribution dist = dists[i];
>         // Find out the weight for A in this column.
>         double AWeight = dist.getWeight(DNATools.a());
>         // Find out the weight for gaps in this column.
>         double GapWeight = dist.getWeight(DNATools.getDNA().getGapSymbol());
>     }
>
> cheers,
> Richard
This is definitely getting close to what I need. However, I think I'm
having trouble with alphabets, which is stopping me from using something
like:
AlphabetManager.getGapSymbol().equals(sym)

I am currently creating an alignment like this:
    String alnString1 =
            ">seq1\n" +
            "----FGHIKLMNPQRST\n" +
            ">seq2\n" +
            "ACDEFGHIKLMNPQRST\n";
    BufferedReader br1 = new BufferedReader(new StringReader(alnString1));
    FastaAlignmentFormat faf1 = new FastaAlignmentFormat();
    aln1 = faf1.read(br1);

And I never get true returned from:
AlphabetManager.getGapSymbol().equals(sym)

I assume this is because the mechanisms that are in place for setting
the alphabet of the alignment are not correctly setting the gap symbol.
The program I am writing should be capable of determining the alphabet
of any alignment that is loaded, so it makes sense to change:
AlphabetManager.getGapSymbol().equals(sym)
to:
alignment.getAlphabet().getGapSymbol().equals(sym)

but this doesn't work either. Eventually I'd like my application to be
able to load alignments from several different formats, some of which may
use more than one symbol as the gap, while others have a "default" gap
character. Are there mechanisms in place to attempt to correctly set the
gapSymbol for an alignment? For example, FASTA format alignments should
probably set the gap symbol to the hyphen "-".
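
To make the behaviour I'm after concrete, here is a rough plain-Java
sketch that does not use the BioJava API at all. It treats a configurable
set of characters as gaps and reports the fraction of gaps in each
column. The class name GapColumnSketch, its methods, and the gap sets
FASTA_GAPS and MULTI_GAPS are all hypothetical names made up for this
illustration, and the characters in MULTI_GAPS are only a guess at what
other formats might use.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; none of these names come from BioJava.
final class GapColumnSketch {

    // Hypothetical per-format gap character sets (assumptions, adjust as needed).
    static final String FASTA_GAPS = "-";
    static final String MULTI_GAPS = "-.~";

    /** Collects the sequence rows of a FASTA-formatted alignment, skipping ">" header lines. */
    static List<String> parseFastaRows(String fasta) {
        List<String> rows = new ArrayList<>();
        StringBuilder current = null;
        for (String line : fasta.split("\n")) {
            line = line.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith(">")) {
                // A new header closes the previous record, if there was one.
                if (current != null) {
                    rows.add(current.toString());
                }
                current = new StringBuilder();
            } else if (current != null) {
                current.append(line);
            }
        }
        if (current != null) {
            rows.add(current.toString());
        }
        return rows;
    }

    /** Returns, for each column, the fraction of rows holding one of the gap characters. */
    static double[] gapFractions(List<String> rows, String gapChars) {
        if (rows.isEmpty()) {
            return new double[0];
        }
        int width = rows.get(0).length();
        double[] fractions = new double[width];
        for (String row : rows) {
            if (row.length() != width) {
                throw new IllegalArgumentException("alignment rows differ in length");
            }
            for (int col = 0; col < width; col++) {
                if (gapChars.indexOf(row.charAt(col)) >= 0) {
                    fractions[col] += 1.0;
                }
            }
        }
        for (int col = 0; col < width; col++) {
            fractions[col] /= rows.size();
        }
        return fractions;
    }

    public static void main(String[] args) {
        String aln = ">seq1\n----FGHIKLMNPQRST\n>seq2\nACDEFGHIKLMNPQRST\n";
        double[] f = gapFractions(parseFastaRows(aln), FASTA_GAPS);
        for (int col = 0; col < f.length; col++) {
            System.out.println("column " + col + ": gap fraction " + f[col]);
        }
    }
}
```

With the two sequences above, the first four columns come out half gaps
and the remaining columns have none. Passing MULTI_GAPS instead of
FASTA_GAPS is how I imagine handling formats with more than one gap
character.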

Once again, being new to this, I am probably missing something that is
obvious to you guys.
Thanks for all your time and effort in helping me out.
Nathan