[Bioperl-l] mummer3 output format

Roy Chaudhuri roy.chaudhuri at gmail.com
Thu Mar 1 15:56:36 UTC 2012


Hi Albert,

The show-coords program converts the delta file into a coords file which 
is much easier to parse. It is run automatically if you provide the 
--coords flag to nucmer/promer.
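
As a rough illustration of how little parsing the coords file needs, here is a minimal Python sketch. It assumes nucmer output written with "show-coords -T -H out.delta" (tab-delimited, no header), where the first seven columns are S1, E1, S2, E2, LEN 1, LEN 2 and % IDY, and the last two are the reference and query IDs. The CoordsHit and parse_coords names are made up for this example; promer output has extra columns and would need adjusting.

```python
from typing import Iterable, Iterator, NamedTuple


class CoordsHit(NamedTuple):
    """One alignment row from tab-delimited show-coords output (hypothetical type)."""
    ref_start: int
    ref_end: int
    qry_start: int
    qry_end: int
    ref_aln_len: int
    qry_aln_len: int
    identity: float
    ref_id: str
    qry_id: str


def parse_coords(lines: Iterable[str]) -> Iterator[CoordsHit]:
    """Yield a CoordsHit per row, assuming `show-coords -T -H` on nucmer output."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # skip blank or unexpected lines
        s1, e1, s2, e2, len1, len2 = (int(x) for x in fields[:6])
        # The two sequence IDs are taken from the end of the row, so extra
        # columns added by options such as -l or -c do not shift them.
        yield CoordsHit(s1, e1, s2, e2, len1, len2,
                        float(fields[6]), fields[-2], fields[-1])
```

You would call it on an open file handle, e.g. parse_coords(open("out.coords")), and then fetch the matching subsequences yourself. Note that the coords file only gives you the boundaries of each alignment, not the gap positions.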

There was talk of a BioPerl MUMmer parser a while back but I'm not sure 
if it got anywhere.

You might also look at Mugsy, which uses MUMmer and outputs MAF, so it 
may contain some code that can be recycled. I think it is written in Perl.
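
If you do end up reading the delta file directly (you need the gap positions for MAF, and the coords file does not have them), the format is fairly simple. The sketch below is only my reading of it, checked against your LmjF.34 example: each ">" line names a reference/query pair, the next line starts with the four alignment coordinates, and the integers that follow give the distance to the next indel, ending with 0. A positive value puts a gap in the query and a negative one puts a gap in the reference. The function names are made up, and I have only considered nucmer output.

```python
def parse_delta(lines):
    """Yield (ref_id, qry_id, (s_ref, e_ref, s_qry, e_qry), deltas) per alignment."""
    ref_id = qry_id = None
    coords = None
    deltas = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header for a new reference/query pair: >ref qry ref_len qry_len
            ref_id, qry_id = line[1:].split()[:2]
            coords = None
        elif ref_id is None:
            continue  # file paths and program name before the first ">" line
        elif coords is None:
            # Alignment header: four coordinates, then error counts (ignored here)
            coords = tuple(int(x) for x in line.split()[:4])
            deltas = []
        elif line == "0":
            # A lone 0 terminates the indel list for this alignment
            yield ref_id, qry_id, coords, deltas
            coords = None
        else:
            deltas.append(int(line))


def gap_columns(deltas):
    """Turn delta values into (alignment column, gapped sequence), 1-based."""
    column = 0
    gaps = []
    for d in deltas:
        column += abs(d)
        gaps.append((column, "qry" if d > 0 else "ref"))
    return gaps
```

For the first alignment in your file, the deltas 20, 17, -3, -2 give query gaps at columns 20 and 37 and reference gaps at columns 40 and 42, which matches the first row of your show-aligns output.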

Cheers,
Roy.

On 01/03/2012 15:45, Albert Vilella wrote:
> Hi,
>
> I am trying to understand how to transform Mummer3's output format
> into something I can pipe into another program, like MAF or similar.
> How can I parse the results so that I can then do a write_aln into MAF
> or similar?
>
> Details:
>
> If I run nucmer v.3.23 with the options below, I get an out.delta like this:
>
> ~/MUMmer3.23/nucmer -maxgap $g -l $l $ref $qry
>
> ------------------
> Leishmania_major.LM2.12.dna.toplevel.fa
> LtarParrotTarIIGenomic_TriTrypDB-4.0.fasta
> NUCMER
> >LmjF.34 ULAVAL|LtaPseq521 1866748 641
> 959335 959806 169 640 91 91 0
> 20
> 17
> -3
> -2
> -183
> 5
> 0
> >LmjF.12 ULAVAL|LtaPseq501 675346 1438
> 322990 324081 1436 342 178 178 0
> -45
> -1
> -1
> -1
>
> This doesn't look like any of the formats in t/AlignIO/mummer.t to me.
>
> I can also run:
>
> ~/MUMmer3.23/show-aligns out.delta $region1 $region2
>
> Which gives me something that looks like a blast or exonerate output, like so:
>
> ------
> Leishmania_major.LM2.12.dna.toplevel.fa
> LtarParrotTarIIGenomic_TriTrypDB-4.0.fasta
>
> ============================================================
> -- Alignments between LmjF.34 and ULAVAL|LtaPseq521
>
> -- BEGIN alignment [ +1 959335 - 959806 | +1 169 - 640 ]
>
>
> 959335     cacacgcctcgtagaggtctccttgctttcgcgcggtgc.c.tcacttg
> 169        cacacgcctcgtagagatc.ccctgccttcgcgcgg.gctcttcacttg
>                             ^  ^  ^   ^         ^  ^ ^
>
> 959382     cgcatgcggtagtagaagagaatgctgtgggcccacccagcgtagttgc
> 216        cgcatgcggtagtagaagagaatgctgtgtgcccacccagcgtagttgc
>                                          ^
>
> 959431     caaacagcttccggaaggcctcctgaatgacgttatgatgccgctcgta
> 265        caaacagtttccagaaggcatcctggataacattatgatgccgttcgta
>                    ^    ^      ^     ^  ^  ^           ^
>
> 959480     caagggtgggacaggcgtttttcgtgaggcgcgcagcggggctgctgca
> 314        caggggcggcacaggtgttttccgtaaggcacgtgaagaggtcgttgca
>               ^   ^  ^     ^     ^   ^    ^  ^^^^ ^  ^^ ^
>
> 959529     gagcttccaccttcctctatcgccttta.cggtcgctggcgacacgcct
> 363        gagcctccgtttcccttcaccgcccgcagcgat.gatgatgtcactcct
>                 ^   ^^^ ^   ^^ ^    ^^^ ^  ^ ^ ^  ^^ ^   ^
>
> 959577     ttcttaaccttgagaacctccgcctgcttcctccactccagcagcagat
> 411        ttcttcaccttgagagcctccgcctggttcttccactccaggagaagat
>                  ^         ^          ^   ^          ^  ^
>
> 959626     tatcccgtgagcgggcttcctcttcgggcaacggacaccctggacgaga
> 460        cagtgggtgcgcagacttcttcttcgcgcagtagagaccctgagcgaga
>             ^ ^^^^   ^  ^ ^    ^      ^   ^^^  ^      ^^
>
> 959675     gcgcttacgacccaccgccgtcgcggcgcttggtgcggcaaggtactcc
> 509        acgctttcgacccgccgatgtcacggtgcttgcggtggcaagatactcc
>             ^     ^      ^   ^^   ^   ^     ^^ ^      ^
>
> 959724     accgcaacttgcgccatgtgcgtgtccacggggacaatgtgggtgcggt
> 558        accgaaacctgcgccatgtgtgtgtccacggggacgatgtgggtgcggt
>                 ^   ^           ^              ^
>
> 959773     tgagcgcgaagagcgccacgcagtcagcaacttt
> 607        tgagagcaaagagcgccacgcaatccgccacttt
>                 ^  ^              ^  ^  ^
>
>
> --   END alignment [ +1 959335 - 959806 | +1 169 - 640 ]
>
> ============================================================
>



