[Bioperl-l] mummer3 output format
Roy Chaudhuri
roy.chaudhuri at gmail.com
Thu Mar 1 15:56:36 UTC 2012
Hi Albert,
The show-coords program converts the delta file into a coords file which
is much easier to parse. It is run automatically if you provide the
--coords flag to nucmer/promer.
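As a rough sketch of how the coords output might be read, the Python below assumes show-coords was run on nucmer output with -T (tab-delimited) and -H (no header), and without -l or -c, so that each row has nine columns: the four coordinates, the two alignment lengths, percent identity, and the two sequence IDs. The CoordsHit and parse_coords names are made up for illustration, and promer output has extra columns that this does not handle.

```python
import io
from typing import Iterator, NamedTuple


class CoordsHit(NamedTuple):
    ref_start: int
    ref_end: int
    qry_start: int
    qry_end: int
    ref_aln_len: int
    qry_aln_len: int
    identity: float
    ref_id: str
    qry_id: str


def parse_coords(handle) -> Iterator[CoordsHit]:
    """Yield one CoordsHit per nine-column, tab-delimited data row."""
    for line in handle:
        row = line.rstrip("\n").split("\t")
        # Skip blank lines, header lines, and rows with another column count.
        if len(row) != 9 or not row[0].strip().isdigit():
            continue
        s1, e1, s2, e2, len1, len2 = (int(x) for x in row[:6])
        yield CoordsHit(s1, e1, s2, e2, len1, len2,
                        float(row[6]), row[7], row[8])


# Hypothetical row built from the first alignment in the quoted delta file.
# The identity value (80.00) is a placeholder, not real show-coords output.
example = ("959335\t959806\t169\t640\t472\t472\t80.00\t"
           "LmjF.34\tULAVAL|LtaPseq521\n")
hits = list(parse_coords(io.StringIO(example)))
```

With a real file, open it and pass the handle to parse_coords in place of the StringIO object.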
There was talk of a BioPerl MUMmer parser a while back but I'm not sure
if it got anywhere.
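In the meantime, the delta file quoted below is regular enough to read directly. This is only a sketch in Python, assuming the layout seen in that example: a ">" header line with the two sequence IDs and their lengths, then for each alignment a line of seven integers (the four coordinates followed by three counts) and a list of signed integers ending in 0. DeltaAlignment and parse_delta are hypothetical names, and the code only collects the numbers without rebuilding the gapped alignment.

```python
import io
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class DeltaAlignment:
    ref_id: str
    qry_id: str
    ref_len: int
    qry_len: int
    ref_start: int
    ref_end: int
    qry_start: int
    qry_end: int
    errors: int
    deltas: List[int] = field(default_factory=list)


def parse_delta(handle) -> Iterator[DeltaAlignment]:
    """Yield one DeltaAlignment per alignment record in a delta file."""
    ref_id = qry_id = None
    ref_len = qry_len = 0
    current = None
    for line in handle:
        if line.startswith(">"):
            # New sequence pair; flush a record that lacked its closing 0.
            if current is not None:
                yield current
                current = None
            fields = line[1:].split()
            ref_id, qry_id = fields[0], fields[1]
            ref_len, qry_len = int(fields[2]), int(fields[3])
            continue
        fields = line.split()
        if len(fields) == 7 and ref_id is not None:
            # Alignment header: coordinates followed by three counts.
            if current is not None:
                yield current
            n = [int(x) for x in fields]
            current = DeltaAlignment(ref_id, qry_id, ref_len, qry_len,
                                     n[0], n[1], n[2], n[3], n[4])
        elif len(fields) == 1 and current is not None:
            value = int(fields[0])
            if value == 0:
                # A zero closes the list of indel offsets.
                yield current
                current = None
            else:
                current.deltas.append(value)
        # Anything else (file paths, the NUCMER line, blanks) is ignored.
    if current is not None:
        yield current


# The example from the quoted message, with the email quoting removed.
sample = """\
Leishmania_major.LM2.12.dna.toplevel.fa LtarParrotTarIIGenomic_TriTrypDB-4.0.fasta
NUCMER
>LmjF.34 ULAVAL|LtaPseq521 1866748 641
959335 959806 169 640 91 91 0
20
17
-3
-2
-183
5
0
>LmjF.12 ULAVAL|LtaPseq501 675346 1438
322990 324081 1436 342 178 178 0
-45
-1
-1
-1
"""
alns = list(parse_delta(io.StringIO(sample)))
```

Note that the second record has its query start greater than its query end (1436 to 342), so a reverse-strand match can be detected by comparing qry_start and qry_end.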
You might also look at Mugsy, which uses MUMmer and outputs MAF, so it
may contain some code that can be recycled. I think it is written in Perl.
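For a sense of what the MAF side involves, here is a small Python sketch that turns one pair of gapped rows, as printed by show-aligns, into a MAF block. It assumes the usual MAF conventions: "s" lines hold the source name, a zero-based start, the ungapped size, the strand, the source length and the aligned text, and minus-strand starts are counted from the end of the sequence. maf_s_line and maf_block are made-up helper names, and the "." gap character is taken from the show-aligns output quoted below.

```python
def maf_s_line(src, lo, hi, strand, src_size, gapped):
    """Build one MAF 's' line from 1-based forward-strand coordinates.

    lo and hi are the smaller and larger coordinate of the aligned span;
    gapped is the aligned text with '.' (show-aligns style) or '-' gaps.
    """
    text = gapped.replace(".", "-")
    size = len(text) - text.count("-")
    if size != hi - lo + 1:
        raise ValueError("coordinates do not match the ungapped length")
    # MAF starts are zero-based; on '-' they count from the sequence end.
    start = lo - 1 if strand == "+" else src_size - hi
    return f"s {src} {start} {size} {strand} {src_size} {text}"


def maf_block(s_lines):
    """Join 's' lines into one alignment block; the score is left out."""
    return "a\n" + "\n".join(s_lines) + "\n\n"


# First row of the show-aligns output from the quoted message.
ref_row = "cacacgcctcgtagaggtctccttgctttcgcgcggtgc.c.tcacttg"
qry_row = "cacacgcctcgtagagatc.ccctgccttcgcgcgg.gctcttcacttg"

block = "##maf version=1\n" + maf_block([
    maf_s_line("LmjF.34", 959335, 959381, "+", 1866748, ref_row),
    maf_s_line("ULAVAL|LtaPseq521", 169, 215, "+", 641, qry_row),
])
print(block)
```

A full converter would still need to stitch the rows of each alignment together and handle reverse-strand matches, which is why existing code such as Mugsy's is worth a look first.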
Cheers,
Roy.
On 01/03/2012 15:45, Albert Vilella wrote:
> Hi,
>
> I am trying to understand how to transform Mummer3's output format
> into something I can pipe into another program, like MAF or similar.
> How can I parse the results so that I can then do a write_aln into MAF
> or similar?
>
> Details:
>
> If I run nucmer v.3.23 with the options below, I get an out.delta like this:
>
> ~/MUMmer3.23/nucmer -maxgap $g -l $l $ref $qry
>
> ------------------
> Leishmania_major.LM2.12.dna.toplevel.fa
> LtarParrotTarIIGenomic_TriTrypDB-4.0.fasta
> NUCMER
> >LmjF.34 ULAVAL|LtaPseq521 1866748 641
> 959335 959806 169 640 91 91 0
> 20
> 17
> -3
> -2
> -183
> 5
> 0
> >LmjF.12 ULAVAL|LtaPseq501 675346 1438
> 322990 324081 1436 342 178 178 0
> -45
> -1
> -1
> -1
>
> This doesn't look like any of the formats in t/AlignIO/mummer.t to me.
>
> I can also run:
>
> ~/MUMmer3.23/show-aligns out.delta $region1 $region2
>
> Which gives me something that looks like a blast or exonerate output, like so:
>
> ------
> Leishmania_major.LM2.12.dna.toplevel.fa
> LtarParrotTarIIGenomic_TriTrypDB-4.0.fasta
>
> ============================================================
> -- Alignments between LmjF.34 and ULAVAL|LtaPseq521
>
> -- BEGIN alignment [ +1 959335 - 959806 | +1 169 - 640 ]
>
>
> 959335 cacacgcctcgtagaggtctccttgctttcgcgcggtgc.c.tcacttg
> 169 cacacgcctcgtagagatc.ccctgccttcgcgcgg.gctcttcacttg
> ^ ^ ^ ^ ^ ^ ^
>
> 959382 cgcatgcggtagtagaagagaatgctgtgggcccacccagcgtagttgc
> 216 cgcatgcggtagtagaagagaatgctgtgtgcccacccagcgtagttgc
> ^
>
> 959431 caaacagcttccggaaggcctcctgaatgacgttatgatgccgctcgta
> 265 caaacagtttccagaaggcatcctggataacattatgatgccgttcgta
> ^ ^ ^ ^ ^ ^ ^
>
> 959480 caagggtgggacaggcgtttttcgtgaggcgcgcagcggggctgctgca
> 314 caggggcggcacaggtgttttccgtaaggcacgtgaagaggtcgttgca
> ^ ^ ^ ^ ^ ^ ^ ^^^^ ^ ^^ ^
>
> 959529 gagcttccaccttcctctatcgccttta.cggtcgctggcgacacgcct
> 363 gagcctccgtttcccttcaccgcccgcagcgat.gatgatgtcactcct
> ^ ^^^ ^ ^^ ^ ^^^ ^ ^ ^ ^ ^^ ^ ^
>
> 959577 ttcttaaccttgagaacctccgcctgcttcctccactccagcagcagat
> 411 ttcttcaccttgagagcctccgcctggttcttccactccaggagaagat
> ^ ^ ^ ^ ^ ^
>
> 959626 tatcccgtgagcgggcttcctcttcgggcaacggacaccctggacgaga
> 460 cagtgggtgcgcagacttcttcttcgcgcagtagagaccctgagcgaga
> ^ ^^^^ ^ ^ ^ ^ ^ ^^^ ^ ^^
>
> 959675 gcgcttacgacccaccgccgtcgcggcgcttggtgcggcaaggtactcc
> 509 acgctttcgacccgccgatgtcacggtgcttgcggtggcaagatactcc
> ^ ^ ^ ^^ ^ ^ ^^ ^ ^
>
> 959724 accgcaacttgcgccatgtgcgtgtccacggggacaatgtgggtgcggt
> 558 accgaaacctgcgccatgtgtgtgtccacggggacgatgtgggtgcggt
> ^ ^ ^ ^
>
> 959773 tgagcgcgaagagcgccacgcagtcagcaacttt
> 607 tgagagcaaagagcgccacgcaatccgccacttt
> ^ ^ ^ ^ ^
>
>
> -- END alignment [ +1 959335 - 959806 | +1 169 - 640 ]
>
> ============================================================
>