[Bioperl-l] Cigar?

Brian Osborne osborne1 at optonline.net
Sat Oct 22 12:53:31 EDT 2005


bioperl-l,

SimpleAlign::cigar_line appears to be broken, I'd like to fix it. I'm seeing
2 definitions of cigar format floating around, this one is from
http://www.ensembl.org/info/glossary.html:

Cigar - Cigar stands for Compact Idiosyncratic Gapped Alignment Report and
defines the sequence of matches/mismatches and deletions (or gaps). The
cigar line defines the sequence of matches/mismatches and deletions (or
gaps). For example, this cigar line 2MD3M2D2M will mean that the alignment
contains 2 matches/mismatches, 1 deletion (number 1 is omitted in order to
save some space), 3 matches/mismatches, 2 deletions and 2
matches/mismatches. If the original sequence is:

Original sequence: AACGCTT

The aligned sequence will be:

cigar line: 2MD3M2D2M
M    M    D    M    M    M    D    D    M    M
A    A    -    C    G    C    -    -    T    T
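
To make the Ensembl-style definition concrete, here is a minimal sketch of how
such a cigar line could be expanded and applied to an ungapped sequence. It is
written in Python purely as an illustration and is not how SimpleAlign does it.
The function names (expand_cigar, apply_cigar, cigar_from_aligned) are
hypothetical, and the sketch assumes only the M and D operations from the
glossary text, with a missing count meaning 1.

```python
import re
from itertools import groupby


def expand_cigar(cigar):
    """Expand e.g. '2MD3M2D2M' into 'MMDMMMDDMM'; a missing count means 1."""
    tokens = re.findall(r"(\d*)([MD])", cigar)
    # Reject input that is not purely a run of [count]M / [count]D tokens.
    if "".join(count + op for count, op in tokens) != cigar:
        raise ValueError(f"unrecognised cigar line: {cigar!r}")
    return "".join(op * (int(count) if count else 1) for count, op in tokens)


def apply_cigar(seq, cigar, gap="-"):
    """Insert gap characters into an ungapped sequence according to the cigar line."""
    ops = expand_cigar(cigar)
    if ops.count("M") != len(seq):
        raise ValueError("number of M positions does not match sequence length")
    residues = iter(seq)
    return "".join(next(residues) if op == "M" else gap for op in ops)


def cigar_from_aligned(aligned, gap="-"):
    """Build the compact cigar line back from a gapped sequence."""
    parts = []
    for op, run in groupby("D" if c == gap else "M" for c in aligned):
        n = len(list(run))
        # A run of length 1 is written without its count, as in the glossary.
        parts.append((str(n) if n > 1 else "") + op)
    return "".join(parts)


print(apply_cigar("AACGCTT", "2MD3M2D2M"))   # AA-CGC--TT
print(cigar_from_aligned("AA-CGC--TT"))      # 2MD3M2D2M
```

Round-tripping the glossary example this way would be one obvious candidate for
a test case.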


This one is from the SimpleAlign documentation:

 Function : Generates a "cigar" (Compact Idiosyncratic Gapped Alignment
            Report) line for each sequence in the alignment
            The format is simply A-1,60;B-1,1:4,60;C-5,10:12,58
            where A,B,C, etc. are the sequence identifiers, and the numbers
            refer to conserved positions within the alignment
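
For comparison, the format in the SimpleAlign documentation could be read back
with something like the sketch below (again Python, for illustration only). It
assumes that each "a,b" pair is a start,end range and that ':' separates
ranges for one sequence; the documentation only says the numbers refer to
conserved positions, so that reading is an assumption. The function name
parse_range_cigar is hypothetical.

```python
def parse_range_cigar(line):
    """Parse 'A-1,60;B-1,1:4,60' into {'A': [(1, 60)], 'B': [(1, 1), (4, 60)]}."""
    result = {}
    for entry in line.split(";"):
        # Split on the last '-' so that identifiers containing '-' survive
        # (an assumption; the range part itself never contains '-').
        seq_id, _, ranges = entry.rpartition("-")
        result[seq_id] = [
            tuple(int(x) for x in pair.split(",")) for pair in ranges.split(":")
        ]
    return result


print(parse_range_cigar("A-1,60;B-1,1:4,60;C-5,10:12,58"))
```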


Bioperl uses this second format, yes?


Brian O.

PS There was no test for cigar_line in SimpleAlign.t




