[Bioperl-l] chromatogram
Smithies, Russell
Russell.Smithies at agresearch.co.nz
Wed Nov 14 20:58:19 UTC 2007
We try and avoid SVG at all costs as installing plugins and viewers in a
locked down corporate environment can be more trouble than it's worth
whereas generating .png images works for any browser with no extras
required.
We actually call this trace drawing code from Python which then
generates webpages with the embedded image.
It also means we don't need to licence, install and maintain a trace
viewer like Chromas.
:-)
Russell
> -----Original Message-----
> From: Malay [mailto:mbasu at mail.nih.gov]
> Sent: Thursday, 15 November 2007 9:47 a.m.
> To: Smithies, Russell
> Cc: Lee Katz; bioperl-l at lists.open-bio.org
> Subject: Re: [Bioperl-l] chromatogram
>
> I guess you need chromatogram from SCF. I can't help in that. ABI.pm
is
> not in Bioperl distribution. But to make the record straight, you can
> use one step chromatogram drawing in SVG from ABI file using my BioSVG
> module, available at:
>
> http://www.bioinformatics.org/~malay/biosvg/
>
> Malay
>
>
>
>
> Smithies, Russell wrote:
> > Here's my trace viewer.
> > Please excuse my dodgy Perl and debugging code as it's still under
> > development :-)
> >
> >
> > Russell Smithies
> >
> > Bioinformatics Software Developer
> > T +64 3 489 9085
> > E russell.smithies at agresearch.co.nz
> >
> > Invermay Research Centre
> > Puddle Alley,
> > Mosgiel,
> > New Zealand
> > T +64 3 489 3809
> > F +64 3 489 9174
> > www.agresearch.co.nz
> >
> >
> >
------------------------------------------------------------------------
> > ------------------
> >
> > #!perl -w
> > use ABI;
> >
> > use GD::Graph::lines;
> > use GD::Graph::colour;
> > use GD::Graph::Data;
> >
> > use Data::Dumper;
> >
> >
> > use Getopt::Long;
> >
> > use constant HEIGHT => 300;
> >
> > GetOptions ('h|height=i' => \$HEIGHT,
> > 'f|file=s' => \$FILE,
> > 'o|out=s' => \$OUTFILE,
> > 'l|left=s' => \$LEFT_SEQ,
> > 'r|right=s' => \$RIGHT_SEQ,
> > 's|size=i' => \$SIZE,
> > ) || die <<USAGE;
> > Usage: perl $0 -h 400 -f 1188_13_14728111_16654_48544_080.ab1 -o
> > test2.png -l actacgtacgta -r atgatcgtacgtac
> > or perl $0 --height 400 --file 1188_13_14728111_16654_48544_080.ab1
> > --out test2.png --left actacgtacgta --right atgatcgtacgtac
> >
> > Options:
> > --height <pixels> Set height of image (${\HEIGHT} pixels default)
> > --file <trace file-name> Filename for the ABI trace file
> > --out <output file-name> Filename for the generated .png image
> > --left <left end sequence>
> > --right <right end sequence>
> > --size <size of clipped fasta sequence>
> >
> > Parse an ABI trace file and render a PNG image.
> > See http://search.cpan.org/dist/ABI/ABI.pm
> > or
> > http://search.cpan.org/~bwarfield/GDGraph-1.44/Graph.pm
> > USAGE
> >
> > my $height = $HEIGHT || HEIGHT;
> > my $file = $FILE;
> > my $outfile = $OUTFILE;
> >
> > my $abi = ABI->new(-file=> $file);
> >
> > my @trace_a = $abi->get_trace("A"); # Get the raw traces for "A"
> > my @trace_c = $abi->get_trace("C"); # Get the raw traces for "C"
> > my @trace_g = $abi->get_trace("G"); # Get the raw traces for "G"
> > my @trace_t = $abi->get_trace("T"); # Get the raw traces for "T"
> >
> > my @base_calls = $abi->get_base_calls(); # Get the base calls
> > my $sequence =$abi->get_sequence();
> > @bp = split(//, $sequence);
> >
> >
> >
> > # iterate over array
> > $size = $abi->get_trace_length();
> > for ($i=0,$count = 0; $i<$size; $i++) {
> > if(grep(/\b$i\b/, @base_calls)){
> > $bases[$i] = $bp[$count];
> > $count++;
> > }else{
> > $bases[$i] = ' ';
> > }
> > }
> >
> > # create the data. see GD::Graph::Data for details of the format
> > my @data = (\@bases, \@trace_a, \@trace_c, \@trace_g, \@trace_t, );
> >
> > my $graph = new GD::Graph::lines($abi->get_trace_length(),$height);
> > $graph->set(
> > title => $abi->get_sample_name(),
> > # y_max_value => $abi->get_max_trace() + 50,
> > x_max_value => $abi->get_trace_length(),
> > t_margin => 5,
> > b_margin => 5,
> > l_margin => 5,
> > r_margin => 5,
> > x_ticks => 0,
> > text_space => 0,
> > line_width => 1,
> > transparent => 0,
> > b_margin => 30,
> > t_margin => 35,
> > x_plot_values => 0,
> > interlaced => 1,
> > );
> >
> > # allocate some colors for drawing the bases
> > #use colors same as Chromas
> > $graph->set( dclrs => [ qw( green blue black red pink) ] );
> >
> > #plot the data
> > my $gd = $graph->plot(\@data);
> >
> > $black = $gd->colorAllocate(0,0,0); # A
> > $blue = $gd->colorAllocate(0,0,255); # C
> > $red = $gd->colorAllocate(255,0,0); # G
> > $green = $gd->colorAllocate(0,255,0); # T
> > $magenta =$gd->colorAllocate(255,0,255); # N
> > $white = $gd->colorAllocate(255,255,255); # undefined aren't drawn
> > $gray = $gd->colorAllocate(210,210,210);
> > %colors = ("A", $green, "C", $blue, "G",$black, "T", $red, "N",
> > $magenta, " ",$white);
> >
> > #$start_base = index(lc($sequence),lc($LEFT_SEQ));
> > $start_base = find_match($sequence,$LEFT_SEQ);
> >
> > #if($end_base = rindex(lc($sequence),lc($RIGHT_SEQ)) > 0){
> > $end_base = find_match($sequence,$RIGHT_SEQ, 1);
> > if($end_base){
> > $end_base += length($RIGHT_SEQ);
> > }
> >
> >
> > # get the coords of the features on the image
> > @coords = $graph->get_hotspot(1);
> > $size = @coords;
> > $printed_num = 1;
> > $basecount = 0;
> > $numstoprint = $basecount - $start_base;
> >
> > # draw the colored bases and scale at top and bottom of image
> > for ($i=0,$count = 0; $i<$size; $i++) {
> > $c = $coords[$i];
> > (undef, $xs, undef, undef, undef, undef) = @$c;
> > $base = $bases[$i];
> > if($base =~ /[ACGTN]/){
> > if($start_base - 1 == $basecount){$start_base_coord = $xs;}
> > if($end_base - 1 == $basecount){$end_base_coord = $xs;}
> > if(defined($SIZE) && $start_base+$SIZE -2 ==
> > $basecount){$end_base_coord_by_size = $xs;}
> > $basecount++;
> > $numstoprint++;
> > $printed_num = 0;
> > }
> > # print the bases top and bottom
> > $gd->string(GD::Font->Small(),$xs,20,$base,$colors{$base});
> > $gd->string(GD::Font->Small(),$xs,$height -
30,$base,$colors{$base});
> >
> > # print scale
> > if($basecount > 0 && $numstoprint % 10 == 0 && $printed_num == 0){
> > if($LEFT_SEQ){
> > $gd->string(GD::Font->Small(),$xs,5,$numstoprint,$black);
> > $gd->string(GD::Font->Small(),$xs,$height -
> > 15,$numstoprint,$black);
> > $printed_num = 1;
> > }else{
> > $gd->string(GD::Font->Small(),$xs,5,$numstoprint,$black);
> > $gd->string(GD::Font->Small(),$xs,$height -
> > 15,$numstoprint,$black);
> > $printed_num = 1;
> > }
> > }
> > $top_right_corner = $xs;
> > }
> >
> >
> >
> > # only draw the clipped region if the calculated size is + or - 6bp
> > #if(($end_base - $start_base) - $SIZE <= 6 && ($end_base -
$start_base)
> > - $SIZE >= -6 ){
> > # draw the clipped regions as gray
> > #if LEFT_SEQ supplied and a match found
> > if($LEFT_SEQ && $start_base > 0){
> > $gd->filledRectangle(38,35,$start_base_coord - 1,$height -
> > 33,$red);
> > $clipped = 1;
> > }
> > #if RIGHT_SEQ supplied and a match found
> > if($RIGHT_SEQ && $end_base > 0){
> > print join("\t", ($end_base)),"\n";
> > $gd->filledRectangle($end_base_coord,35,$top_right_corner,$height
-
> > 33,$gray);
> > $clipped = 1;
> > }
> > #if no RIGHT_SEQ supplied or no match found, use left match + seq
> > length
> > if(!$RIGHT_SEQ || $end_base < 0){
> >
> >
$gd->filledRectangle($end_base_coord_by_size,35,$top_right_corner,$heigh
> > t - 33,$blue);
> > $clipped = 1;
> > }
> >
> >
> >
> > # set height based on max trace within clipped region
> > $graph->set( y_max_value => 3000);#$abi->get_max_trace() +
50);
> >
> > # need to re-plot the data over the grayed out area
> > $graph->plot(\@data) if $clipped;
> > $gd->filledRectangle(0,0,$top_right_corner,33,$white);
> >
> > #}
> >
> > #print the graph
> > open(OUT, ">$outfile") or die "can't open output file: $outfile\n";
> > binmode OUT;
> > print OUT $gd->png;
> > close OUT;
> >
> >
> > sub find_match{
> > my ($sequence,$query,$last) = @_;
> > return -1 if length($query) < 6;
> > my($odds, $evens, $ones, $twos, $threes, $match_pos);
> > # try exact match
> > $match_pos = do_regex($query, $sequence,$last); return
$match_pos if
> > $match_pos > 0;
> >
> > # try matching every second base starting from the second base
e.g.
> > it will be .C.T.C.G.etc
> > map {m/(\w)(\w)/g; $odds.="$1."; $evens.=".$2"}
> > ($query=~m/(\w\w)/g);
> > $match_pos = do_regex($odds, $sequence,$last); return
$match_pos
> > if $match_pos > 0;
> > $match_pos = do_regex($evens, $sequence,$last); return
$match_pos
> > if $match_pos > 0;
> >
> > # try matching every third base starting from the first base
e.g. it
> > will be C..T..G..T etc
> > map {m/(\w)(\w)(\w)/g; $ones.="$1.."; $twos.=".$2.";
> > $threes.="..$3"} ($query =~m/(\w\w\w)/g);
> > $match_pos = do_regex($ones, $sequence,$last); return
$match_pos
> > if $match_pos > 0;
> > $match_pos = do_regex($twos, $sequence,$last); return
$match_pos
> > if $match_pos > 0;
> > $match_pos = do_regex($threes, $sequence,$last); return
$match_pos
> > if $match_pos > 0;
> >
> > # not found
> > return -1;
> > }
> >
> > sub do_regex(){
> > my ($query,$sequence,$last)= @_;
> > #print "trying $query \n";
> > my $result = -1;
> > $result = pos($sequence)-length($query)+1 if $last &&
($sequence
> > =~ m/.*($query)/ig);
> > $result = pos($sequence)-length($query)+1 if($sequence =~
> > m/.*?($query)/ig);
> > return $result;
> > }
> >
> >
------------------------------------------------------------------------
> > ------------------
> >
> >> -----Original Message-----
> >> From: bioperl-l-bounces at lists.open-bio.org
> > [mailto:bioperl-l-bounces at lists.open-
> >> bio.org] On Behalf Of Lee Katz
> >> Sent: Wednesday, 14 November 2007 2:28 p.m.
> >> To: bioperl-l at lists.open-bio.org
> >> Subject: [Bioperl-l] chromatogram
> >>
> >> Hi,
> >> I would like to know how to draw a chromatogram file. Does anyone
> >> have any sample code where you read in an scf file and create a
jpeg
> >> or other image file?
> >> For that matter, I want to be able to customize these images with
base
> >> calls if possible. I really appreciate the help, so thanks!
> >>
> >> --
> >> Lee Katz
> >> _______________________________________________
> >> Bioperl-l mailing list
> >> Bioperl-l at lists.open-bio.org
> >> http://lists.open-bio.org/mailman/listinfo/bioperl-l
> >
> =============================================================
> ==========
> > Attention: The information contained in this message and/or
attachments
> > from AgResearch Limited is intended only for the persons or entities
> > to which it is addressed and may contain confidential and/or
privileged
> > material. Any review, retransmission, dissemination or other use of,
or
> > taking of any action in reliance upon, this information by persons
or
> > entities other than the intended recipients is prohibited by
AgResearch
> > Limited. If you have received this message in error, please notify
the
> > sender immediately.
> >
> =============================================================
> ==========
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at lists.open-bio.org
> > http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
> --
> Malay K Basu
> www.malaybasu.net
=======================================================================
Attention: The information contained in this message and/or attachments
from AgResearch Limited is intended only for the persons or entities
to which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipients is prohibited by AgResearch
Limited. If you have received this message in error, please notify the
sender immediately.
=======================================================================
More information about the Bioperl-l
mailing list