[Bioperl-l] SimpleAlign Query
Fields, Christopher J
cjfields at illinois.edu
Wed Jun 19 13:15:58 UTC 2013
The alignments from BLAST reports are generally parsed via Bio::SearchIO. There is a HOWTO on SearchIO:
http://www.bioperl.org/wiki/HOWTO:SearchIO
Tiling in BioPerl uses these objects, but you can get Bio::SimpleAlign from the HSPs.
chris
On Jun 18, 2013, at 10:49 AM, "Yin, Jun" <jun.yin at yale.edu>
wrote:
> Hi, Chris,
>
> No problem. I can help to look at what is the problem. Yes, usually in
> Bio::SimpleAlign even if the sequence is reverse complementary, the start
> coordinate should be smaller than the end coordinate. But it seems
> Bio::AlignIO::bl2seq is a wrapper for Bio::SearchIO::blast. It is a bit
> different from the other Bio::AlignIO modules, so there may be some
> inconsistency.
>
> Can you send me the input file for your script? So I can try to figure out
> where the problem is.
>
> I would like to recommend you to ask in the bioperl mail list. There will
> be more experts in BioPerl there.
>
>
> Cheers,
> --
> Jun Yin, Ph.D.
> Postdoc Associate
>
> James Noonan Group
> Department of Genetics
> Yale University School of Medicine
> Sterling Hall of Medicine, I-120A
> 333 Cedar Street
> New Haven, CT 06510
> http://www.yale.edu/noonanlab/Jun.html
>
>
>
>
> On 6/18/13 11:23 AM, "Christopher Field" <chris.field at unibas.ch> wrote:
>
>> Dear Dr. Yin,
>>
>> I've got a problem with SimpleAlign.pm, and since you are responsible for
>> the most recent updates on this module, I hope it's ok to email and ask
>> for your help.
>> To summarise my short program, it uses bl2seq to throw some sanger reads
>> against a genome, and then I'm tiling the reported hits into a visually
>> accessible format.
>> The problem has arisen, that sometimes the alignment is with the reverse
>> complement, but the start and end of both sequences in the Align object
>> are such that the start is always lower than the end. The strand flag
>> also appears uninitialised. Is this a straightforward bug, or a result of
>> something I'm doing in my code?
>> Any help would be greatly appreciated.
>>
>> Chris Field
>> Group van Nimwegen
>> Biozentrum der Universität Basel
>> 4056-CH Basel
>>
>> Code:
>>
>> #!/usr/bin/env perl;
>>
>> use v5.8.1;
>> use warnings;
>> use strict;
>>
>> use Bio::Seq;
>> use Bio::SeqIO;
>> use Bio::AlignIO;
>> use Bio::SimpleAlign;
>> use Bio::Tools::Run::StandAloneBlast;
>>
>> my $usage = "align.pl reference query\n";
>>
>> open (my $fo,'>',"align.out",) or die $!;
>>
>> my $rfile = shift or die $usage;
>> my $qfile = shift or die $usage;
>>
>> my $rstream = Bio::SeqIO->new(-file => $rfile);
>> my $qstream = Bio::SeqIO->new(-file => $qfile);
>>
>> my $rseq = $rstream->next_seq;
>>
>> my $align = Bio::Tools::Run::StandAloneBlast->new(-program => 'blastn',
>> -outfile => 'bl2seq.out');
>>
>> while (my $qseq = $qstream->next_seq){
>> $align->bl2seq($qseq,$rseq);
>> my $astream = Bio::AlignIO->new(-file => 'bl2seq.out', -format =>
>> 'bl2seq');
>> my @seq = ("n") x $qseq->length;
>> my @match = (" ") x $qseq->length;
>> my @pos = (" ") x $qseq->length;
>> while (my $align = $astream->next_aln){
>> my $qhit = $align->get_seq_by_pos(1);
>> my $rhit = $align->get_seq_by_pos(2);
>> if(!grep(/[acgt]/, at seq[$qhit->start-1 .. $qhit->end-1])){
>> print $rhit->start,":",$rhit->end,"\n";
>> @seq[$qhit->start-1 .. $qhit->end-1] =
>> split("",$rhit->seq);
>> @match[$qhit->start-1 .. $qhit->end-1] =
>> split("",$align->match_line);
>> @pos[$qhit->start-1 .. $qhit->end-1] =
>> ("<","-","-",split("",$rhit->start),("-") x
>> ($qhit->length-(length($rhit->start)+length($rhit->end)+6)),split("",$rhit
>> ->end),"-","-",">");
>> }
>> }
>> print $fo $qseq->display_id,"\n";
>> print $fo $qseq->seq,"\n";
>> print $fo @match,"\n";
>> print $fo @seq,"\n";
>> print $fo @pos,"\n\n";
>> }
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
More information about the Bioperl-l
mailing list