[Bioperl-l] Re: Thanks
Barry Moore
barry.moore at genetics.utah.edu
Wed Aug 24 08:30:00 EDT 2005
Usha-
It is important that you keep Bioperl related discussions on the bioperl
list, that way others can benefit from the discussion in the future by
searching the archives. Having said that, I am constantly accidentally
replying directly to people and not to the list because I hit reply
instead of reply all, so I'm not really a good one to talk.
It seems from nature of your questions to this list that you might be
quite new to perl programming. You participation on the list is still
welcome, but this is the kind of problem that you want to learn how to
solve yourself. You want to to avoid at all cost giving the impression
that you are asking the list to do your debugging for you - otherwise
people will just stop replying to your messages. An very valuable
article about asking questions to forums like this one is found at
http://www.catb.org/~esr/faqs/smart-questions.html#answers
<http://www.catb.org/%7Eesr/faqs/smart-questions.html#answers>. Read
it. Live it. O.K. enough preaching...
If you have Programming Perl
<http://www.amazon.com/exec/obidos/tg/detail/-/0596000278/qid=1124883861/sr=8-1/ref=pd_bbs_1/102-6339742-6061723?v=glance&s=books&n=507846>
read the first half of chapter 20. If you don't have the Perl book, you
can get the same info from
http://www.perldoc.com/perl5.8.4/pod/perldebug.html. After you
understand the perl debugger (or if you already do) look at the error
message that you got. In the first line it reports "seq doesn't
validate". So something is wrong with some sequence that you are trying
to use in your script. Since perl itself isn't' aware of sequences, and
since the stack trace below shows that the exception occurred while in
Bio::PrimarySeq you can conclude that some sequence that your are
sending to bioperl is bad. Now fire up perl running your script with
the debugger like this perl -d yourscript.pl. Use 'n' to step through
your code to the open command on line 14. No errors yet? You've just
loaded your sequences.fasta sequence, so that sequence must be OK.
Continue stepping through your code until you get the error again.
Where exactly did the error occur? When you try to set $input2 as a new
Bio::Seq object? Run the debugger again, and step through to the line
just before where the error occurred. Use the debugger's x command to
see what the values of $seq_id, $probe_id, $position, $probe_sequence
are. This should give you a clue s to what your problem is. One more
clue comes from the error message. It says "Attempting to set the
sequence to [PROBE_SEQUENCE] which does not look healthy". The error
message says that you are trying to set the sequence to PROBE_SEQUENCE.
Try to figure out why this error is occurring and how to solve it. If
you're still stuck let us know what you've tried and ask us again.
Barry
Usha Rani Reddi wrote:
>Hi,
>Thanks a lot for your help. When I tried to run the given code I got the
>following message.
>
>MSG: seq doesn't validate, mismatch is 1
>---------------------------------------------------
>
>------------- EXCEPTION -------------
>MSG: Attempting to set the sequence to [PROBE_SEQUENCE] which does not
>look healthy
>STACK Bio::PrimarySeq::seq
>/usr/lib/perl5/site_perl/5.8.5/Bio/PrimarySeq.pm:268
>STACK Bio::PrimarySeq::new
>/usr/lib/perl5/site_perl/5.8.5/Bio/PrimarySeq.pm:217
>STACK Bio::Seq::new /usr/lib/perl5/site_perl/5.8.5/Bio/Seq.pm:498
>STACK toplevel barr:23
>
>What should I do next? Please help me.
>Thanks
>Usha.
>
>----- Original Message -----
>From: Barry Moore <bmoore at genetics.utah.edu>
>Date: Tuesday, August 23, 2005 2:40 pm
>Subject: [Bioperl-l] RE: Local bl2seq
>
>
>
>>Usha-
>>
>>I think the code below will wrap your existing code in the loop you
>>need. You will want to get a copy of a good perl programming book
>>likeProgramming Perl from O'Reilly. It will help you out with all
>>thoselittle perl details like loop structures etc.
>>
>>Barry
>>
>>#!/usr/bin/perl
>>
>>use strict;
>>use warnings;
>>use Bio::SeqIO;
>>use Bio::Tools::Run::StandAloneBlast;
>>use Bio::Seq;
>>
>>my $seqio_obj = Bio::SeqIO->new(-file => " sequences.fasta",
>> -format => "fasta" );
>>
>>my $seq_obj = $seqio_obj->next_seq;
>>
>>open (IN, " location/of/your/probe/file") or die "Can't open IN";
>>
>>while (my $row = <IN>) {
>> chomp $row;
>> #Assuming your file is tab delimited...
>> my ($seq_id, $probe_id, $position, $probe_sequence) = split /\t/,
>>$row;
>>
>> my $input2 = Bio::Seq->new(-id=>"testquery2",
>> -seq=> $probe_sequence
>> );
>>
>> my $factory = Bio::Tools::Run::StandAloneBlast->new('program' =>
>>'blastn',
>> 'outfile' =>
>>'bl2seq1.out');
>>
>> my $blast_report = $factory->bl2seq($seq_obj, $input2);
>>
>> #Here is where you want to throw out good matches. You'll need to
>>determine
>> #what method you want to do that. Maybe since you want there
>>to be
>>no good
>> #hits you would just call $blast_report->max_significance and make
>>sure it's
>> #value is too high to be significant.
>> if ($blast_report->max_significance > 0.01) {
>> print "$row\n";
>> }
>>}
>>
>>-----Original Message-----
>>From: Usha Rani Reddi [mailto:ureddi at emich.edu]
>>Sent: Tuesday, August 23, 2005 5:51 AM
>>To: Barry Moore
>>Cc: bioperl-l at portal.open-bio.org
>>Subject: Local bl2seq
>>
>>Hi,
>>I am trying to use BLAST to compare the sequences. I did the
>>program in
>>Bioperl. Below is my piece of code
>>use Bio::SeqIO;
>>use Bio::Tools::Run::StandAloneBlast;
>>use Bio::Seq;
>>$seqio_obj = Bio::SeqIO->new(-file => "sequences.fasta",
>> -format => "fasta" );
>>$seq_obj = $seqio_obj->next_seq;
>>$input2 = Bio::Seq->new(-id=>"testquery2",
>>
>>-seq=>"ggacccgatgactagccccttgatcgtagcagtggcaagtca");
>>
>>$factory = Bio::Tools::Run::StandAloneBlast->new('program' =>
>>'blastn','outfile' => 'bl2seq1.out');
>>$blast_report = $factory->bl2seq($seq_obj, $input2);
>>
>>I need help for looping input2. I want to extract this part of
>>sequencefrom a file containing 200000 records. Using perl I am
>>extracting the
>>sequence part for file of format given below.
>>SEQ_ID PROBE_ID POSITION PROBE_SEQUENCE
>>NC_004116 1 1 AATTAACATTGTTGATTTTATTCTTCAACATC
>>NC_004116 3 13 TGATTTTATTCTTCAACATCTGTGGAAAACTT
>>NC_004116 5 25 TCAACATCTGTGGAAAACTTTATTTTTTTATG
>>
>>code for extracting PROBE_SEQUENCE looks like this
>>
>>$NemSeq =<STDIN>;
>>
>>chomp $NemSeq;
>>
>>unless (open(seqfile, $NemSeq)) {
>>print "Cannot open file \n";
>>exit;
>>}
>>@NemSeq = <seqfile> ;
>>
>>close seqfile;
>>
>>for (my $k = 0 ; $k < scalar @NemSeq ; ++$k) {
>> #print $k, $NemSeq[$k];
>> @Nem =split(/\t/,$NemSeq[$k]);
>> $input= $Nem[3];
>>
>> #print scalar(@Nem);
>> #print $Nem[3], "\n";
>>
>>}
>>
>>
>>@Nem =split(/\t/,$NemSeq)
>>
>>$input2 = substr(@NemSeq,4,32);
>>
>>So far I could successfully use bioperl(bl2seq) to compare whole
>>genomewith single probe.
>>I want to compare all the 200000 thousand probes. I am interested only
>>in mismatches, for this particular scenario my assumption is that more
>>than 90% of them will match. I want to send only the mismatches to
>>output file and discard the matches. I would like to classify the
>>mismatches based on the percentage dissimilarity, is there a way in
>>Bioperl for this? Thanks a lot for the reply. Please help me with
>>this.Thanks
>>Usha
>>
>>
>>----- Original Message -----
>>From: Barry Moore <barry.moore at genetics.utah.edu>
>>Date: Monday, August 22, 2005 11:45 pm
>>Subject: Re: [Bioperl-l] bl2seq
>>
>>
>>
>>>Usha,
>>>
>>>The best advice I can give you is that you need to focus your
>>>question a
>>>bit more. What method are you using to compare your probe to
>>>
>>>
>>your
>>
>>
>>>fasta? Regex, BLAST, Needle, RNAHybrid...? You say your
>>>
>>>
>>sequence
>>
>>
>>>is
>>>working fine for single sequence. Are you using Bioperl for
>>>
>>>
>>that?
>>
>>
>>>Can
>>>you tell us exactly what isn't working for you or what questions
>>>you
>>>have about working with multiple sequences? Are you already
>>>
>>>
>>using
>>
>>
>>>Bioperl with your single sequence comparison? Can you show us
>>>
>>>
>>some
>>
>>
>>>code?
>>>Barry
>>>
>>>Usha Rani Reddi wrote:
>>>
>>>
>>>
>>>>Hi,
>>>>I am trying to compare two hundred thousand probes(each one of
>>>>
>>>>
>>>them) to
>>>
>>>
>>>>another genome. Format of the file containing probes is like this
>>>>SEQ_ID PROBE_ID POSITION PROBE_SEQUENCE
>>>>NC_004116 1 1 AATTAACATTGTTGATTTTATTCTTCAACATC
>>>>NC_004116 3 13 TGATTTTATTCTTCAACATCTGTGGAAAACTT
>>>>NC_004116 5 25 TCAACATCTGTGGAAAACTTTATTTTTTTATG
>>>>NC_004116 7 37 GAAAACTTTATTTTTTTATGGTACAATATAAC
>>>>NC_004116 9 49 TTTTTATGGTACAATATAACAATAATTATCCA
>>>>NC_004116 11 61 AATATAACAATAATTATCCACAAGACAATAAG
>>>>NC_004116 13 73 ATTATCCACAAGACAATAAGGAAGAAGCTATG
>>>>NC_004116 15 85 ACAATAAGGAAGAAGCTATGACGGAAAACGAA
>>>>What I am trying to do is compare PROBE_SEQUENCE to fasta file of
>>>>Streptococcus agalactiae. I am trying to loop through the probes
>>>>
>>>>
>>>but not
>>>
>>>
>>>>sure how to proceed. My program is working fine for single
>>>>
>>>>
>>>sequence. One
>>>
>>>
>>>>more thing is I am not interested in matches, I want to display
>>>>
>>>>
>>only> >mismatches. I am new to Bioperl, some one please help me
>>with this.
>>
>>
>>>>Thanks
>>>>Usha
>>>>_______________________________________________
>>>>Bioperl-l mailing list
>>>>Bioperl-l at portal.open-bio.org
>>>>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>>
>>>>
>>>>
>>>--
>>>Barry Moore
>>>Dept. of Human Genetics
>>>University of Utah
>>>Salt Lake City, UT
>>>
>>>
>>>
>>>
>>>
>>>
>>_______________________________________________
>>Bioperl-l mailing list
>>Bioperl-l at portal.open-bio.org
>>http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>>
--
Barry Moore
Dept. of Human Genetics
University of Utah
Salt Lake City, UT
More information about the Bioperl-l
mailing list