[Bioperl-l] how to get the information of Strand = Plus / Plus from blastn report by bioperl.

kakchingtabam pawankumar sharma pawan.mani2 at gmail.com
Tue Jan 24 11:31:35 UTC 2012


Hi,
      Thanks a lot for help Frank. It works for every Blast output.
One more question is that i want to best hit only(top hit of every query).
I show there is option called
$obj->best_hit_only; in Bio::SearchIO module.
So help to add this to my script.
I could not do. Its confusing.

Thanks in Advanced.

With best regards,
Pawan


On Mon, Jan 16, 2012 at 9:05 PM, Frank Schwach <fs5 at sanger.ac.uk> wrote:
> Excellent, well done!
> No, this is the way to do it. In BioPerl modules that use strand information
> you will find the values +1/-1 or undef. If you want to display those as
> PLUS/MINUS,+/-,Watson/Crick,Laurel/Hardy whatever, you have to convert it,
> but you know now how to do it.
> You have a syntax error in your code where you retrieve the query name:
>
>
> my $QueryName = $result->query_name(), my $QueryDescript =
> $result->query_description();
>
> should be two lines and the comma should be a semicolon.
>
> Good luck!
>
> Frank
>
>
>
>
>
> On 16/01/12 15:14, kakchingtabam pawankumar sharma wrote:
>>
>> So By using the if else conditon function, I have solve Frank.
>> I mean is there anyway in bioperl we can get directly using other
>> module! I hope u got it!
>>
>>
>> So my second Question have not replied that is
>>
>> i have blastn report as below:
>>
>> BLASTN 2.2.18 [Mar-02-2008]
>>
>>
>> Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
>> Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
>> "Gapped BLAST and PSI-BLAST: a new generation of protein database search
>> programs",  Nucleic Acids Res. 25:3389-3402.
>>
>> Query= ORB_1210001_hsa-miR-548aa#5_1
>>         (24 letters)
>>
>> Database: hsa-mmu-rno_miRNA.fa
>>           3524 sequences; 76,424 total letters
>>
>> Searching..................................................done
>>
>>
>>
>>                                                                 Score    E
>> Sequences producing significant alignments:                      (bits)
>> Value
>>
>> hsa-miR-548aa
>> 48   2e-009
>> hsa-miR-548d-5p
>> 36   9e-006
>> hsa-miR-548b-5p
>> 36   9e-006
>> hsa-miR-548z
>> 34   3e-005
>> hsa-miR-548q
>> 30   5e-004
>> hsa-miR-548n
>> 30   5e-004
>> hsa-miR-548ab
>> 28   0.002
>> hsa-miR-548v
>> 28   0.002
>> hsa-miR-548c-5p
>> 28   0.002
>> hsa-miR-548ag
>> 26   0.008
>> hsa-miR-548u
>> 26   0.008
>> hsa-miR-548c-3p
>> 26   0.008
>> hsa-miR-603
>> 26   0.008
>> hsa-miR-548a-3p
>> 26   0.008
>> hsa-miR-548ac
>> 24   0.033
>> hsa-miR-548an
>> 22   0.13
>> hsa-miR-548aj
>> 22   0.13
>> hsa-miR-548i
>> 22   0.13
>> hsa-miR-548g
>> 22   0.13
>> hsa-miR-548j
>> 22   0.13
>> hsa-miR-548a-5p
>> 22   0.13
>>
>>> hsa-miR-548aa
>>
>>          Length = 25
>>
>>  Score = 48.1 bits (24), Expect = 2e-009
>>  Identities = 24/24 (100%)
>>  Strand = Plus / Minus
>>
>>
>> Query: 1  tggtgcaaaagtaattgtggtttt 24
>>          ||||||||||||||||||||||||
>> Sbjct: 25 tggtgcaaaagtaattgtggtttt 2
>>
>>
>>> hsa-miR-548d-5p
>>
>>          Length = 22
>>
>>  Score = 36.2 bits (18), Expect = 9e-006
>>  Identities = 18/18 (100%)
>>  Strand = Plus / Plus
>>
>>
>> Query: 7  aaaagtaattgtggtttt 24
>>          ||||||||||||||||||
>> Sbjct: 1  aaaagtaattgtggtttt 18
>>
>>
>>
>> in this result i could not parse my code. i think my code does not
>> accept the Query header that is
>> "ORB_1210001_hsa-miR-548aa#5_1" as it is in the above example blast
>> output.
>>
>> kindly help me out.
>>
>> with regards,
>> Pawan.
>>
>> On 1/16/12, Frank Schwach<fs5 at sanger.ac.uk>  wrote:
>>>
>>> Hi Pawan ,
>>>
>>> Please always "reply to all", so that you keep the discussion on the
>>> bioperl mailing list and more people can help you.
>>> What you need is a very basic Perl command. I could give you the code
>>> but I think you get more out of it if you experiment with it on your own
>>> because it is very fundamental. I'll point you in the right direction:
>>> you want an if-then-else conditional construct.
>>>
>>> Perl's documentation about this is here:
>>> http://perldoc.perl.org/perlintro.html#Conditional-and-looping-constructs
>>>
>>> if strand is 1 you want to print "PLUS" else if it is -1 you want to
>>> print "MINUS", or else you might want to print "no strand" or something,
>>> or even treat it as an error and make the script abort.
>>>
>>> Give it a go and let us know if you need help. For basic (non-bio) Perl
>>> question, please also check out the community at
>>> http://www.perlmonks.org/.
>>>
>>> Hope that helps,
>>>
>>> Frank
>>>
>>>
>>> On 14/01/12 05:59, kakchingtabam pawankumar sharma wrote:
>>>>
>>>> Hi frank,
>>>>
>>>> Thanks for your kind reply.
>>>> I could get the vale for query as 1 value if it is plus.
>>>> and for hit = -1 if it is minus.
>>>> But i would like to print out as PLUS or MINUS not 1 or -1 my friend.
>>>>
>>>> you can see my code as below:
>>>>
>>>> while ( my $result = $searchio->next_result() ) {
>>>>      my $QueryName = $result->query_name(), my $QueryDescript =
>>>> $result->query_description();
>>>>      my $QueryLength = $result->query_length;
>>>>      my $NoHits = $result->num_hits;
>>>>
>>>>      while( my $hit = $result->next_hit ) {
>>>>          my $HitName = $hit->name();
>>>>          my $HitDescrip = $hit->description();
>>>>        my $HitLength = $hit->length;
>>>>          my $Score = $hit->raw_score();
>>>>        my $Bits = $hit->bits;
>>>>
>>>>          my $hsp = $hit->next_hsp; # Only check first (= best) hsp
>>>>        my $Evalue =  $hsp->evalue();
>>>>        my $AlnLen = $hsp->num_identical();
>>>>        my $TotalLen = $hsp->hsp_length;
>>>>        my $QueryStrand = $hsp->strand('query');
>>>>        my $HitStrand = $hsp->strand('hit');
>>>>
>>>>        if($Evalue<   $cutoff){
>>>>            print "$QueryName $QueryDescript\t";
>>>>            print "$QueryLength\t";
>>>>            print "$NoHits\t";
>>>>            print "$HitName $HitDescrip\t";
>>>>            print "$HitLength\t";
>>>>            print "$Score\t";
>>>>            print "$Bits\t";
>>>>            print "$Evalue\t";
>>>>            print "$AlnLen\t";
>>>>            print "$TotalLen\t";
>>>>            print "$QueryStrand\t";
>>>>            print "$HitStrand\n";
>>>>        }
>>>>      }
>>>>      print "\n";
>>>> }
>>>>
>>>>
>>>> This is a part of my code.
>>>>
>>>> i have blastn report as below:
>>>>
>>>> BLASTN 2.2.18 [Mar-02-2008]
>>>>
>>>>
>>>> Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A.
>>>> Schaffer,
>>>> Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
>>>> "Gapped BLAST and PSI-BLAST: a new generation of protein database search
>>>> programs",  Nucleic Acids Res. 25:3389-3402.
>>>>
>>>> Query= ORB_1210001_hsa-miR-548aa#5_1
>>>>           (24 letters)
>>>>
>>>> Database: hsa-mmu-rno_miRNA.fa
>>>>             3524 sequences; 76,424 total letters
>>>>
>>>> Searching..................................................done
>>>>
>>>>
>>>>
>>>>                                                                   Score
>>>> E
>>>> Sequences producing significant alignments:                      (bits)
>>>> Value
>>>>
>>>> hsa-miR-548aa
>>>> 48   2e-009
>>>> hsa-miR-548d-5p
>>>> 36   9e-006
>>>> hsa-miR-548b-5p
>>>> 36   9e-006
>>>> hsa-miR-548z
>>>> 34   3e-005
>>>> hsa-miR-548q
>>>> 30   5e-004
>>>> hsa-miR-548n
>>>> 30   5e-004
>>>> hsa-miR-548ab
>>>> 28   0.002
>>>> hsa-miR-548v
>>>> 28   0.002
>>>> hsa-miR-548c-5p
>>>> 28   0.002
>>>> hsa-miR-548ag
>>>> 26   0.008
>>>> hsa-miR-548u
>>>> 26   0.008
>>>> hsa-miR-548c-3p
>>>> 26   0.008
>>>> hsa-miR-603
>>>> 26   0.008
>>>> hsa-miR-548a-3p
>>>> 26   0.008
>>>> hsa-miR-548ac
>>>> 24   0.033
>>>> hsa-miR-548an
>>>> 22   0.13
>>>> hsa-miR-548aj
>>>> 22   0.13
>>>> hsa-miR-548i
>>>> 22   0.13
>>>> hsa-miR-548g
>>>> 22   0.13
>>>> hsa-miR-548j
>>>> 22   0.13
>>>> hsa-miR-548a-5p
>>>> 22   0.13
>>>>
>>>>> hsa-miR-548aa
>>>>
>>>>            Length = 25
>>>>
>>>>   Score = 48.1 bits (24), Expect = 2e-009
>>>>   Identities = 24/24 (100%)
>>>>   Strand = Plus / Minus
>>>>
>>>>
>>>> Query: 1  tggtgcaaaagtaattgtggtttt 24
>>>>            ||||||||||||||||||||||||
>>>> Sbjct: 25 tggtgcaaaagtaattgtggtttt 2
>>>>
>>>>
>>>>> hsa-miR-548d-5p
>>>>
>>>>            Length = 22
>>>>
>>>>   Score = 36.2 bits (18), Expect = 9e-006
>>>>   Identities = 18/18 (100%)
>>>>   Strand = Plus / Plus
>>>>
>>>>
>>>> Query: 7  aaaagtaattgtggtttt 24
>>>>            ||||||||||||||||||
>>>> Sbjct: 1  aaaagtaattgtggtttt 18
>>>>
>>>>
>>>>
>>>> in this result i could not parse my code. i think my code does not
>>>> accept the Query header that is
>>>> "ORB_1210001_hsa-miR-548aa#5_1" as it is in the above example blast
>>>> output.
>>>>
>>>> kindly help me out.
>>>>
>>>> with regards,
>>>> Pawan.
>>>>
>>>>
>>>> On Sat, Jan 14, 2012 at 3:13 AM, Frank Schwach<fs5 at sanger.ac.uk>
>>>> wrote:
>>>>>
>>>>> Hi Pawan,
>>>>>
>>>>> Can you show your code? Is it basically following the structure shown
>>>>> in
>>>>> http://www.bioperl.org/wiki/HOWTO:SearchIO#Using_SearchIO
>>>>> ?
>>>>>
>>>>> If that is the case
>>>>>
>>>>> $hsp->strand('query')
>>>>>
>>>>>
>>>>> is exactly what you need.
>>>>> To check if hit and query are on different strands you can do:
>>>>>
>>>>> if ( $hsp->strand('query')
>>>>> * $hsp->strand('hit') == -1){
>>>>>
>>>>>   # do whatever you need to do if they are on opposite strands
>>>>>
>>>>> }
>>>>>
>>>>> Hope that helps
>>>>>
>>>>> Frank
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 13/01/12 16:46, kakchingtabam pawankumar sharma wrote:
>>>>>>
>>>>>> Hi,
>>>>>>               Using Bio::SearchIO module I am parsing the following
>>>>>> Blast
>>>>>> result.
>>>>>> I have used the option- $hsp->strand('query').
>>>>>>
>>>>>> But I cannot get detail of alignment.
>>>>>>
>>>>>> I need to know if my hit is forward (Strand = Plus / Plus)
>>>>>> or reverse ( Strand = Plus / Minus)...
>>>>>>   Can anyone help me to get report as Plus or Minus for query  or hit.
>>>>>>
>>>>>> thanks in advanced.
>>>>>>
>>>>>> With regards,
>>>>>> Pawan
>>>>>>
>>>>>>
>>>>>>
>>>>>> BLASTN 2.2.18 [Dec-23-2011]
>>>>>>
>>>>>>
>>>>>> Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A.
>>>>>> Schaffer,
>>>>>> Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
>>>>>> "Gapped BLAST and PSI-BLAST: a new generation of protein database
>>>>>> search
>>>>>> programs",  Nucleic Acids Res. 25:3389-3402.
>>>>>>
>>>>>> Query= 000013_c10079-9984
>>>>>>           (50 letters)
>>>>>>
>>>>>> Database: Cyano_Probe.fasta
>>>>>>             4760 sequences; 238,000 total letters
>>>>>>
>>>>>> Searching..................................................done
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Score
>>>>>>   E
>>>>>> Sequences producing significant alignments:
>>>>>>  (bits)
>>>>>> Value
>>>>>>
>>>>>> 000013_c10079-9984
>>>>>> 100   7e-024
>>>>>> 002619_2689273-2690037
>>>>>> 24   0.36
>>>>>> 001126_c1123720-1123385
>>>>>> 24   0.36
>>>>>> 003211_c3326737-3326480
>>>>>> 22   1.4
>>>>>> 002415_2471082-2471420
>>>>>> 22   1.4
>>>>>> 002269_2321276-2322463
>>>>>> 22   1.4
>>>>>> 001328_c1326535-1326164
>>>>>> 22   1.4
>>>>>>
>>>>>>> 000013_c10079-9984
>>>>>>
>>>>>>            Length = 50
>>>>>>
>>>>>>   Score = 99.6 bits (50), Expect = 7e-024
>>>>>>   Identities = 50/50 (100%)
>>>>>>   Strand = Plus / Plus
>>>>>>
>>>>>>
>>>>>> Query: 1  agtcaacaccaatctgagtttaatcactatcttgatcatgttagatatca 50
>>>>>>            ||||||||||||||||||||||||||||||||||||||||||||||||||
>>>>>> Sbjct: 1  agtcaacaccaatctgagtttaatcactatcttgatcatgttagatatca 50
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioperl-l mailing list
>>>>>> Bioperl-l at lists.open-bio.org
>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>>
>>>>>
>>>>> --
>>>>> The Wellcome Trust Sanger Institute is operated by Genome Research
>>>>> Limited,
>>>>> a charity registered in England with number 1021457 and a company
>>>>> registered
>>>>> in England with number 2742969, whose registered office is 215 Euston
>>>>> Road,
>>>>> London, NW1 2BE.
>>>
>>>
>>> --
>>>  The Wellcome Trust Sanger Institute is operated by Genome Research
>>>  Limited, a charity registered in England with number 1021457 and a
>>>  company registered in England with number 2742969, whose registered
>>>  office is 215 Euston Road, London, NW1 2BE.
>>>
>
>
> --
> The Wellcome Trust Sanger Institute is operated by Genome Research Limited,
> a charity registered in England with number 1021457 and a company registered
> in England with number 2742969, whose registered office is 215 Euston Road,
> London, NW1 2BE.




More information about the Bioperl-l mailing list