[Bioperl-l] how to get the information of Strand = Plus / Plus from blastn report by bioperl.

Frank Schwach fs5 at sanger.ac.uk
Mon Jan 16 09:32:03 UTC 2012


Hi Pawan ,

Please always "reply to all", so that you keep the discussion on the 
bioperl mailing list and more people can help you.
What you need is a very basic Perl command. I could give you the code 
but I think you get more out of it if you experiment with it on your own 
because it is very fundamental. I'll point you in the right direction: 
you want an if-then-else conditional construct.

Perl's documentation about this is here:
http://perldoc.perl.org/perlintro.html#Conditional-and-looping-constructs

if strand is 1 you want to print "PLUS" else if it is -1 you want to 
print "MINUS", or else you might want to print "no strand" or something, 
or even treat it as an error and make the script abort.

Give it a go and let us know if you need help. For basic (non-bio) Perl 
question, please also check out the community at http://www.perlmonks.org/.

Hope that helps,

Frank


On 14/01/12 05:59, kakchingtabam pawankumar sharma wrote:
> Hi frank,
>
> Thanks for your kind reply.
> I could get the vale for query as 1 value if it is plus.
> and for hit = -1 if it is minus.
> But i would like to print out as PLUS or MINUS not 1 or -1 my friend.
>
> you can see my code as below:
>
> while ( my $result = $searchio->next_result() ) {
>      my $QueryName = $result->query_name(), my $QueryDescript =
> $result->query_description();
>      my $QueryLength = $result->query_length;
>      my $NoHits = $result->num_hits;
>
>      while( my $hit = $result->next_hit ) {
>          my $HitName = $hit->name();
>          my $HitDescrip = $hit->description();
> 	my $HitLength = $hit->length;
>          my $Score = $hit->raw_score();
> 	my $Bits = $hit->bits;
>
>          my $hsp = $hit->next_hsp; # Only check first (= best) hsp
> 	my $Evalue =  $hsp->evalue();
> 	my $AlnLen = $hsp->num_identical();
> 	my $TotalLen = $hsp->hsp_length;
> 	my $QueryStrand = $hsp->strand('query');
> 	my $HitStrand = $hsp->strand('hit');
>
> 	if($Evalue<  $cutoff){
> 	    print "$QueryName $QueryDescript\t";
> 	    print "$QueryLength\t";
> 	    print "$NoHits\t";
> 	    print "$HitName $HitDescrip\t";
> 	    print "$HitLength\t";
> 	    print "$Score\t";
> 	    print "$Bits\t";
> 	    print "$Evalue\t";
> 	    print "$AlnLen\t";
> 	    print "$TotalLen\t";
> 	    print "$QueryStrand\t";
> 	    print "$HitStrand\n";
> 	}
>      }
>      print "\n";
> }
>
>
> This is a part of my code.
>
> i have blastn report as below:
>
> BLASTN 2.2.18 [Mar-02-2008]
>
>
> Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
> Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
> "Gapped BLAST and PSI-BLAST: a new generation of protein database search
> programs",  Nucleic Acids Res. 25:3389-3402.
>
> Query= ORB_1210001_hsa-miR-548aa#5_1
>           (24 letters)
>
> Database: hsa-mmu-rno_miRNA.fa
>             3524 sequences; 76,424 total letters
>
> Searching..................................................done
>
>
>
>                                                                   Score    E
> Sequences producing significant alignments:                      (bits) Value
>
> hsa-miR-548aa
> 48   2e-009
> hsa-miR-548d-5p
> 36   9e-006
> hsa-miR-548b-5p
> 36   9e-006
> hsa-miR-548z
> 34   3e-005
> hsa-miR-548q
> 30   5e-004
> hsa-miR-548n
> 30   5e-004
> hsa-miR-548ab
> 28   0.002
> hsa-miR-548v
> 28   0.002
> hsa-miR-548c-5p
> 28   0.002
> hsa-miR-548ag
> 26   0.008
> hsa-miR-548u
> 26   0.008
> hsa-miR-548c-3p
> 26   0.008
> hsa-miR-603
> 26   0.008
> hsa-miR-548a-3p
> 26   0.008
> hsa-miR-548ac
> 24   0.033
> hsa-miR-548an
> 22   0.13
> hsa-miR-548aj
> 22   0.13
> hsa-miR-548i
> 22   0.13
> hsa-miR-548g
> 22   0.13
> hsa-miR-548j
> 22   0.13
> hsa-miR-548a-5p
> 22   0.13
>
>> hsa-miR-548aa
>            Length = 25
>
>   Score = 48.1 bits (24), Expect = 2e-009
>   Identities = 24/24 (100%)
>   Strand = Plus / Minus
>
>
> Query: 1  tggtgcaaaagtaattgtggtttt 24
>            ||||||||||||||||||||||||
> Sbjct: 25 tggtgcaaaagtaattgtggtttt 2
>
>
>> hsa-miR-548d-5p
>            Length = 22
>
>   Score = 36.2 bits (18), Expect = 9e-006
>   Identities = 18/18 (100%)
>   Strand = Plus / Plus
>
>
> Query: 7  aaaagtaattgtggtttt 24
>            ||||||||||||||||||
> Sbjct: 1  aaaagtaattgtggtttt 18
>
>
>
> in this result i could not parse my code. i think my code does not
> accept the Query header that is
> "ORB_1210001_hsa-miR-548aa#5_1" as it is in the above example blast output.
>
> kindly help me out.
>
> with regards,
> Pawan.
>
>
> On Sat, Jan 14, 2012 at 3:13 AM, Frank Schwach<fs5 at sanger.ac.uk>  wrote:
>> Hi Pawan,
>>
>> Can you show your code? Is it basically following the structure shown in
>> http://www.bioperl.org/wiki/HOWTO:SearchIO#Using_SearchIO
>> ?
>>
>> If that is the case
>>
>> $hsp->strand('query')
>>
>>
>> is exactly what you need.
>> To check if hit and query are on different strands you can do:
>>
>> if ( $hsp->strand('query')
>> * $hsp->strand('hit') == -1){
>>
>>   # do whatever you need to do if they are on opposite strands
>>
>> }
>>
>> Hope that helps
>>
>> Frank
>>
>>
>>
>>
>>
>> On 13/01/12 16:46, kakchingtabam pawankumar sharma wrote:
>>> Hi,
>>>               Using Bio::SearchIO module I am parsing the following Blast
>>> result.
>>> I have used the option- $hsp->strand('query').
>>>
>>> But I cannot get detail of alignment.
>>>
>>> I need to know if my hit is forward (Strand = Plus / Plus)
>>> or reverse ( Strand = Plus / Minus)...
>>>   Can anyone help me to get report as Plus or Minus for query  or hit.
>>>
>>> thanks in advanced.
>>>
>>> With regards,
>>> Pawan
>>>
>>>
>>>
>>> BLASTN 2.2.18 [Dec-23-2011]
>>>
>>>
>>> Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
>>> Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
>>> "Gapped BLAST and PSI-BLAST: a new generation of protein database search
>>> programs",  Nucleic Acids Res. 25:3389-3402.
>>>
>>> Query= 000013_c10079-9984
>>>           (50 letters)
>>>
>>> Database: Cyano_Probe.fasta
>>>             4760 sequences; 238,000 total letters
>>>
>>> Searching..................................................done
>>>
>>>
>>>
>>>                                                                   Score
>>>   E
>>> Sequences producing significant alignments:                      (bits)
>>> Value
>>>
>>> 000013_c10079-9984
>>> 100   7e-024
>>> 002619_2689273-2690037
>>> 24   0.36
>>> 001126_c1123720-1123385
>>> 24   0.36
>>> 003211_c3326737-3326480
>>> 22   1.4
>>> 002415_2471082-2471420
>>> 22   1.4
>>> 002269_2321276-2322463
>>> 22   1.4
>>> 001328_c1326535-1326164
>>> 22   1.4
>>>
>>>> 000013_c10079-9984
>>>            Length = 50
>>>
>>>   Score = 99.6 bits (50), Expect = 7e-024
>>>   Identities = 50/50 (100%)
>>>   Strand = Plus / Plus
>>>
>>>
>>> Query: 1  agtcaacaccaatctgagtttaatcactatcttgatcatgttagatatca 50
>>>            ||||||||||||||||||||||||||||||||||||||||||||||||||
>>> Sbjct: 1  agtcaacaccaatctgagtttaatcactatcttgatcatgttagatatca 50
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>>
>> --
>> The Wellcome Trust Sanger Institute is operated by Genome Research Limited,
>> a charity registered in England with number 1021457 and a company registered
>> in England with number 2742969, whose registered office is 215 Euston Road,
>> London, NW1 2BE.


-- 
 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 



More information about the Bioperl-l mailing list