[BioRuby] BLAST hit, which strand

Naohisa GOTO ngoto at gen-info.osaka-u.ac.jp
Fri Apr 25 06:23:23 UTC 2008


Hi,

Your BLAST result is probably in XML format (the -m 7 option).
In that case you can use the query_frame and hit_frame methods.
In most cases, the start and end positions of the query and hit
can also be used to distinguish strands.
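
As a rough sketch of both approaches, the plain-Ruby helpers below take
the integers found in an HSP and report whether query and hit lie on the
same or on opposite strands. The helper names are made up for
illustration and are not BioRuby methods. In BioRuby the frame values
would come from hsp.query_frame and hsp.hit_frame; I am assuming the
positions are available through similarly named accessors (query_from,
query_to, hit_from, hit_to). Only the relative orientation is reported,
because the XML does not reliably say which of the two sequences is the
reversed one.

```ruby
# Hypothetical helpers for illustration; these are NOT BioRuby methods.
# The arguments are the integers from <Hsp_query-frame>, <Hsp_hit-frame>,
# <Hsp_query-from>, <Hsp_query-to>, <Hsp_hit-from> and <Hsp_hit-to>.

# Compare the signs of the two frames.
# Returns :same, :opposite, or nil when a frame is 0 (no strand information).
def relative_strand_by_frame(query_frame, hit_frame)
  return nil if query_frame.zero? || hit_frame.zero?
  (query_frame > 0) == (hit_frame > 0) ? :same : :opposite
end

# Compare the direction (ascending or descending) of the two coordinate pairs.
# Returns :same, :opposite, or nil when a range has length 1 (no direction).
def relative_strand_by_position(query_from, query_to, hit_from, hit_to)
  query_dir = query_to <=> query_from
  hit_dir   = hit_to <=> hit_from
  return nil if query_dir.zero? || hit_dir.zero?
  query_dir == hit_dir ? :same : :opposite
end

# Values taken from the XML example further down in this mail.
p relative_strand_by_frame(1, -1)
p relative_strand_by_position(55, 1, 21, 75)
```

Both calls report :opposite for that HSP, which agrees with the
"Strand = Plus / Minus" line of the default output.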

Unlike the default format, NCBI BLAST XML does not state the
query/hit strand information explicitly. That is why the BLAST XML
output parser has no query_strand and hit_strand methods.

In addition, it is confusing that the alignment shown in NCBI BLAST XML
is often in the opposite orientation to that of the default output.

For example, below are the results with the same query and database
(blastall 2.2.15 on Debian etch).

The query sequence:
>query
cgatcgatcgatagctaggcgactagcatatctaacatcgtacacatactggcat

The hit sequence in database:
>test
nnnnnnnnnnnnnnnnnnnn
atgccagtatgtgtacgatgttagatatgctagtcgcctagctatcgatcgatcg

default output:
---------------------------------------------------------------------
 Score =  109 bits (55), Expect = 4e-30
 Identities = 55/55 (100%)
 Strand = Plus / Minus


Query: 1  cgatcgatcgatagctaggcgactagcatatctaacatcgtacacatactggcat 55
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 75 cgatcgatcgatagctaggcgactagcatatctaacatcgtacacatactggcat 21

---------------------------------------------------------------------

XML output:
---------------------------------------------------------------------
<Hsp>
  <Hsp_num>1</Hsp_num>
  <Hsp_bit-score>109.522</Hsp_bit-score>
  <Hsp_score>55</Hsp_score>
  <Hsp_evalue>3.75451e-30</Hsp_evalue>
  <Hsp_query-from>55</Hsp_query-from>
  <Hsp_query-to>1</Hsp_query-to>
  <Hsp_hit-from>21</Hsp_hit-from>
  <Hsp_hit-to>75</Hsp_hit-to>
  <Hsp_query-frame>1</Hsp_query-frame>
  <Hsp_hit-frame>-1</Hsp_hit-frame>
  <Hsp_identity>55</Hsp_identity>
  <Hsp_positive>55</Hsp_positive>
  <Hsp_align-len>55</Hsp_align-len>
  <Hsp_qseq>ATGCCAGTATGTGTACGATGTTAGATATGCTAGTCGCCTAGCTATCGATCGATCG</Hsp_qseq>
  <Hsp_hseq>ATGCCAGTATGTGTACGATGTTAGATATGCTAGTCGCCTAGCTATCGATCGATCG</Hsp_hseq>
  <Hsp_midline>|||||||||||||||||||||||||||||||||||||||||||||||||||||||</Hsp_midline>
</Hsp>
---------------------------------------------------------------------

Both <Hsp_qseq> and <Hsp_hseq> are the reverse complement of
those in the default output, and <Hsp_qseq> is the
reverse complement of the query sequence.
The XML shows <Hsp_query-frame> = 1 and <Hsp_hit-frame> = -1,
which I think is the opposite of what one would expect.
It might be a problem in NCBI BLAST.
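
If you need the XML values in the same orientation as the default
output, one possible workaround is to flip such an HSP yourself. The
sketch below is plain Ruby with hypothetical helper names (nothing here
is a BioRuby method), and it assumes the HSP values have already been
copied into a Hash. It only handles the case shown above, where the
query coordinates are descending, and it leaves the frame values alone.

```ruby
# Hypothetical helpers for illustration; these are NOT BioRuby methods.

# Reverse complement of a nucleotide string; characters other than
# a/c/g/t (for example n or the gap character "-") are kept as they are.
def reverse_complement(seq)
  seq.reverse.tr('acgtACGT', 'tgcaTGCA')
end

# Flip an HSP whose query coordinates are descending so that the query
# runs from low to high, as in the default output. The input Hash is not
# modified; a new Hash is returned.
def orient_like_default(hsp)
  return hsp if hsp[:query_from] <= hsp[:query_to]
  hsp.merge(
    query_from: hsp[:query_to], query_to: hsp[:query_from],
    hit_from:   hsp[:hit_to],   hit_to:   hsp[:hit_from],
    qseq:       reverse_complement(hsp[:qseq]),
    hseq:       reverse_complement(hsp[:hseq]),
    midline:    hsp[:midline].reverse
  )
end

# Values copied from the XML example above.
xml_hsp = {
  query_from: 55, query_to: 1, hit_from: 21, hit_to: 75,
  qseq: 'ATGCCAGTATGTGTACGATGTTAGATATGCTAGTCGCCTAGCTATCGATCGATCG',
  hseq: 'ATGCCAGTATGTGTACGATGTTAGATATGCTAGTCGCCTAGCTATCGATCGATCG',
  midline: '|' * 55
}

flipped = orient_like_default(xml_hsp)
puts "Query: #{flipped[:query_from]} #{flipped[:qseq].downcase} #{flipped[:query_to]}"
puts "Sbjct: #{flipped[:hit_from]} #{flipped[:hseq].downcase} #{flipped[:hit_to]}"
```

After flipping, the query runs from 1 to 55 and the hit from 75 to 21,
as in the default output.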

Regards,

Naohisa Goto
ngoto at gen-info.osaka-u.ac.jp / ng at bioruby.org


On Thu, 24 Apr 2008 14:08:00 +0200
Marc Hoeppner <marc.hoeppner at molbio.su.se> wrote:

> Hi,
> 
> I have a question concerning BLAST::RESULT objects in BioRuby.
> I followed the tutorial and available documentation to integrate blast 
> into my application, but I noticed that pretty much everything is 
> returned from BLAST except the strand a particular hit or hsp was 
> identified on.
> It is available for objects of the class 
> Bio::Blast::Default::Report::HSP (according to the API documentation 
> anyway) - but what I get back when I run BLAST according to the tutorial 
> is an object of the class Bio::Blast::Report::HSP - which does not 
> return the 'hit_strand'.
> 
> Is it me, am I missing something or is this a problem with BioRuby?
> 
> Cheers,
> Marc
> 
> -- 
> 
> Marc P. Hoeppner
> PhD student
> Department of Molecular Biology and Functional Genomics
> Stockholm University, 10691 Stockholm, Sweden
> 
> marc.hoeppner at molbio.su.se
> Tel: +46 (0)8 - 164195
> 


