[Bioperl-l] bp_search2gff.pl and strand information

Siddhartha Basu sidd.basu at gmail.com
Tue Oct 20 16:11:12 UTC 2009


Hi,
bp_search2gff.pl does not seem to output strand information for hit
entries when I convert a tblastn report. Here is a test case run on the
sample 'tblastn' data file from the latest BioPerl.

bp_search2gff.pl -i tblastn.out -o tblastn.gff3 -f blast -t hit \
-s tblastn_test --method match_part --addid --target --version 3

The output (note the '.' in the strand column of every row):

gi|10040111|emb|AL390796.6|AL390796	tblastn_test	match_part	7603	7671	34	.	0	ID=HAHU;Target=Sequence:HAHU 56 78;score=34
gi|10040111|emb|AL390796.6|AL390796	tblastn_test	match_part	7069	7152	33	.	0	ID=HAHU;Target=Sequence:HAHU 31 58;score=33
test6	tblastn_test	match_part	3822	3848	30	.	2	ID=HAHU;Target=Sequence:HAHU 72 80;score=30
test6	tblastn_test	match_part	1794	1814	29	.	1	ID=HAHU;Target=Sequence:HAHU 93 99;score=29

I am not sure about the expected behaviour, but I assumed the script
would report the strand of the hit, as given in the HSP module
documentation and described in the GMOD wiki
(http://gmod.org/wiki/Load_BLAST_Into_Chado#Convert_BLAST_analysis_to_GFF3).

One thing I noticed is that line 237 of the script (bp_search2gff.pl),
$feature->strand ( $proxyfor->strand * $otherf->strand);
will always produce 0 (unknown) for tblastn, because the query is a
protein and its strand is therefore 0 in a tblastn report.
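
To make the arithmetic concrete, here is a minimal standalone sketch in
plain Perl (no BioPerl needed). The helper names product_strand and
gff_symbol are hypothetical and are not taken from bp_search2gff.pl; the
only thing copied from the script is the idea of multiplying the two
strand values.

```perl
use strict;
use warnings;

# Hypothetical helper: combine query and hit strand by multiplication,
# mirroring the expression on the quoted line of the script.
sub product_strand {
    my ($query_strand, $hit_strand) = @_;
    return $query_strand * $hit_strand;
}

# Hypothetical helper: map a numeric strand (1, -1, 0) to the symbol
# used in the GFF strand column ('+', '-', '.').
sub gff_symbol {
    my ($strand) = @_;
    return $strand > 0 ? '+' : $strand < 0 ? '-' : '.';
}

# Nucleotide query (strand 1) versus protein query (strand 0),
# each combined with a hit on the plus and on the minus strand.
for my $query_strand (1, 0) {
    for my $hit_strand (1, -1) {
        my $combined = product_strand($query_strand, $hit_strand);
        printf "query=%2d hit=%2d -> strand column '%s'\n",
            $query_strand, $hit_strand, gff_symbol($combined);
    }
}
```

With a query strand of 0 the product is 0 whichever strand the hit is
on, so the strand column comes out as '.' in both cases, which matches
the output above.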

So, is this behaviour a feature or a bug? What am I missing here?

thanks, 
-siddhartha





