[Bioperl-l] About extracting sequence from genewise format result
zhaoy at mail.cbi.pku.edu.cn
Wed Aug 11 08:17:42 UTC 2010
Dear authors:
Hello!
I am trying to parse genewise output in order to extract the nucleotide
sequence, using the "hit_string" method on the HSP objects returned by
"Bio::SearchIO". However, the result is empty. Worse, the loop does not
seem to work either, because I always get the last result. I'm confused.
My Perl code is shown below:
#!/usr/bin/perl -w
use strict;
use warnings;
use Bio::SearchIO;

my $in = Bio::SearchIO->new(-format   => 'wise',
                            -wisetype => 'genewise',
                            -file     => 'test');
while (my $result = $in->next_result) {
    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            print "Query=", $result->query_name, "\n",
                  "Length=", $hsp->length('total'), "\n",
                  "hit_string:", $hsp->hit_string, "\n";
        }
    }
}
One of the genewise results in the input file is shown below:
genewise $Name: wise2-4-0alpha $ (unreleased release)
This program is freely distributed under a GPL. See source directory
Copyright (c) GRL limited: portions of the code are from separate copyright
Query protein: Cpa_s110_24
Comp Matrix: BLOSUM62.bla
Gap open: 12
Gap extension: 2
Start/End global
Target Sequence Bdi_chr3:38292015..38292302
Strand: forward
Start/End (protein) global
Gene Parameter file: gene.stat
Splice site model: GT/AG only
Codon Table: codon.table
Subs error: 1e-06
Indel error: 1e-06
Null model syn
Algorithm 623
genewise output
Score 37.97 bits over entire alignment
Scores as bits over a synchronous coding model
Warning: The bits scores is not probablistically correct for single seqs
See WWW help for more info
Cpa_s110_24 1 MGNCQAVDAATLAIQHPS-GKVDRLYWPVSASEVMRTNPGHYVALLI--
MGNCQA DAA + IQHP+ GKV+RLYWP +A++VMR NPGHYVAL++
MGNCQAADAAAVVIQHPAEGKVERLYWPATAADVMRKNPGHYVALVVVH
Bdi_chr3:382920 1 agatcggggggggacccgggaggccttcgaggggacaacgctggcgggc
tgagaccaccctttaaccagatagtagcccccattgaacgaatctttta
gctcgggtggcggcgcgcgggcgcccggccgcccgcgcccccccccccc
Cpa_s110_24 47 ----STTLCPSNSNASNAESVRVTRIKLLRPTDTLVLGQVYRLITTQEV
P+ + A + R+T++KLL+P DTL++GQVYRLIT+Q
VSGGAGETDPAVAGGGAAAAARITKVKLLKPRDTLLIGQVYRLITSQ--
Bdi_chr3:382920 148 gtgggggagcgggggggggggaaaagaccaccgaccagcgtccaatc
tcggcgacacctcgggcccccgtcatattacgactttgatagttcca
cctcctgtcccacaaaattccgccgcgccgcgctgcccgccccccca
Cpa_s110_24 92 MKGLWAKKCAKMKKYQEADHKDGLKPETIPGRRSGPERDTQVAKHERHR
-------------------------------------------------
Bdi_chr3:382920 289
Cpa_s110_24 141 SRVAASTNQAGLKSRTWQPSLKSISEAAS
-----------------------------
Bdi_chr3:382920 289
//
Gene 1
Gene 1 288
Exon 1 288 phase 0
Supporting 1 54 1 18
Supporting 58 141 19 46
Supporting 160 288 47 89
//
......
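In the alignment block above, the three lowercase rows under each target
line appear to hold the first, second and third base of each codon, so
reading them column by column should recover the target DNA. For example,
the first six columns of the first block give atg ggc aat tgc cag gcg,
which matches the MGNCQA shown in the row above. Below is a minimal sketch
in plain Perl, assuming that layout, assuming the block contains no introns
or frameshifts, and assuming the rows have already been stripped of the
name and coordinate prefix. Note that interleave_codon_rows is a
hypothetical helper written for illustration, not a BioPerl function.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): rebuild the target DNA from the
# three codon rows of one genewise alignment block by reading the rows
# column by column. Assumes the name/coordinate prefix has been removed and
# that the block carries no intron or frameshift annotation.
sub interleave_codon_rows {
    my ($row1, $row2, $row3) = @_;
    my $dna = '';
    for my $i (0 .. length($row1) - 1) {
        # Each column holds the 1st, 2nd and 3rd base of one codon.
        for my $row ($row1, $row2, $row3) {
            my $base = $i < length($row) ? substr($row, $i, 1) : '';
            # Keep letters only; skip padding or gap characters.
            $dna .= $base if $base =~ /^[A-Za-z]$/;
        }
    }
    return $dna;
}

# First six columns of the first alignment block in the result above.
my $dna = interleave_codon_rows('agatcg', 'tgagac', 'gctcgg');
print "$dna\n";    # prints atgggcaattgccaggcg
```

Concatenating the strings returned for successive blocks would give the
aligned part of the target sequence, again only under the assumptions
stated above.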
Part of the output of this code is shown below:
Query=Aly_481360
Length=0
hit_string:
Query=Aly_481360
Length=0
hit_string:
......
What's wrong with my code and how can I get the correct result? I'm
looking forward to your reply.
Thanks very much!
Best regards,
Zackaly