[Bioperl-l] RE:Segmentation fault
stephan rosecker
stephan.rosecker at ish.de
Mon Apr 12 14:01:29 EDT 2004
> I really have no idea what you are doing....
:-)
I got an protein-profile-file build with hmmer. (nst.hmms).
So i use hmmpfam with a flatfile database "uniprot_sprot.dat" to find
proteins compatible with my profile.
(hmmpfam -A0 -E1e-30 nst.hmms uniprot_sprot.dat > hmmerresultfile.out)
So each result contains a hit or more is what i need.
This for example is good: (hmmerresultfile.out)
//
Query sequence: 128U_DROME
Accession: P32234
Description: GTP-binding protein 128UP.
Scores for sequence family classification (score includes all domains):
Model Description Score E-value N
-------- ----------- ----- ------- ---
Drg 639.4 1.6e-191 1
Parsed for domains:
Model Domain seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
Drg 1/1 4 367 .. 1 375 [] 639.4 1.6e-191
Alignments of top-scoring domains:
Drg: domain 1 of 1, from 4 to 367: score 639.4, E = 1.6e-191
*->GIeEkIkaIEeEiarTqKNKATEyHLGlLKaKLAkLReQLlEpskgg
I+EkI aIE E+arTqKNKAT HLGlLKa AkLR +L s +g
128U_DROME 4 -ILEKISAIESEMARTQKNKATSAHLGLLKANVAKLRRELI--SPKG 47
gGGG...kGFeVeKSGdarvaLIGfPSVGKStLLakLTntkSEvAdYaFT
gGGG+++ GFeV+K Gdarv ++GfPSVGKStLL+ L + SEvA+Y+FT
128U_DROME 48 GGGGtgeAGFEVAKTGDARVGFVGFPSVGKSTLLSNLAGVYSEVAAYEFT 97
TLTcvPGvLeYqgakIQllDlPGIIeGAssGKGRGrqVlAvarsADLvLm
TLT vPG ++Y+gakIQllDlPGIIeGA GKGRGrqV+Avar+ L+ m
128U_DROME 98 TLTTVPGCIKYKGAKIQLLDLPGIIEGAKDGKGRGRQVIAVARTCNLIFM 147
vLDatkleehrdvLekELEnVGIRLnkrkGPniyfKkKetGGvsvngTgp
vLD k+ +h +Le ELE GIRLnk++ Pniy+K K GG++ n+ +p
128U_DROME 148 VLDCLKPLGHKKLLEHELEGFGIRLNKKP-PNIYYKRKDKGGINLNSMVP 196
PLThidEdtvrrILheYRIhNAeVliReDqvTlDdFIDVVnegNRvYipc
++d d+v++IL+eY+IhNA++ R D +T Dd IDV+ egNR Yipc
128U_DROME 197 -QSELDTDLVKTILSEYKIHNADITLRYD-ATSDDLIDVI-EGNRIYIPC 243
LyVyNKIDlvgieevdrlarkrlEiGpntIVvScnmglnLdelaerlwek
+y +NKID+++iee+d + + p + +S+ n d+l+e +we
128U_DROME 244 IYLLNKIDQISIEELDVIYKI-----PHCVPISAHHHWNFDDLLELMWEY 288
LglvRVYtKkqGkePDFDPLvepLilRrGsT.VeDvChhiHrDLvrrFKY
L l R+YtK++G+ PD+ p +l + T++eD+C+++Hr +++FKY
128U_DROME 289 LRLQRIYTKPKGQLPDYN---SPVVLHNERTsIEDFCNKLHRSIAKEFKY 335
AlVWGkSaKhppQrVGleHvleDeDvvqivvK<-*
AlVWG+S+Kh pQ+VG+eHvl DeDvvqiv+K
128U_DROME 336 ALVWGSSVKHQPQKVGIEHVLNDEDVVQIVKK 367
//
This for example is useless: (hmmerresultfile.out)
//
Query sequence: 10KD_VIGUN
Accession: P18646
Description: 10 kDa protein precursor (Clone PSAS10).
Scores for sequence family classification (score includes all domains):
Model Description Score E-value N
-------- ----------- ----- ------- ---
[no hits above thresholds]
Parsed for domains:
Model Domain seq-f seq-t hmm-f hmm-t score E-value
-------- ------- ----- ----- ----- ----- ----- -------
[no hits above thresholds]
Alignments of top-scoring domains:
[no hits above thresholds]
//
best regards,
stephan
>
>
> If you want to initialize a SearchIO parser to read in HMMER output
> results you do
>
> my $in = Bio::SearchIO(-file => 'hmmerresultfile.out',
> -format => 'hmmer');
>
> # an loop through each result (a single result per query sequence)
> while( my $result = $in->next_result ) {
> # print the query name
> # for hmmpfam this will be each sequence in the input set
> # for hmmsearch this will be the name of the HMM file you
> # used to search
> print $result->query_name, "\n";
>
> # now print all the hits (for HMMPFAM this will be the Pfam domains
> # for hmmsearch this will be the sequences in the db that had good
> # alignments
> while ( my $hit = $result->next_hit ) {
> # print the name, indented with a tab
> print "\t", $hit->name, "\n";
> }
> }
More information about the Bioperl-l
mailing list