[Bioperl-l] convert fasta output to blast -m8?
Jason Stajich
jason.stajich at duke.edu
Tue May 24 12:17:57 EDT 2005
I think this all depends on your versions of FASTA and BioPerl: changes
to the FASTA output format broke the SearchIO::fasta parser in older
BioPerl releases. I answered a similar question recently on the list:
http://bioperl.org/pipermail/bioperl-l/2005-May/018870.html
Also, if all you need is -m8-style output, I would run fasta with the
-d 0 -m 9 options.
And if you really just want to turn FASTA output into BLAST-style tables
(which I do all the time for my own work) and want a very fast parser
for it, I wrote a simple script:
scripts/searchio/fastam9_to_table.PLS
-jason
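
For readers unfamiliar with the target format: BLAST -m8 output is one
tab-separated line per alignment with twelve columns (query id, subject
id, percent identity, alignment length, mismatches, gap openings, query
start, query end, subject start, subject end, e-value, bit score). The
following is a minimal Perl sketch of emitting such a line. It is not the
fastam9_to_table script mentioned above; the hash keys and the example
values are made up for illustration, and a real converter would fill them
in from parsed FASTA -m 9 output.

```perl
use strict;
use warnings;

# The 12 columns of BLAST -m8 tabular output, in order.
# These key names are hypothetical, chosen for this sketch only.
my @M8_COLUMNS = qw(
    query_id subject_id percent_identity alignment_length
    mismatches gap_openings q_start q_end s_start s_end
    evalue bit_score
);

# Format one alignment (a hash ref keyed by the names above) as a single
# tab-separated -m8 line. Dies if a column is missing, so an incomplete
# parse is caught early instead of being printed as a blank field.
sub m8_line {
    my ($hsp) = @_;
    for my $col (@M8_COLUMNS) {
        die "missing column '$col'\n" unless defined $hsp->{$col};
    }
    return join("\t", map { $hsp->{$_} } @M8_COLUMNS);
}

# Made-up values, for illustration only.
my %example = (
    query_id         => 'query1',
    subject_id       => 'subject1',
    percent_identity => '25.00',
    alignment_length => '300',
    mismatches       => '200',
    gap_openings     => '5',
    q_start          => '1',
    q_end            => '290',
    s_start          => '10',
    s_end            => '305',
    evalue           => '1e-05',
    bit_score        => '50.0',
);

print m8_line(\%example), "\n";
```

Failing on a missing field is a deliberate choice here: a table with
silently empty columns is harder to debug downstream than a parser that
stops at the first incomplete record.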
On May 24, 2005, at 11:14 AM, Amir Karger wrote:
> Hi.
>
> I've been asked to translate Fasta output to Blast -m8 output. I could do it
> by hand, but I have a feeling SearchIO & Writer can do this pretty easily.
> Can someone give me a couple hints?
>
> I tried running a ridiculously simple script on fasta -m9 output:
>
> use Bio::SearchIO;
> my $searchio = new Bio::SearchIO(-format => 'fasta',
>                                  -file   => 'short.out');
> while( my $result = $searchio->next_result ) {
>     print $result->query_name;
> }
>
> And I got:
>
> Use of uninitialized value in concatenation (.) or string at
> /usr/local/lib/perl5/site_perl/5.8.4/Bio/Search/HSP/GenericHSP.pm line 231,
> <GEN1> line 61.
>
> ------------- EXCEPTION -------------
> MSG: Did not specify a Query End or Query Begin -verbose 0 -
> algorithm FASTP
> -score 186.3 -hit_frame 0 -hsp_length 300 -hit_seq
> PPPPPPTAETFDSDQTSSFSDINSTTASAPTTPAPALPPASPEVRKEETHPKHSLPPLPNQFAPLPDPPQ
> HNSPPQ
> NNAPSQPQSNPFPFPIPEIPSTQSATNPFPFPVPQQQ--
> FNQAPSMGIPQQNRPLPQLPNRNNRPVPPPPPMRTTT
> EGSGVRL---PAPPPP---PRRGPAPPPPPHRHVTSNTL------
> NSAGGNSLLPQATGRRGPAPPPPPRASRPTP
> NVTMQQNPQQYNNSNRPFGYQTNSNMSSPPPPPVTTFNTLTPQMTAATGQPAVPLPQNTQAPSQATNVPV
> AP
> -hit_length 300 -query_length 300 -query_frame 0 -swscore 212 -rank 1
> -query_seq
> MYQSMTVP-PFRPYGGDDIRVVSDLSRFDYQPDQKIRSRNPTPP---
> STINDNVSSSKLTLDTIIPLY---SSKID
> ERPKYSPLRQQEDRSTQYPSPPIPVKEEPTITIPKREKKKVRYSIGVQVPQDNGGISMTNNPAPPAPVPV
> PVPAPA
> PPPPPPKDIAPRSMPYPQDINNANNLPPMPQPTSQLYPQQQLPPLPYKDSSSITSPQKRLEKKLIKQVMN
> RPVIQF
> KADRFGQNYEGEYFTISANFVIYVFEVCCSVVEIVLSSILLQRDQDI -homology_seq
> :.: : : .. ..: . . : . . : :: : ..:. :. . .:. :..
> :
> :. ::. .: :: :: ...: .:.:... : ... ...:. . :::: : : .::
> :::::
> . .. .. :.:.. .:: :.. : :: . . ..: :
> -hit_name lcl|cerevisiae|YOR181W| -bits 44.0 -query_name
> lcl|albicans|CA0100| -evalue 8.3e-05 (qs='
> STACK Bio::Search::HSP::GenericHSP::new
> /usr/local/lib/perl5/site_perl/5.8.4/Bio/Search/HSP/GenericHSP.pm:231
> STACK Bio::Search::HSP::FastaHSP::new
> /usr/local/lib/perl5/site_perl/5.8.4/Bio/Search/HSP/FastaHSP.pm:97
> STACK Bio::Factory::ObjectFactory::create_object
> /usr/local/lib/perl5/site_perl/5.8.4/Bio/Factory/ObjectFactory.pm:150
> STACK Bio::SearchIO::SearchResultEventBuilder::end_hsp
> /usr/local/lib/perl5/site_perl/5.8.4/Bio/SearchIO/SearchResultEventBuilder.pm:275
> STACK Bio::SearchIO::fasta::end_element
> /usr/local/lib/perl5/site_perl/5.8.4/Bio/SearchIO/fasta.pm:872
> STACK Bio::SearchIO::fasta::next_result
> /usr/local/lib/perl5/site_perl/5.8.4/Bio/SearchIO/fasta.pm:403
> STACK toplevel a.pl:8
>
> --------------------------------------
> lcl|albicans|CA0099|
>
> (The last thing is actually the query, so it's sort of doing the right
> thing. And line 61 of short.out, where the uninitialized value happens, is
> the beginning of the second hit.)
>
> Running bp_filter_search.pl -format fasta -score 150 on the same output file
> produced no output at all. Is -m9 confusing it? Or is there some other
> problem?
>
> Pointers to docs etc. appreciated.
>
> -Amir Karger
--
Jason Stajich
jason.stajich at duke.edu
http://www.duke.edu/~jes12/