[Bioperl-l] problems running Bio::SearchIO on the FASTA results

Christie Robertson cpr at geospiza.com
Tue Dec 28 14:46:59 EST 2004


Great, thanks a lot, Jason!
Christie


On Tue, 28 Dec 2004, Jason Stajich wrote:

> Christie - this has to do with how the FASTA output format has changed
> with the latest releases.  The parser has been updated to handle the
> changed format; update the Bio/SearchIO/fasta.pm file from CVS or grab
> it here: http://bioperl.org/SRC/
>
> I did not put these changes on the 1.4 branch as I didn't think we'd be
> releasing off that branch, but I can merge the changes there as well if
> it will help people.
>
> -jason
>
> > Hi folks,
> >
> > I'm wondering if anybody here is currently parsing the results of
> > the FASTA program with Bio::SearchIO.  I'm running into a problem very
> > early on in the process, right at the moment of trying to parse a
> > result.
> > Here is a pared-down example program:
> >
> > >>>>>>
> >
> > use Bio::SearchIO;
> >
> > my $fastaFile = 'chWnt3_hg_Gnomon_prots_E0.001.out';
> > my $searchIO = Bio::SearchIO->new(-format => 'fasta',
> >                                   -file   => $fastaFile);
> >
> > my $result = $searchIO->next_result;
> >
> > <<<<<<<
> >
> > This program dies on the call to $searchIO->next_result() with this
> > message:
> >
> > >>>>>>>
> >
> > 1039 cpr@napa:~/fastaTest > ./bioperlFastaParseTest.pl
> > Use of uninitialized value in concatenation (.) or string at
> > /usr/lib/perl5/site_perl/5.8.0/Bio/Search/HSP/GenericHSP.pm line 231,
> > <GEN1> line 131.
> >
> > ------------- EXCEPTION  -------------
> > MSG: Did not specify a Query End or Query Begin -verbose 0 -algorithm
> > FASTP -hit_seq
> > CRNYIEIMPSVAEGVKLGIQECQHQFRGRRWNCTTIDDSLAIFGPVLDKATRESAFVHAIASAGVAFAVTR
> > SCAEGTSTICGCDSHHKGPPGEGWKWGGCSEDADFGVLVSREFADARENRPDARSAMNKHNNEAGRTTILD
> > HMHLKCKCHGLSGSCEVKTCWWAQPDFRAIGDFLKDKYDSASEMVVEKHRESRGWVETLRAKYSLFKPPTE
> > RDLVYYENSPNFCEPNPETGSFGTRDRTCNVTSHGIDGCDLLCCGRGHNTRTEKRKEKCHCIFHWCCYVSC
> > QECIRIYDVHTCK
> > -hit_length 297 -query_length 297 -query_frame 0 -rank 1 -hit_name hmm6623
> > -query_name gi|18091804|gb|AAL58093.1| -evalue 0 -score 4361.0 -hit_frame 0
> > -hsp_length 297 -swscore 3215 -query_seq
> > WNCTTIDDSLAIFGPVLDKATRESAFVHAIASAGVAFAVTRSCAEGTSTICGCDSHHKGPPGEGWKWGGCS
> > EDADFGVLVSREFADARENRPDARSAMNRHNNEAGRTTILDHMHLKCKCHGLSGSCEVKTCWWAQPDFRAI
> > GDYLKDKYDSASEMVVEKHRESRGWVETLRAKYALFKPPTERDLVYYENSPNFCEPNPETGSFGTRDRTCN
> > VTSHGIDGCDLLCCGRGHNTRTEKRKEKCHCIFHWCCYVSCQECIRVYDVHTCK
> > -homology_seq
> > :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
> > ::::::::::::::::::::::::::::.::::::::::::::::::::::::::::::::::::::::::
> > ::.::::::::::::::::::::::::::::::
> > :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
> > ::::::::::::.:::::::
> > -bits 815.4 (qs='
> > STACK Bio::Search::HSP::GenericHSP::new
> > /usr/lib/perl5/site_perl/5.8.0/Bio/Search/HSP/GenericHSP.pm:231
> > STACK Bio::Search::HSP::FastaHSP::new
> > /usr/lib/perl5/site_perl/5.8.0/Bio/Search/HSP/FastaHSP.pm:97
> > STACK Bio::Factory::ObjectFactory::create_object
> > /usr/lib/perl5/site_perl/5.8.0/Bio/Factory/ObjectFactory.pm:150
> > STACK Bio::SearchIO::SearchResultEventBuilder::end_hsp
> > /usr/lib/perl5/site_perl/5.8.0/Bio/SearchIO/SearchResultEventBuilder.pm:275
> > STACK Bio::SearchIO::fasta::end_element
> > /usr/lib/perl5/site_perl/5.8.0/Bio/SearchIO/fasta.pm:872
> > STACK Bio::SearchIO::fasta::next_result
> > /usr/lib/perl5/site_perl/5.8.0/Bio/SearchIO/fasta.pm:403
> > STACK toplevel ./bioperlFastaParseTest.pl:9
> >
> > --------------------------------------
> > 1040 cpr@napa:~/fastaTest >
> >
> > <<<<<<<
> >
> > Apparently, Bio::Search::HSP::GenericHSP expects Query End and Query
> > Begin to be set, and isn't getting them.  Out of curiosity, I commented
> > out the die line (231) in GenericHSP.pm, and then the module dies on the
> > next line, looking for Hit Begin and Hit End.  Did the FASTA output
> > format get out of sync with SearchIO?  Am I missing something?
> >
> > I am attaching my output file.
> >
> > Thanks for any help!
> >
> > Christie
> >
> >
> > ~~~~~~~~~~~~~~~~~~~~~~~~~
> > Christie P Robertson, PhD
> > Research Associate
> > Geospiza, Inc.
> >
> > cpr at geospiza.com
> > (206)633-4403
> > ~~~~~~~~~~~~~~~~~~~~~~~~~
> > -------------- next part --------------
> > # fasta chWnt3.fasta /usr/local/data/hg_Gnomon_prots.fsa 1 -E 0.001 -Q -s P20
> > FASTA searches a protein or DNA sequence data bank
> >  version 3.4t24 July 21, 2004
> --
> Jason Stajich
> jason.stajich at duke.edu
> http://www.duke.edu/~jes12/
>
>
>

