[Bioperl-l] How to Handle Parse Errors
Heikki Lehvaslaiho
heikki at ebi.ac.uk
Fri Jul 4 15:24:45 EDT 2003
dmcwilli,
This is bug report #1371. I'll do the fixes Aaron suggests now so that
we get them into the bioperl release next week. If someone wants to do
something more clever with undocumented feature keys, feel free, but
only in the CVS head.
Thanks for reminding me of this,
-Heikki
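
Until a fixed release is installed, the general Perl way to keep a
script running past a record that makes the parser die is to wrap each
read in eval and check $@. The following is only a minimal sketch of
that pattern, not a tested BioPerl recipe: MockReader, has_more and
next_record are hypothetical stand-ins invented for illustration (they
are not part of BioPerl), and whether a real Bio::SeqIO stream resumes
cleanly at the next record after such an exception is an assumption
that would need checking against the installed version.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# MockReader is a hypothetical stand-in for a record parser such as
# Bio::SeqIO; it is NOT part of BioPerl. next_record() dies on a record
# it cannot parse, and the following call moves on to the next record.
package MockReader;

sub new {
    my ($class, @records) = @_;
    return bless { records => [@records] }, $class;
}

# Number of records still waiting to be read (0 when exhausted).
sub has_more {
    my ($self) = @_;
    return scalar @{ $self->{records} };
}

# Return the next record, or die if it contains a "bond(" operator.
sub next_record {
    my ($self) = @_;
    my $rec = shift @{ $self->{records} };
    die "operator \"bond\" unrecognized by parser\n" if $rec =~ /bond\(/;
    return $rec;
}

package main;

my $reader = MockReader->new('1..20', 'join(bond(201),bond(203))', '5..42');
my (@parsed, @skipped);

while ($reader->has_more) {
    # eval traps the die; on failure it returns undef and sets $@.
    my $rec = eval { $reader->next_record };
    if (!defined $rec) {
        chomp(my $err = $@);
        push @skipped, $err;    # remember why the record was skipped
        next;                   # carry on with the next record
    }
    push @parsed, $rec;
}

print "parsed: @parsed\n";
print "skipped: ", scalar(@skipped), "\n";
```

In a real script the eval would go around the call that reads a
record, and the saved messages could be printed at the end so that
skipped records are not lost silently.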
On Fri, 2003-07-04 at 14:28, dmcwilli wrote:
> There was a question like this in May, I think, but I have been unable
> to find help for this in the FAQ or recent postings.
>
> I am trying to parse GenBank records and find those which have the
> Feature /region_name="Transit peptide". I did a broad Entrez search
> and downloaded the results, so I'm accessing the file locally. The
> parser fails and exits the script prematurely when it encounters a record
> with the Feature "Het" with the message:
>
> -------------------- WARNING ---------------------
> MSG: exception while parsing location line
> [join(bond(201),bond(203),bond(204),bond(204),bond(204),bond(204))] in
> reading EMBL/GenBank/SwissProt, ignoring feature Het (seqid=8RUC_G):
> ------------- EXCEPTION -------------
> MSG: operator "bond" unrecognized by parser STACK
> Bio::Factory::FTLocationFactory::from_string
> /usr/lib/perl5/site_perl/5.8.0/Bio/Factory/FTLocationFactory.pm:160
> STACK Bio::Factory::FTLocationFactory::from_string
> /usr/lib/perl5/site_perl/5.8.0/Bio/Factory/FTLocationFactory.pm:157
> STACK (eval) /usr/lib/perl5/site_perl/5.8.0/Bio/SeqIO/FTHelper.pm:124
> STACK Bio::SeqIO::FTHelper::_generic_seqfeature
> /usr/lib/perl5/site_perl/5.8.0/Bio/SeqIO/FTHelper.pm:123 STACK
> Bio::SeqIO::genbank::next_seq
> /usr/lib/perl5/site_perl/5.8.0/Bio/SeqIO/genbank.pm:396 STACK toplevel
> ./biopl5.pl:20
> --------------------------------------
> ---------------------------------------------------
> Can't call method "primary_tag" on an undefined value at
> /usr/lib/perl5/site_perl/5.8.0/Bio/SeqIO/genbank.pm line 400, <GEN0>
> line 23630.
> # end of message
>
> My code is:
>
> #!/usr/bin/perl
> #
> # tpfilter.pl
> # Get transit peptides from files in genbank format. Uses BioPerl
> # David R. McWilliams dmcwilli at utk.edu
> # 04-Jul-03
>
> use strict;
> use warnings ;
> use Bio::SeqIO;
> use Bio::Seq;
>
> my $file = shift @ARGV;
> my $in = new Bio::SeqIO(-format => 'genbank', -file => $file);
>
> my $datetime = scalar(localtime()) ;
> print "# Output of $0 on $file.\n" ;
> print "# $datetime\n" ;
>
> my $fnd = 0 ;
> while( my $seq = $in->next_seq ) {
>     foreach my $feature ( $seq->get_SeqFeatures ) {
>         if( $feature->primary_tag eq 'Region' ) {
>             if( $feature->has_tag('region_name') ) {
>                 my ($tag) = $feature->get_tag_values('region_name') ;
>                 if( $tag =~ /transit|signal/i ) {
>                     $fnd++ ;
>                     print ">", $seq->display_id(), "|",
>                           "tp=", $feature->start, "\.\.", $feature->end, "|",
>                           $seq->species->binomial(), "|",
>                           $seq->description(), "\n";
>                     print $seq->subseq($feature->start, $feature->end), "\n" ;
>                 }
>             }
>         }
>     }
> }
> print "# Found $fnd seqs w/ tp.\n" ;
>
> # end code
>
> If I remove the offending records by hand, this works fine. So, is
> there a way to continue to parse the offending records, even though
> the parser does not recognize this particular feature, or is there a
> way to catch the error and skip the record without aborting the rest
> of the script?
>
> Regards,
>
>
--
______ _/ _/_____________________________________________________
_/ _/ http://www.ebi.ac.uk/mutations/
_/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk
_/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute
_/ _/ _/ Wellcome Trust Genome Campus, Hinxton
_/ _/ _/ Cambs. CB10 1SD, United Kingdom
_/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________