[Bioperl-l] How to Handle Parse Errors
Hilmar Lapp
hlapp at gnf.org
Sat Jul 5 14:20:07 EDT 2003
Aaron's suggestion was to warn and skip the feature, right?
I agree with this. Handling operators in the location line that carry
feature semantics requires more than a quick fix.
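
For the archive, here is a minimal sketch of the warn-and-skip idea. It is
not the actual BioPerl fix: parse_location, split_top_level and
build_features are hypothetical stand-ins written for illustration, and the
toy parser only understands single positions, ranges and join(). The point
is that a failed location parse is caught, reported, and the feature is
dropped before anything tries to call a method on an undefined value.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a comma-separated list, ignoring commas nested inside parentheses.
sub split_top_level {
    my ($str) = @_;
    my ($depth, $buf, @parts) = (0, '');
    for my $ch (split //, $str) {
        $depth++ if $ch eq '(';
        $depth-- if $ch eq ')';
        if ($ch eq ',' && $depth == 0) {
            push @parts, $buf;
            $buf = '';
            next;
        }
        $buf .= $ch;
    }
    push @parts, $buf;
    return @parts;
}

# Toy location parser (hypothetical): returns [start, end] or dies.
# Any operator other than join(), e.g. bond(), raises an exception.
sub parse_location {
    my ($loc) = @_;
    return [$1, $1] if $loc =~ /^(\d+)$/;
    return [$1, $2] if $loc =~ /^(\d+)\.\.(\d+)$/;
    if ($loc =~ /^(\w+)\((.*)\)$/) {
        my ($op, $inner) = ($1, $2);
        die qq{operator "$op" unrecognized by parser\n} unless $op eq 'join';
        my @sub = map { parse_location($_) } split_top_level($inner);
        my ($start) = sort { $a <=> $b } map { $_->[0] } @sub;
        my ($end)   = sort { $b <=> $a } map { $_->[1] } @sub;
        return [$start, $end];
    }
    die qq{cannot parse location "$loc"\n};
}

# Build features, warning about and skipping any whose location fails.
sub build_features {
    my (@raw) = @_;
    my @features;
    for my $f (@raw) {
        my $loc = eval { parse_location($f->{location}) };
        unless (defined $loc) {
            warn "ignoring feature $f->{key}: $@";
            next;    # never hand an undefined feature to the caller
        }
        push @features,
            { key => $f->{key}, start => $loc->[0], end => $loc->[1] };
    }
    return @features;
}

my @features = build_features(
    { key => 'Region', location => '1..57' },
    { key => 'Het',    location => 'join(bond(201),bond(203),bond(204))' },
    { key => 'Site',   location => 'join(10..12,40)' },
);
print "$_->{key} $_->{start}..$_->{end}\n" for @features;
```

Run on the three sample features, this warns once about Het on STDERR and
prints the Region and Site features, so the rest of the record is still
usable.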
-hilmar
On Friday, July 4, 2003, at 07:25 AM, Heikki Lehvaslaiho wrote:
> dmcwilli,
>
> This is bug report #1371. I'll do the fixes Aaron suggests now so that
> we get the fixes into the bioperl release next week. If someone wants to
> do something more clever with undocumented feature keys, feel free, but
> only in the cvs head.
>
> Thanks for reminding me of this,
>
> -Heikki
>
> On Fri, 2003-07-04 at 14:28, dmcwilli wrote:
>> There was a question like this in May, I think, but I have been unable
>> to find help for this in the FAQ or recent postings.
>>
>> I am trying to parse GenBank records and find those which have the
>> Feature /region_name="Transit peptide". I did a broad Entrez search
>> and downloaded the results, so I'm accessing the file locally. The
>> parser fails and exits the script prematurely when it encounters a
>> record with the Feature "Het" with the message:
>>
>> -------------------- WARNING ---------------------
>> MSG: exception while parsing location line
>> [join(bond(201),bond(203),bond(204),bond(204),bond(204),bond(204))] in
>> reading EMBL/GenBank/SwissProt, ignoring feature Het (seqid=8RUC_G):
>> ------------- EXCEPTION -------------
>> MSG: operator "bond" unrecognized by parser
>> STACK Bio::Factory::FTLocationFactory::from_string /usr/lib/perl5/site_perl/5.8.0/Bio/Factory/FTLocationFactory.pm:160
>> STACK Bio::Factory::FTLocationFactory::from_string /usr/lib/perl5/site_perl/5.8.0/Bio/Factory/FTLocationFactory.pm:157
>> STACK (eval) /usr/lib/perl5/site_perl/5.8.0/Bio/SeqIO/FTHelper.pm:124
>> STACK Bio::SeqIO::FTHelper::_generic_seqfeature /usr/lib/perl5/site_perl/5.8.0/Bio/SeqIO/FTHelper.pm:123
>> STACK Bio::SeqIO::genbank::next_seq /usr/lib/perl5/site_perl/5.8.0/Bio/SeqIO/genbank.pm:396
>> STACK toplevel ./biopl5.pl:20
>> --------------------------------------
>> ---------------------------------------------------
>> Can't call method "primary_tag" on an undefined value at
>> /usr/lib/perl5/site_perl/5.8.0/Bio/SeqIO/genbank.pm line 400, <GEN0>
>> line 23630.
>> # end of message
>>
>> My code is:
>>
>> #!/usr/bin/perl
>> #
>> # tpfilter.pl
>> # Get transit peptides from files in genbank format. Uses BioPerl
>> # David R. McWilliams dmcwilli at utk.edu
>> # 04-Jul-03
>>
>> use strict;
>> use warnings ;
>> use Bio::SeqIO;
>> use Bio::Seq;
>>
>> my $file = shift @ARGV;
>> my $in = new Bio::SeqIO(-format => 'genbank', -file => $file);
>>
>> my $datetime = scalar(localtime()) ;
>> print "# Output of $0 on $file.\n" ;
>> print "# $datetime\n" ;
>>
>> my $fnd = 0 ;
>> while( my $seq = $in->next_seq ) {
>>     foreach my $feature ( $seq->get_SeqFeatures ) {
>>         if( $feature->primary_tag eq 'Region' ) {
>>             if( $feature->has_tag('region_name') ) {
>>                 my ($tag) = $feature->get_tag_values('region_name') ;
>>                 if( $tag =~ /transit|signal/i ) {
>>                     $fnd++ ;
>>                     print ">", $seq->display_id(), "|",
>>                         "tp=", $feature->start, "\.\.", $feature->end, "|",
>>                         $seq->species->binomial(), "|",
>>                         $seq->description(), "\n";
>>                     print $seq->subseq($feature->start, $feature->end), "\n" ;
>>                 }
>>             }
>>         }
>>     }
>> }
>> print "# Found $fnd seqs w/ tp.\n" ;
>>
>> # end code
>>
>> If I remove the offending records by hand, this works fine. So, is
>> there a way to continue to parse the offending records, even though
>> the parser does not recognize this particular feature, or is there a
>> way to catch the error and skip the record without aborting the rest
>> of the script?
>>
>> Regards,
>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at portal.open-bio.org
>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
> --
> ______ _/ _/_____________________________________________________
> _/ _/ http://www.ebi.ac.uk/mutations/
> _/ _/ _/ Heikki Lehvaslaiho heikki_at_ebi ac uk
> _/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute
> _/ _/ _/ Wellcome Trust Genome Campus, Hinxton
> _/ _/ _/ Cambs. CB10 1SD, United Kingdom
> _/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
> ___ _/_/_/_/_/________________________________________________________
>
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------