[Bioperl-l] BioPerl parse interproscan xml not working

blpapery at gmail.com blpapery at gmail.com
Fri Nov 8 07:08:58 UTC 2013


Thanks Chris, I just filed the bug.

Ben

On Thursday, November 7, 2013 9:25:43 PM UTC-5, Christopher Fields wrote:
>
> This looks like an XML parser issue rather than a BioPerl one. The XML 
> appears to be well-formed, though, and I can reproduce the error. Can you 
> file this as a bug? We may need to switch out the backend parser; I don’t 
> believe XML::Parser is well-supported anymore. 
>
> chris 
>
> On Nov 6, 2013, at 9:35 AM, blpa... at gmail.com wrote: 
>
> > Hi all, 
> > 
> > I have been trying to use Bio::SeqIO to parse an XML interproscan result 
> > (XML version 1.0 is what interproscan outputs), 
> > but I keep getting the following error: 
> > 
> > no element found at line 24, column 0, byte 1421 at 
> > /System/Library/Perl/Extras/5.10.0/darwin-thread-multi-2level/XML/Parser.pm 
> > line 187 
> > 
> > My code is below: 
> > 
> > use strict; 
> > use warnings; 
> > use Bio::SeqIO; 
> > 
> > my $io = Bio::SeqIO->new(-format => "interpro", -file => "ipr.xml"); 
> > 
> > while (my $seq = $io->next_seq) { 
> >     print $seq->accession, "\n"; # trying to print out anything here 
> > } 
> > 
> > 
> > XML file is shown below: 
> > 
> > <?xml version="1.0" encoding="UTF-8" standalone="yes"?> 
> > <protein-matches 
> > xmlns="http://www.ebi.ac.uk/interpro/resources/schemas/interproscan5"> 
> >    <protein> 
> >        <sequence 
> > md5="d95d12290aaa87a91f47d25299cfb6ce">MKYKHLILSLSLIMLGPLAHAEEIGSVDTVFKMIGPDHKIVVEAFDDPDVKNVTCYVSRAKTGGIKGGLGLAEDTSDAAISCQQVGPIELSDRIKNGKAQGEVVFKKRTSLVFKSLQVVRFY 
> > DAKRNALAYLAYSDKVVEGSPKNAISAVPVMPWRQ</sequence> 
> >        <xref id="ecoli_3"/> 
> >        <matches> 
> >            <hmmer3-match evalue="1.0E-57" score="193.0"> 
> >                <signature ac="PF05981" desc="CreA protein" name="CreA"> 
> >                    <entry ac="IPR010292" desc="Uncharacterised protein 
> > family CreA" name="Uncharacterised_CreA" type="FAMILY"/> 
> >                    <models> 
> >                        <model ac="PF05981" desc="CreA protein" 
> > name="CreA"/> 
> >                    </models> 
> >                    <signature-library-release library="PFAM" 
> > version="27.0"/> 
> >                </signature> 
> >                <locations> 
> >                    <hmmer3-location env-end="157" env-start="24" 
> > score="192.8" evalue="1.2E-57" hmm-start="1" hmm-end="128" 
> > hmm-length="0" 
> > start="24" end="156"/> 
> >                </locations> 
> >            </hmmer3-match> 
> >        </matches> 
> >    </protein> 
> > </protein-matches> 
> > 
> > 
> > 
> > 
> > Thanks in advance for your help. 
> > 
> > Ben 
>
>
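
For anyone who lands on this thread with the same error: one way to tell a
malformed file apart from a parser or module problem is to run the XML through
a second, independent parser. The script below is a minimal sketch, not a fix
for Bio::SeqIO. It assumes the XML::LibXML module is installed (it is not part
of core Perl, and it is a different parser from the XML::Parser named in the
error message above). The helper name xml_error and the script name
check_xml.pl are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::LibXML;    # assumption: installed from CPAN; not a core module

# Hypothetical helper: returns an empty string if the XML text parses,
# otherwise the error message reported by the parser.
sub xml_error {
    my ($xml) = @_;
    my $ok = eval { XML::LibXML->load_xml(string => $xml); 1 };
    return $ok ? '' : "$@";
}

# Check every file named on the command line.
for my $file (@ARGV) {
    open my $fh, '<:raw', $file or die "Cannot open $file: $!";
    my $xml = do { local $/; <$fh> };    # slurp the whole file
    close $fh;

    my $err = xml_error($xml);
    my $msg = $err eq ''
        ? "$file: well-formed\n"
        : "$file: not well-formed: $err\n";
    print $msg;
}
```

Run it as "perl check_xml.pl ipr.xml". If the file passes here but still fails
through Bio::SeqIO, the problem is more likely in the parsing layer than in the
data, which would be consistent with what Chris describes above.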


