[Bioperl-l] Re: bad entries in interpro again
Hilmar Lapp
hlapp at gnf.org
Thu Dec 2 16:45:52 EST 2004
On Dec 2, 2004, at 6:04 AM, Dave Howorth wrote:
> The file contains many lines identical to the one cited, which are all
> valid XML in accordance with the Interpro DTD, but none are line 2! So
> it looks like different data has been passed to XML::Parser.
Well, yes, you can't translate the line# given by the error message
into line# in the source file. SeqIO::interpro chops up the input at
<protein>...</protein> and then passes each chunk to the XML::Parser
instance.
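To illustrate why the numbers differ, here is a minimal standalone sketch. It is not the actual SeqIO::interpro code: the sample input, the regex-based chunking, and the helper name source_line are all assumptions made for illustration. It records the source line on which each <protein>...</protein> chunk starts, so that a chunk-relative line number can be translated back into a source line number.

```perl
use strict;
use warnings;

# Hypothetical sample input; real InterPro files are far larger.
my $source = <<'XML';
<?xml version="1.0"?>
<interpromatch>
<protein id="P1">
  <match id="M1"/>
</protein>
<protein id="P2">
  <match id="M2"/>
  <match id="M3"/>
</protein>
</interpromatch>
XML

# Collect each <protein>...</protein> chunk together with the source
# line on which it starts.
my @chunks;
while ($source =~ m{(<protein\b.*?</protein>)}gs) {
    my $chunk  = $1;
    my $prefix = substr($source, 0, $-[1]);   # text before the chunk
    my $start  = 1 + ($prefix =~ tr/\n//);    # newlines before chunk + 1
    push @chunks, { text => $chunk, start_line => $start };
}

# Translate a chunk-relative line number (as a parser fed only the
# chunk would report it) into a line number in the source.
sub source_line {
    my ($chunk, $chunk_line) = @_;
    return $chunk->{start_line} + $chunk_line - 1;
}

for my $c (@chunks) {
    printf "chunk starts at source line %d; its line 2 is source line %d\n",
        $c->{start_line}, source_line($c, 2);
}
```

Every chunk has its own "line 2", which is why an error reported at line 2 can correspond to many different places in the file.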
There is no other editing of the chunks going on, though, except for a
haphazard substitution of certain double-quotes. In order to see the
chunk before it gets sent to the parser instance, edit
Bio/SeqIO/interpro.pm and before the line
$self->parse_xml($xml_fragment);
put a print statement that prints out the content of $xml_fragment.
That should also give the exact source XML that trips up the parser.
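As a rough sketch of such a print statement, the helper below writes a chunk to a filehandle with chunk-relative line numbers, so the output can be compared with the line number in the parser's error message. The name dump_fragment and the stand-in fragment are hypothetical; within Bio/SeqIO/interpro.pm a plain print STDERR $xml_fragment; placed before the parse_xml call would serve the same purpose.

```perl
use strict;
use warnings;

# Hypothetical helper: print a chunk with chunk-relative line numbers
# and return the number of lines printed.
sub dump_fragment {
    my ($fh, $xml_fragment) = @_;
    my $n = 0;
    print {$fh} "----- fragment start -----\n";
    for my $line (split /\n/, $xml_fragment) {
        printf {$fh} "%4d: %s\n", ++$n, $line;
    }
    print {$fh} "----- fragment end -----\n";
    return $n;
}

# Stand-in for the chunk that would be passed to parse_xml.
my $xml_fragment = qq{<protein id="P1">\n  <match id="M1"/>\n</protein>};

# Write to STDERR so the dump does not mix with regular output.
dump_fragment(\*STDERR, $xml_fragment);
```

Redirecting STDERR to a file (for example with 2> chunks.log in the shell) makes it easier to find the last chunk printed before the parser died.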
-hilmar
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------