[Bioperl-l] Fw: bad entries in interpro again (fwd)

Jared Fox jaredfox at ucla.edu
Mon Dec 6 22:59:41 EST 2004


The problem with Interpro XML is that there are entries like:

 <match id="SSF46785" name=""Winged helix" DNA-binding domain"
dbname="SUPERFAMILY">

or

<match id="SSF55486" name="Metalloproteases ("zincins"), catalytic domain"
dbname="SUPERFAMILY">

The double quotes are supposed to mark the beginning and end of the name
attribute, but these values contain unescaped double quotes of their own,
so the XML is not well-formed and the parser fails on it. I believe the
same thing happens with other characters that must be escaped in XML.

If Interpro were to start producing well-formed XML (for example, by
writing double quotes inside attribute values as &quot;), everything
should parse without trouble.
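
To make the difference concrete, here is a minimal sketch in Python (not
BioPerl, and not anything Interpro actually runs) that feeds the first
entry above to an expat-based parser, once as it appears in the file and
once with the inner quotes escaped. The helper name is_well_formed is made
up for this example, and I have closed the element with "/>" so the
fragment can be parsed on its own.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# The entry as quoted above: the inner double quotes end the attribute early.
bad = ('<match id="SSF46785" name=""Winged helix" DNA-binding domain" '
       'dbname="SUPERFAMILY"/>')

# The same entry with the inner quotes written as &quot;.
name = '"Winged helix" DNA-binding domain'
escaped = escape(name, {'"': "&quot;"})
good = f'<match id="SSF46785" name="{escaped}" dbname="SUPERFAMILY"/>'

print(is_well_formed(bad))
print(is_well_formed(good))
print(ET.fromstring(good).get("name"))
```

The escaped form parses, and the parser hands back the original name with
its quotes intact, so nothing is lost by escaping on the producer side.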

> ---------- Forwarded message ----------
> Date: Wed, 01 Dec 2004 16:16:46 +0000
> From: Mikko Arvas <Mikko.Arvas at vtt.fi>
> To: bioperl-l at portal.open-bio.org, Hilmar Lapp <hlapp at gmx.net>,
>     Allen Day <allenday at ucla.edu>
> Subject: bad entries in interpro again
>
> Hi,
>
> we've been discussing the problems of interpro parsing. I have a friend
> who is going to interpro consortium meeting next week and I could send some
> regards through him. After reading your e-mails, I am (being quite a
> newbie) a little bit confused of what kind of regards would you like to
> send if any?
>
> Is the &apos the source of the problem? Is it really a problem in BioPerl
> or in expat? Is somebody trying to solve the problem for Bioperl now
> and is there any sensible thing that the interpro team could do to help?
>
> Cheers,
> mikko
>
> Mikko Arvas
> VTT Biotechnology
>
> e-mail:            mikko.arvas at vtt.fi
> tel:                 +358-(0)9-456 5827
> mobile:           +358-(0)44-381 0502
> fax:                +358-(0)9-455 2103
> mail:               Tietotie 2, Espoo
>                       P.O. Box 1500
>                       FIN-02044 VTT, Finland
> 


