[Bioperl-l] query name in xml blast report

Jason Stajich jason@cgt.mc.duke.edu
Mon, 21 Oct 2002 07:38:08 -0400 (EDT)


On Mon, 21 Oct 2002, gert thijs wrote:

> Hi,
>
> I am parsing blast reports in XML format with bioperl and this works fine
> except for the extraction of the query ID and description line. If I ask for
> the query name ($r->query_name) and description ($r->query_description), I
> get empty strings rather than the values in the respective fields of the xml
> file.

You get nothing in both fields?  That is very strange.  Can you please say
which version of bioperl you are using?  I explicitly test this in
t/SearchIO.t, and I get the proper values out of a blastxml report in
all the local testing that I have tried.

The quotes should also be converted back by the decode_entities call
(from HTML::Entities) in the parser code, so I am confused about why it
isn't working for you.
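
For reference, here is a minimal sketch of what that decoding step should do
to the description string from the report quoted below. It assumes
HTML::Entities is installed; the helper name decode_query_def is made up for
illustration and is not a bioperl function.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Entities qw(decode_entities);

# Hypothetical helper (not part of bioperl): return a copy of the query
# definition with character entities such as &apos; decoded.
sub decode_query_def {
    my ($raw) = @_;
    # Called in scalar context, decode_entities returns a decoded copy.
    my $decoded = decode_entities($raw);
    return $decoded;
}

# BlastOutput_query-def value from the quoted report; the line wrap
# introduced by the mail client is joined with a single space here.
my $raw = 'AF059581|INCLUSive|gene|47|1504|1|+|.|id SAHH  ; '
        . 'number 1  ; query &apos;AF059581 - SAHH&apos;;';

print decode_query_def($raw), "\n";
```

If this prints the description with plain single quotes, the entity decoding
itself is fine and the empty strings are more likely coming from somewhere
else, for example an older bioperl version.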

> In the XML file I find the following lines describing my query sequence:
> ----
>   <BlastOutput_query-ID>lcl|QUERY</BlastOutput_query-ID>
>   <BlastOutput_query-def>AF059581|INCLUSive|gene|47|1504|1|+|.|id SAHH  ;
> number 1  ; query &apos;AF059581 - SAHH&apos;;</BlastOutput_query-def>
>   <BlastOutput_query-len>701</BlastOutput_query-len>
> ----
> I guess the value in the 'BlastOutput_query-ID' field is set by the NCBI blast
> server, while the value in the 'BlastOutput_query-def' field matches the header
> of my query sequence, except that the quotes are changed to &apos;.
> Does anyone have a suggestion on how to get the full description line from the
> xml report?
>
>
> Gert

-- 
Jason Stajich
Duke University
jason at cgt.mc.duke.edu