[Bioperl-l] XML BLAST parsing & accessions

T.D. Houfek tdhoufek@unity.ncsu.edu
Wed, 19 Jun 2002 20:16:30 -0400 (EDT)


A few days ago I decided to rewrite a portion of a batch BLASTing system
I'm working on so that it parses its (XML) reports using BioPerl (1.0)
instead of my own home-grown parser.  Specifically (in case there's a
whole other way of going about this), I am creating a Bio::SearchIO
object from a filehandle to an XML report:

  my $searchio = Bio::SearchIO->new(-tempfile => 1,
                                    -format   => 'blastxml',
                                    -fh       => $blastReport);

then $searchio->next_result() to get a Result object,
whose ->next_hit() method coughs up Hit objects, which in turn cough up
HSP objects with ->next_hsp().
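
For reference, the traversal boils down to three nested loops.  This is
only a sketch of what I'm doing, not my actual code: the sub name
collect_hsps is made up for illustration, and I'm assuming the ->name
and ->evalue accessors on the Hit and HSP objects just to have something
to collect.

```perl
use strict;
use warnings;

# Illustrative helper (collect_hsps is a made-up name, not a BioPerl call).
# Walks a Bio::SearchIO-style stream and collects one row per HSP.
# Assumes the next_result/next_hit/next_hsp iterators described above,
# plus the query_name, name and evalue accessors.
sub collect_hsps {
    my ($searchio) = @_;
    my @rows;
    while ( my $result = $searchio->next_result ) {
        while ( my $hit = $result->next_hit ) {
            while ( my $hsp = $hit->next_hsp ) {
                # scalar() keeps the columns aligned even if an
                # accessor returns nothing
                push @rows, [ scalar $result->query_name,
                              scalar $hit->name,
                              scalar $hsp->evalue ];
            }
        }
    }
    return @rows;
}
```

It would be called as collect_hsps($searchio) on the object constructed
above.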

And it is all working beautifully, I must say.  The only problem I have
noticed, and it is a real one for my purposes, is that neither the Result
object's ->query_name nor its ->query_accession method is returning
anything for me.  I'm working with FASTA headers that look like this:

>gnl|NCSU_FGL.blast|03E20.Contig1  M. grisea project xsal BAC03E20 Contig 1

and I'm trying to get the first part of the header out of the
corresponding BLAST report, i.e.

gnl|NCSU_FGL.blast|03E20.Contig1

I would have expected either ->query_name or ->query_accession to return
this.  Have I violated a Bioperl expectation about header information
format?  (This format doesn't prevent the information from appearing in
the XML reports themselves).
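
To be concrete about what I'm after: the identifier is just the first
whitespace-delimited token of the header, minus the leading '>'.  In
plain Perl, with no BioPerl involved, that would be roughly the
following (fasta_id is a made-up name for illustration):

```perl
use strict;
use warnings;

# Illustrative helper (not part of BioPerl): return the identifier from a
# FASTA header line, i.e. everything after an optional leading '>' up to
# the first whitespace.  Returns undef if the line has no such token.
sub fasta_id {
    my ($header) = @_;
    return $header =~ /^>?\s*(\S+)/ ? $1 : undef;
}

my $header = '>gnl|NCSU_FGL.blast|03E20.Contig1  M. grisea project xsal BAC03E20 Contig 1';
print fasta_id($header), "\n";    # prints gnl|NCSU_FGL.blast|03E20.Contig1
```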

I appreciate any help you can give me,
TD


T.D. Houfek

system administrator
Fungal Genomics Laboratory
Center for Integrated Fungal Research (CIFR)
North Carolina State University
ph: (919)513-0025  e: tdhoufek@unity.ncsu.edu