[Bioperl-l] XML BLAST parsing & accessions
T.D. Houfek
tdhoufek@unity.ncsu.edu
Wed, 19 Jun 2002 20:16:30 -0400 (EDT)
A few days ago I decided to rewrite a portion of a batch BLASTing system
I'm working on so that it parses its (XML) reports using BioPerl 1.0
instead of my own home-grown parser. Specifically (in case there's a
better way of going about this), I am creating a Bio::SearchIO object
from a filehandle to an XML report:
    my $searchio = Bio::SearchIO->new(-tempfile => 1,
                                      -format   => 'blastxml',
                                      -fh       => $blastReport);
I then call $searchio->next_result() to get a Result object,
whose ->next_hit() method coughs up Hit objects, which in turn cough up
HSP objects with ->next_hsp().
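For reference, the traversal amounts to roughly the following sketch. The sub name summarize_report and the fields it prints are only illustrative; the SearchIO calls are the ones described above, and it assumes BioPerl is installed when the sub is actually called.

```perl
use strict;
use warnings;

# Illustrative helper (hypothetical name): walk a BLAST XML report
# from an open filehandle and print one line per HSP.
sub summarize_report {
    my ($fh) = @_;

    # Loaded at run time so this file still compiles where BioPerl is absent.
    require Bio::SearchIO;

    my $searchio = Bio::SearchIO->new(-tempfile => 1,
                                      -format   => 'blastxml',
                                      -fh       => $fh);

    # Result -> Hit -> HSP, each level exhausted with its next_* iterator.
    while (my $result = $searchio->next_result) {
        while (my $hit = $result->next_hit) {
            while (my $hsp = $hit->next_hsp) {
                print join("\t", $hit->name, $hsp->evalue), "\n";
            }
        }
    }
    return;
}
```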
And it all is working beautifully, I must say. The only problem I have
noticed, and it is kind of a problem, is that neither the Result object's
->query_name nor its ->query_accession method is returning anything for
me. I'm working with FASTA headers that look like this:
>gnl|NCSU_FGL.blast|03E20.Contig1 M. grisea project xsal BAC03E20 Contig 1
and I'm trying to get the first part of the header out of the
corresponding BLAST report, i.e.
gnl|NCSU_FGL.blast|03E20.Contig1
I would have expected either ->query_name or ->query_accession to return
this. Have I violated a Bioperl expectation about header information
format? (This format doesn't prevent the information from appearing in
the XML reports themselves).
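
To be concrete about which part I mean: the identifier I'm after is simply everything between the '>' and the first whitespace. A minimal plain-Perl sketch of that (no BioPerl involved; the sub name header_id is just illustrative, and this says nothing about how BioPerl itself splits the header):

```perl
use strict;
use warnings;

# Illustrative helper (hypothetical name): return the identifier portion of a
# FASTA header, i.e. everything after an optional '>' up to the first whitespace.
sub header_id {
    my ($header) = @_;
    my ($id) = $header =~ /^>?\s*(\S+)/;
    return $id;
}

my $header = '>gnl|NCSU_FGL.blast|03E20.Contig1 M. grisea project xsal BAC03E20 Contig 1';
print header_id($header), "\n";   # prints gnl|NCSU_FGL.blast|03E20.Contig1
```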
I appreciate any help you can give me,
TD
T.D. Houfek
system administrator
Fungal Genomics Laboratory
Center for Integrated Fungal Research (CIFR)
North Carolina State University
ph: (919)513-0025 e: tdhoufek@unity.ncsu.edu