[Bioperl-l] XML output from RemoteBlast

Chris Fields cjfields at uiuc.edu
Thu Jan 19 17:53:01 UTC 2006


Jason, 

Nope.  No go.  I thought Nagesh might have found the problem with the $size
parameter (maybe the XML-formatted output was > 1000), but there is no $size
variable now.  RemoteBlast.pm was changed ~fall 2005 (by you, I believe) to
fix bug 1864 (http://bugzilla.bioperl.org/show_bug.cgi?id=1864), so that
change is post-1.5.1.  I'm using a recent PPM build of bioperl-live.  As
reported before, my script worked up until very recently (within the last
week), but at that point I was parsing text output and using
'-readmethod'=>'SearchIO' or 'blast' in the parameters list.

My script uses a local sequence file (FASTA) in a BLASTP search against
'nr'.  When FORMAT_TYPE is set to 'Text' with SearchIO as the readmethod,
everything works fine and I get saved output; switching to
'-readmethod'=>'xml' and FORMAT_TYPE to XML gives a blank file.  The
-verbose switch is on, so I can switch FORMAT_TYPE to any of the accepted
parameter settings (HTML, Text, ASN.1, XML) and I see the corresponding
output style sent to stdout along with the warnings from the NCBI queue.
However, nothing besides text output will save, which suggests something is
going wrong in retrieve_blast() in RemoteBlast.pm.  Strangely, the file
name, which is derived from query_name, does not pick up the query name
sent, but a chunk of the RID!  BTW, it only does this with XML output; the
query_name from text output is as expected.  Changing $filename to
temp.blastp (commented out below) doesn't do the trick; it's still an empty
file.  I have also tried an older version of this script on Mac OS X and had
the same problem with XML output, while text output saves fine, so I don't
think this is the OS.

Here are the saved file names (using XML output) and their RIDs (no point in
sending the file contents, they were all blank). These were all using the
same query sequence; I noticed that the file names were different each time
and thought of the RID.

1_20910.blastp
  ^^^^^
1137691949-20910-102543092805.BLASTQ4
           ^^^^^

1_25245.blastp
  ^^^^^
1137692051-25245-128580015999.BLASTQ1
           ^^^^^

1_21057.blastp
  ^^^^^
1137692263-21057-148127371984.BLASTQ4
           ^^^^^

Is the RID jamming up the works somehow?
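
The match can be checked mechanically. The following is only a minimal
sketch: it splits each RID listed above on '-' and compares the second field
with the digits in the saved file name. The helper name rid_middle_field is
made up for illustration and is not part of RemoteBlast.pm, and reading the
RID as dash-separated fields is an assumption based on these three examples.

```perl
#!perl
use strict;
use warnings;

# File name / RID pairs copied from the three runs listed above.
my %rid_for = (
    '1_20910.blastp' => '1137691949-20910-102543092805.BLASTQ4',
    '1_25245.blastp' => '1137692051-25245-128580015999.BLASTQ1',
    '1_21057.blastp' => '1137692263-21057-148127371984.BLASTQ4',
);

# Hypothetical helper: return the second dash-separated field of an RID.
sub rid_middle_field {
    my ($rid) = @_;
    my @fields = split /-/, $rid;
    return $fields[1];
}

foreach my $file ( sort keys %rid_for ) {
    my $middle = rid_middle_field( $rid_for{$file} );
    # Digits between the underscore and the .blastp suffix of the file name
    my ($suffix) = $file =~ /_(\d+)\.blastp$/;
    my $verdict = ( defined $suffix && $suffix eq $middle )
        ? 'matches the RID field'
        : 'differs from the RID field';
    print "$file: $verdict\n";
}
```

For the three pairs above, the digits in each file name are the same as the
second field of the corresponding RID.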

Following is the script (sorry if it's a bit clunky)
_______________________________________________________________________________
#!perl

use strict;
use Bio::Tools::Run::RemoteBlast;

# $v is just to turn on and off the messages
my $v = 1;

# changing or modifying parameters for blast search
my $prog = 'blastp';
my $db = 'nr';
my $e_val = '0.1';
my @params = (
		'-verbose' => $v,
		'-prog' => $prog,
		'-data' => $db,
		'-expect' => $e_val,
		'-readmethod' => 'xml'
		); 

# remove filter
delete $Bio::Tools::Run::RemoteBlast::HEADER{'FILTER'};

# change cgi parameters for blast results
# DESCRIPTIONS and ALIGNMENTS need to be changed in both the HEADER 
# and RETRIEVALHEADER hashes
$Bio::Tools::Run::RemoteBlast::RETRIEVALHEADER{'FORMAT_TYPE'} = 'XML';

# init new BLAST factory
my $factory = Bio::Tools::Run::RemoteBlast->new(@params);

print "Starting blast search ...\n";
# submit blast query
my $r = $factory->submit_blast('m_smeg_pyrR.txt');
print STDERR "waiting..." if( $v > 0 );
while ( my @rids = $factory->each_rid ) {
	foreach my $rid ( @rids ) {
		my $rc = $factory->retrieve_blast($rid);
		# if RID is not present
		if( !ref($rc) ) {
			# remove if RID is bad (error)
			if( $rc < 0 ) {
				$factory->remove_rid($rid);
			}
			# otherwise, query is still in progress; continue the loop,
			# printing output if requested
			print STDERR "." if ( $v > 0 ); 
			sleep 2; 
		} else { # RID is returned
			# save the output
			print $rid;
			my $result = $rc->next_result();
			my $filename= $result->query_name.".blastp";
			#my $filename= "temp.blastp";
			$factory->save_output($filename);
			# remove RID from list
			$factory->remove_rid($rid);
		}
	}
}
_______________________________________________________________________________
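
To tell a blank saved file apart from one that merely fails to parse, a
check along these lines can be run on the saved reports. It is a sketch
using only core Perl; describe_report is a hypothetical helper (not a
BioPerl method), and it assumes that a BLAST XML report begins with an
<?xml ...?> declaration.

```perl
#!perl
use strict;
use warnings;

# Hypothetical helper: classify a saved report without parsing it.
sub describe_report {
    my ($path) = @_;
    return 'missing' unless -e $path;    # file was never written
    return 'empty'   unless -s $path;    # file exists but has zero bytes
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $first = <$fh>;
    close $fh;
    # Assumption: XML reports start with an XML declaration
    return ( defined $first && $first =~ /^\s*<\?xml/ ) ? 'xml' : 'other';
}

# Usage: perl check_reports.pl 1_20910.blastp temp.blastp
print "$_: ", describe_report($_), "\n" for @ARGV;
```

An 'empty' result would mean nothing reached the file at all, while 'other'
would mean something was written but not in the expected format.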
I may switch to the blast client from NCBI for now, but I would like to keep
RemoteBlast.pm going somehow unless it's completely unfeasible.  I'm still
a bit green when it comes to object-oriented programming (I am primarily a
molecular biologist with programming experience) and I'm still trying to
wrap my head around some bioperl objects and their methods (though I'm
catching on slowly).

Christopher Fields
Postdoctoral Researcher - Switzer Lab
Dept. of Biochemistry
University of Illinois Urbana-Champaign 

> -----Original Message-----
> From: Jason Stajich [mailto:jason.stajich at duke.edu]
> Sent: Wednesday, January 18, 2006 10:23 PM
> To: Chris Fields
> Subject: Re: [Bioperl-l] XML output from RemoteBlast
> 
> This doesn't work for you?
> http://bioperl.open-bio.org/news/2005/11/06/getting-blastxml-using-remoteblast/
> On Jan 18, 2006, at 11:04 PM, Chris Fields wrote:
> 
> > Is there any known way to save XML-formatted BLAST queries from
> > RemoteBlast?  Changing the FORMAT_TYPE in the retrieval header to
> > anything other than 'Text' gives a blank output file.
> >
> > Christopher Fields
> > Postdoctoral Researcher
> > Lab of Dr. Robert Switzer
> > Dept of Biochemistry
> > University of Illinois Urbana-Champaign
> >
> >
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> 
> --
> Jason Stajich
> Duke University
> http://www.duke.edu/~jes12



