[Bioperl-l] Need help: RemoteBlast Problem

Jason Stajich jason.stajich at duke.edu
Fri Nov 4 10:02:35 EST 2005


You can request the XML format and it should work fine.

my $remote_blastxml = Bio::Tools::Run::RemoteBlast->new
      ('-verbose'    => $v,
       '-prog'       => $prog,
       '-data'       => $db,
       '-readmethod' => 'xml',  # tells the parser to use the blastxml format
       '-expect'     => $e_val,
       );
# this tells NCBI to send you XML back
$remote_blastxml->retrieve_parameter('FORMAT_TYPE', 'XML');

The test file t/RemoteBlast.t contains code that does this, as an
example.
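
For anyone who wants the whole round trip in one place, here is a rough
sketch of submitting a query and reading the XML results back. It follows
the usual submit/poll/retrieve loop from the RemoteBlast documentation;
the two sub names (xml_blast_params, run_remote_blast_xml) and the
5-second polling interval are my own assumptions for illustration, not
part of BioPerl, so treat it as a starting point rather than tested code.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): builds the constructor
# arguments used in this thread, always forcing the XML reader.
sub xml_blast_params {
    my ($prog, $db, $e_val, $verbose) = @_;
    return (
        '-verbose'    => $verbose || 0,
        '-prog'       => $prog,
        '-data'       => $db,
        '-readmethod' => 'xml',     # parse replies with the blastxml parser
        '-expect'     => $e_val,
    );
}

# Hypothetical wrapper: submit the sequences in $fasta_file, poll NCBI
# until every request ID has been answered, and print a short summary.
sub run_remote_blast_xml {
    my ($fasta_file, @params) = @_;
    require Bio::Tools::Run::RemoteBlast;   # needs BioPerl installed

    my $factory = Bio::Tools::Run::RemoteBlast->new(@params);
    $factory->retrieve_parameter('FORMAT_TYPE', 'XML');  # ask NCBI for XML
    $factory->submit_blast($fasta_file);

    while ( my @rids = $factory->each_rid ) {
        foreach my $rid (@rids) {
            my $rc = $factory->retrieve_blast($rid);
            if ( !ref $rc ) {
                # a negative code means the request failed; 0 means "not ready"
                $factory->remove_rid($rid) if $rc < 0;
                sleep 5;    # assumed polling interval
                next;
            }
            $factory->remove_rid($rid);
            while ( my $result = $rc->next_result ) {
                print "Query: ", $result->query_name, "\n";
                while ( my $hit = $result->next_hit ) {
                    print "  hit: ", $hit->name, "\n";
                    while ( my $hsp = $hit->next_hsp ) {
                        print "    e-value: ", $hsp->evalue, "\n";
                    }
                }
            }
        }
    }
    return;
}
```

A call would look something like
run_remote_blast_xml('query.fa', xml_blast_params('blastn', 'nr', '1e-10'));
where query.fa is a placeholder file name. Since both the request and the
parser use XML here, the "Features flanking" lines in the text report
should never reach the parser.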


As I think I posted to the mailing list when we were making these
changes to support more of the API (in Aug or Sept, I think), NCBI
has said that they reserve the right to make the HTML & text output
from the CGI unparseable so they can insert all the fancy links.
When/if this happens, we will need to disable HTML and plain-text
parsing altogether in this module and force everything to use XML.
Someone else can do this; I do not intend to maintain this module
(I've been saying this for years and yet I'm still working on it...),
as I don't use it and people can use blastcl3 to achieve the same
things.

Can someone help out and make sure this documentation and information  
makes it to the RemoteBlast module and the FAQ?

-jason

On Nov 4, 2005, at 6:43 AM, Stefan Wächter wrote:

> Hi,
>
> I wrote a Perl script a year ago. It uses a few bioperl modules, one of
> them being RemoteBlast. This script worked fine until the beginning of
> September :-) .
> Reading the articles in this newsgroup, I found out that some changes
> had happened at NCBI. OK.
> So I installed bioperl-1.5.1 yesterday and ran the script. At first it
> seemed to work fine, but suddenly it broke with this message:
>
> ------------- EXCEPTION  -----------
> MSG: no data for midline  Features flanking this part of subject sequence:
> STACK Bio::SearchIO::blast::next_result
> /usr/lib/perl5/site_perl/5.8.6/Bio/SearchIO/blast.pm:1172
> STACK main::RemoteBlast ./cDNAComparer.pl:1223
> STACK main::runAnalyse ./cDNAComparer.pl:373
> STACK toplevel ./cDNAComparer.pl:1411
>
> --------------------------------------
>
> When I BLAST the sequence directly at NCBI, I find in the result page
> the line that seems to cause this break:
>
>
>> gi|61216116|ref|NG_001019.4| <http://www.ncbi.nlm.nih.gov/entrez/ 
>> query.fcgi? 
>> cmd=Retrieve&db=Nucleotide&list_uids=61216116&dopt=GenBank> Geo  
>> <http://www.ncbi.nlm.nih.gov/entrez/query.fcgi? 
>> db=geo&term=61216116%5Bgi%5D>Download subject sequence spanning  
>> the HSP <http://www.ncbi.nlm.nih.gov/blast/dumpgnl.cgi? 
>> db=nr&na=1&gnl=ref%7CNG_001019.4% 
>> 7C&gi=61216116&RID=1131096982-21134-187861614975.BLASTQ3&QUERY_NUMBER 
>> =1&segs=1057787-1058431,1159074-1159718,1085947-1086591,1182734-11833 
>> 78,1201786-1202430,1057611-1057679,1085771-1085839,1182558-1182626,12 
>> 01610-1201678,1158898-1158963> Homo sapiens immunoglobulin heavy  
>> locus (IGH@) on chromosome
>>
>             14
>           Length=1279711
>
>  Features flanking this part of subject sequence:
>    498 bp at 5' side: immunoglobulin heavy constant gamma 3 (G3m  
> marker), membr... <http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi? 
> val=61216116&db=Nucleotide&from=1051799&to=1057290&view=gbwithparts>
>    21702 bp at 3' side: CDS <http://www.ncbi.nlm.nih.gov/entrez/ 
> viewer.fcgi? 
> val=61216116&db=Nucleotide&from=1080134&to=1081731&view=gbwithparts>
>
>  Score = 1235 bits (623),  Expect = 0.0
>  Identities = 641/646 (99%), Gaps = 1/646 (0%)
>  Strand=Plus/Plus
>
> Query  69        GCCCGCAGCCAGCCAGCCTCCATTCCGGGCACTCCCGTGAACTCCTGACATGAGGAATGA  128
>                  ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> Sbjct  1057788   GCCCGCAGCCAGCCAGCCTCCATTCCGGGCACTCCCGTGAACTCCTGACATGAGGAATGA  1057847
>
> Query  129       GGTTGTTCTGATTTCAAGCAAAGAACGCTGCTCTCTGGCTCCTGGGAACAGTCTCGGTGC  188
>                  ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> Sbjct  1057848   GGTTGTTCTGATTTCAAGCAAAGAACGCTGCTCTCTGGCTCCTGGGAACAGTCTCGGTGC  1057907
>
> Query  189       
> CAGCACCACCCCTTGGCTGCCTGCCTACACNTGCTGGATTCTCGGGTGGAACTCGACCCG  248
>
>
> What can I do to fix this problem? Any ideas?
>
> Cheers
> Stefan
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>

--
Jason Stajich
Duke University
http://www.duke.edu/~jes12




