[Bioperl-l] remoteblast xml problem
Hubert Prielinger
hubert.prielinger at gmx.at
Mon Jun 5 18:17:53 UTC 2006
hi,
you were right, removing the composition-based statistics solved the
problem. Now I get the result printed to STDOUT, but it isn't saved to
the output file.
I have tried reopening the file and writing it to another file
again, but that doesn't work either.
The strange thing is that if I retrieve text instead of XML output, it
works without any problem. I don't know why.
Hubert
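
A minimal sketch of parsing a saved BLAST XML report, built on the SearchIO loop quoted below. The file name blast_result.xml and the choice of fields to print are placeholders rather than anything from the script attached to the bug, and it assumes BioPerl with XML::SAX installed:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SearchIO;

# Placeholder path: a BLAST XML report that was already saved to disk.
my $file = @ARGV ? $ARGV[0] : 'blast_result.xml';

# Only the -format value differs from parsing a plain-text report.
my $in = Bio::SearchIO->new(-file => $file, -format => 'blastxml');

while (my $result = $in->next_result) {
    # Print both name and description of the query, since the name
    # alone may not be what you expect for XML reports.
    printf "Query: %s (%s)\n",
        $result->query_name, $result->query_description;

    while (my $hit = $result->next_hit) {
        while (my $hsp = $hit->next_hsp) {
            # One tab-separated line per HSP: hit name, E-value, % identity.
            printf "%s\t%s\t%.1f\n",
                $hit->name, $hsp->evalue, $hsp->percent_identity;
        }
    }
}
```

Run it as `perl parse_blastxml.pl some_report.xml`; the script name is likewise just an example.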
Chris Fields wrote:
> On Jun 2, 2006, at 8:36 PM, Hubert Prielinger wrote:
>
>
>> hi chris,
>> thanks, but I never intended to run remoteblast with so many sequences,
>> only a few of them. Actually, my goal is to run phiblast with a
>> regular expression, so I won't need that
>> file anymore
>>
>
> Not a problem. Just to let you know, I did manage to get the script
> working, so I'm marking the bug INVALID. I think the problem isn't
> that there is an infinite loop so much as setting composition-based
> statistics causes the search to take much much longer; try removing
> that line to see what I mean.
>
> Just so you know, using $result->query_name doesn't get you what you
> would expect (it gives you a part of the RID, which you don't want;
> this is something in the XML output that is beyond our control). You
> might want to change it to something else or you'll get filenames
> with numerical names.
>
>
>> another question about parsing the xml output: is there an xml
>> parser available for blast xml output, or how do I start?
>> I have looked at the wiki and at cpan for Bio::SearchIO::blastxml,
>> but I'm not sure how to start....sorry, I guess I'm too stupid....
>> is there maybe another introduction or an example?
>>
>
> Bio::SearchIO objects are used to parse BLAST XML output if you have
> it saved to a file. For instance:
>
> use Bio::SearchIO;
>
> my $factory = Bio::SearchIO->new(-file => $file, -format => 'blastxml');
>
> while (my $result = $factory->next_result) {
>     while (my $hit = $result->next_hit) {
>         while (my $hsp = $hit->next_hsp) {
>             # do stuff here
>         }
>     }
> }
>
> The only thing that changes between parsing a text BLAST report and an
> XML BLAST report is the -format argument (similar to the -readmethod
> parameter in RemoteBlast). You shouldn't need to look up any more
> documentation other than these pages on the wiki:
>
> http://www.bioperl.org/wiki/HOWTO:SearchIO
>
> http://www.bioperl.org/wiki/Module:Bio::SearchIO
>
> http://www.bioperl.org/wiki/Module:Bio::SearchIO::blastxml
>
> Note that you'll need to install XML::SAX (CPAN), and
> that XML::SAX::ExpatXS (and Expat) is highly recommended for speeding
> up parsing.
>
> Chris
>
>
>> thanks
>> Hubert
>>
>>
>> Chris Fields wrote:
>>
>>> Yes, I see the same error you do. But I have a similar script
>>> (blastp, XML blast report, XML parsing, similar loop structure)
>>> that works fine. I'm trying to dissect the problem but I think
>>> it may be something logically wrong here (something not so
>>> obvious) and not a bug...
>>>
>>> What I'm trying to say is, when you send sequences using
>>> remoteblast like this, you are essentially spamming the NCBI
>>> BLAST server with ~1600 requests. This script wasn't set up with
>>> that intent in mind; you should really try to set up your own
>>> local blast database if possible. If you can't, try running this
>>> script in off-hours (10pm-6am EST or something like that).
>>>
>>>
>>> Chris
>>>
>>> On Jun 2, 2006, at 7:49 PM, Hubert Prielinger wrote:
>>>
>>>
>>>
>>>> hi,
>>>> input database: swissprot
>>>> matrix: pam30
>>>> count: 1
>>>> gapcosts: 9 1
>>>>
>>>> I know that there are a lot of sequences, but that doesn't
>>>> matter; you can delete all of them except one. The number of
>>>> sequences is not the problem: the script reads one line and
>>>> submits it, then the second line, and so on. I have also tried
>>>> it with only one sequence and got the same result.
>>>> The script ran that time for more than 20
>>>> minutes, and that should be enough time to retrieve
>>>> the results for ONE sequence, I guess
>>>>
>>>> regards
>>>> Hubert
>>>>
>>>>
>>>>
>>>> Chris Fields wrote:
>>>>
>>>>
>>>>> You need to add the input conditions as well (you have several
>>>>> <STDIN> lines which may play a role; I would like to know what
>>>>> you normally enter for those).
>>>>>
>>>>> How long did you let the script run? I ran a quick check on
>>>>> your sequences; you have almost 1600, so you have to expect
>>>>> that you'll run into some problems here! Most here (including
>>>>> me) would suggest you try installing a local blast setup for
>>>>> something like this.
>>>>>
>>>>> Chris
>>>>>
>>>>> On Jun 2, 2006, at 6:19 PM, Hubert Prielinger wrote:
>>>>>
>>>>>
>>>>>
>>>>>> hi,
>>>>>> I have submitted the bug -> Bug 2017
>>>>>> with the script and input file, just start it from command line
>>>>>>
>>>>>> thank you very much
>>>>>> greetings
>>>>>>
>>>>>> Hubert
>>>>>>
>>>>>> Chris Fields wrote:
>>>>>>
>>>>>>
>>>>>>> Hubert,
>>>>>>>
>>>>>>> I have a script that's using blastxml and XML output which
>>>>>>> seems to work.
>>>>>>> I'll try looking at it to get a better idea this weekend.
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Hubert Prielinger
>>>>>>>> Sent: Friday, June 02, 2006 4:12 PM
>>>>>>>> To: Chris Fields; bioperl-l at bioperl.org; Chris Fields; 'Sendu Bala'
>>>>>>>> Subject: Re: [Bioperl-l] remoteblast xml problem
>>>>>>>>
>>>>>>>> hi,
>>>>>>>> sorry, but I have updated the remoteblast module and I have
>>>>>>>> run several
>>>>>>>> attempts with the same results as before. It didn't work.
>>>>>>>> I didn't get any results.
>>>>>>>>
>>>>>>>> regards
>>>>>>>> Hubert
>>>>>>>>
>>>>>>>>
>>>>>>>> Chris Fields wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Sendu, Hubert,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hubert, your code looks fine so Sendu's patch should fix
>>>>>>>>> the problem (break out of that infinite loop). I applied
>>>>>>>>> Sendu's patch to RemoteBlast in CVS; it passed all tests in
>>>>>>>>> RemoteBlast.t. Try updating from CVS to see if it works.
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of Sendu Bala
>>>>>>>>>> Sent: Friday, June 02, 2006 4:04 AM
>>>>>>>>>> To: bioperl-l at lists.open-bio.org
>>>>>>>>>> Subject: Re: [Bioperl-l] remoteblast xml problem
>>>>>>>>>>
>>>>>>>>>> Hubert Prielinger wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> hi,
>>>>>>>>>>> I have the following program, and it worked quite well
>>>>>>>>>>> for retrieving remoteblast results in a text file.
>>>>>>>>>>> Now I have altered it to xml, and it doesn't work
>>>>>>>>>>> anymore.....
>>>>>>>>>>> it takes all the parameters at the command line, submits
>>>>>>>>>>> the query, but I don't retrieve any results file anymore.....
>>>>>>>>>>>
>>>>>>>>>>> it seems that it hangs in an endless loop......
>>>>>>>>>>> the only output I get is: $rc is not a ref! over and
>>>>>>>>>>> over..... it
>>>>>>>>>>> doesn't enter the else branch anymore....
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> There is no problem with your code. The problem is with
>>>>>>>>>> the NCBI server
>>>>>>>>>> and should be reported to them. You can visit the site and
>>>>>>>>>> do a blast,
>>>>>>>>>> requesting xml format, and you will typically get one
>>>>>>>>>> normal 'waiting'
>>>>>>>>>> message and the promise that it will be updated in x
>>>>>>>>>> seconds, but
>>>>>>>>>> subsequent attempts to get progress information result in
>>>>>>>>>> an xml error
>>>>>>>>>> page because the NCBI server doesn't actually send any data.
>>>>>>>>>>
>>>>>>>>>> Unfortunately the way that the bioperl code is written, it
>>>>>>>>>> treats no
>>>>>>>>>> data as 'waiting' instead of an error. I've offered a
>>>>>>>>>> patch to fix this
>>>>>>>>>> at this bug page:
>>>>>>>>>> http://bugzilla.bioperl.org/show_bug.cgi?id=2015
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Bioperl-l mailing list
>>>>>>>>>> Bioperl-l at lists.open-bio.org
>>>>>>>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>> Christopher Fields
>>>>> Postdoctoral Researcher
>>>>> Lab of Dr. Robert Switzer
>>>>> Dept of Biochemistry
>>>>> University of Illinois Urbana-Champaign