[BioPython] import Standalone problems
Jacob Joseph
jmjoseph at andrew.cmu.edu
Thu Jul 20 20:05:14 UTC 2006
Great!
Can someone point me to the current maintainer of the Blast parsing package?
-Jacob
Rohini Damle wrote:
> Hi,
> I used hsp.evalue instead of hsp.expect and I am getting the desired
> output.
>
> Thank you very much for your help, efforts, and all those modified files.
> Rohini
>
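For anyone following the thread, here is a minimal sketch of the E-value filter discussed above. It is not the Biopython parser: the HSP class below is a hypothetical stand-in, and the idea that the unused attribute is simply left at None is an assumption the thread does not confirm. Under that assumption the symptom would make sense, because Python 2 orders None below any number, so a test like `hsp.expect < 22` would pass for every HSP (Python 3 raises a TypeError instead). The sketch skips unset values explicitly.

```python
class HSP:
    """Hypothetical stand-in for a parsed HSP; not the Biopython class."""

    def __init__(self, title, evalue=None, expect=None):
        self.title = title
        self.evalue = evalue  # assumed: the attribute the parser fills in
        self.expect = expect  # assumed: left unset (None) by the parser


def filter_hsps(hsps, threshold, attr="evalue"):
    """Return the HSPs whose E-value attribute is set and below threshold."""
    kept = []
    for hsp in hsps:
        value = getattr(hsp, attr, None)
        if value is None:
            # Skip unset values rather than comparing None to a number.
            continue
        if value < threshold:
            kept.append(hsp)
    return kept


hsps = [
    HSP("hit_a", evalue=1e-30),
    HSP("hit_b", evalue=5.0),
    HSP("hit_c", evalue=40.0),
]
E_VALUE_THRESH = 22

selected = [h.title for h in filter_hsps(hsps, E_VALUE_THRESH)]
print(selected)
```

Filtering on the unset attribute (`attr="expect"`) returns an empty list here, which makes a missing value visible instead of silently matching everything.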
> On 7/20/06, Rohini Damle <rohini.damle at gmail.com> wrote:
>> Hi,
>> I am now using your updated Record.py, NCBIXML.py and NCBIStandalone.py,
>> and I no longer get the previous error.
>> BUT I am still not getting the desired output.
>> Here is my code:
>>
>> blast_out = open("C:/Documents and Settings/rdamle/My Documents/Rohini's Documents/Blast Parsing/onlymouse4proteinblastout.xml", "r")
>>
>> b_parser = NCBIXML.BlastParser()
>> b_iterator = NCBIStandalone.Iterator(blast_out, b_parser)
>> E_VALUE_THRESH = 22
>>
>> for b_record in b_iterator:
>>     for alignment in b_record.alignments:
>>         for hsp in alignment.hsps:
>>             if hsp.expect < E_VALUE_THRESH:
>>                 print b_record.query.split()[0]
>>                 print '****Alignment****'
>>                 print 'sequence:', alignment.title.split()[0]
>>
>>
>> With this code I was expecting to get only the alignments with
>> hsp.expect < E_VALUE_THRESH,
>> BUT I AM GETTING ALL the alignments, not just the ones with evalue < 22.
>> -Rohini.
>>
>>
>>
>>
>>
>> On 7/20/06, Jacob Joseph <jmjoseph at andrew.cmu.edu> wrote:
>> > Hi. I suspect you are not using my updated Record.py. You'll notice
>> > that, at least for the moment, I have changed _blast.gap_penalties to an
>> > array to allow assignment per item without worrying about the order of
>> > entries within the xml file. There are other ways this could be
>> > accomplished while still using a tuple.
>> >
>> > -Jacob
>> >
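A minimal sketch of the trade-off described above, not the actual Record.py or NCBIXML.py code. The class and helper names are hypothetical; only gap_penalties and the use of index 0 for the gap-open value come from the traceback quoted below, and index 1 for gap-extend is an assumption. It shows why item assignment on a tuple fails, and one way to keep a tuple while still filling the slots in whatever order the XML tags arrive.

```python
class BlastRecordSketch:
    """Hypothetical stand-in for a record holding (gap_open, gap_extend)."""

    def __init__(self):
        self.gap_penalties = (None, None)


def set_gap_penalty(record, index, value):
    """Replace one slot of the tuple by building a new tuple."""
    penalties = list(record.gap_penalties)
    penalties[index] = int(value)
    record.gap_penalties = tuple(penalties)


record = BlastRecordSketch()

# Direct item assignment on a tuple raises TypeError.
try:
    record.gap_penalties[0] = 11
    tuple_assignment_failed = False
except TypeError:
    tuple_assignment_failed = True

# Rebuilding the tuple works in either order (extend first, then open).
set_gap_penalty(record, 1, "1")   # assumed slot for gap-extend
set_gap_penalty(record, 0, "11")  # slot 0 is gap-open, as in the traceback
print(record.gap_penalties)

# A list allows the same per-item assignment directly.
as_list = [None, None]
as_list[1] = 1
as_list[0] = 11
print(as_list)
```

The list version is shorter, but it changes the attribute's type for any caller that relied on a tuple; the rebuild helper avoids that at the cost of a small copy per assignment.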
>> > Rohini Damle wrote:
>> > > Hi,
>> > > When I tried your NCBIXML.py instead of the original one, I got the
>> > > following error message:
>> > >
>> > >   File "C:\Python24\lib\site-packages\Bio\Blast\NCBIXML.py", line 210, in _end_Parameters_gap_open
>> > >     self._blast.gap_penalties[0] = int(self._value)
>> > > TypeError: object does not support item assignment
>> > >
>> > > in the original version
>> > > we don't have that " [0] " in self._blast.gap_penalties
>> > >
>> > > what might be causing this error?
>> > > -Rohini
>> > >
>> > > On 7/19/06, Jacob Joseph <jmjoseph at andrew.cmu.edu> wrote:
>> > >> I do not believe the current version of the parser will work with
>> > >> multiple queries using recent versions of blast, regardless of the
>> > >> output format. I do know that blastall 2.2.13 XML output works with
>> > >> the parser corrections I attached previously. I have attached a further
>> > >> updated NCBIXML.py, fixing the performance issues in parse() that I
>> > >> mentioned.
>> > >>
>> > >> -Jacob
>> > >>
>> > >> Rohini Damle wrote:
>> > >> > Hi,
>> > >> > Can someone suggest which version of Blast Biopython's (text or
>> > >> > xml) parser works with?
>> > >> > I will download that blast version locally so I can use Biopython's
>> > >> > parser.
>> > >> > Thanks,
>> > >> > Rohini
>> > >> >
>> > >> > On 7/18/06, Jacob Joseph <jacob at jjoseph.org> wrote:
>> > >> >> Hi.
>> > >> >> I encountered similar difficulties over the past few days myself and
>> > >> >> have made some improvements to the XML parser. Well, that is, it now
>> > >> >> functions with blastall, but I have made no effort to parse the other
>> > >> >> blast programs. I do not expect I have done any harm to other
>> > >> >> parsing, however.
>> > >> >>
>> > >> >> Attached are Record.py, NCBIStandalone.py, and NCBIXML.py. I have not
>> > >> >> yet spent significant time to clean up my changes. Without getting
>> > >> >> into specific modifications, I have made an effort to make consistent
>> > >> >> the variables in Record and NCBIXML, focusing primarily on what I
>> > >> >> needed this week.
>> > >> >>
>> > >> >> One portion I am not settled on is the reinitialization of
>> > >> >> Record.Blast at every call to iterator.next(), and, by extension,
>> > >> >> BlastParser.parse(). See NCBIXML.py, line 114. Without
>> > >> >> re-initializing this class, we run the risk of retaining portions of
>> > >> >> a Record from previously parsed queries. This retention causes bug
>> > >> >> 1970, mentioned below. Unfortunately, this re-initialization exacts a
>> > >> >> significant performance penalty of at least a factor of 10 by some
>> > >> >> rough measures. I would appreciate any suggestions for improvement
>> > >> >> here.
>> > >> >>
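A minimal sketch of the problem described above, using a toy parser rather than NCBIXML. All class names are hypothetical stand-ins. Reusing one record object across queries lets alignments pile up (P1, then P1+P2, and so on), while creating a fresh record on each parse() call keeps the queries separate. It does not attempt to address the performance cost mentioned.

```python
class Record:
    """Hypothetical stand-in for Record.Blast."""

    def __init__(self):
        self.alignments = []


class ReusingParser:
    """Keeps one Record for its whole lifetime (the accumulating pattern)."""

    def __init__(self):
        self._record = Record()

    def parse(self, hits):
        self._record.alignments.extend(hits)
        return list(self._record.alignments)  # snapshot for inspection


class FreshParser:
    """Creates a new Record on every parse() call."""

    def parse(self, hits):
        record = Record()
        record.alignments.extend(hits)
        return list(record.alignments)


queries = [["P1"], ["P2"], ["P3"], ["P4"]]

reusing = ReusingParser()
reused_results = [reusing.parse(hits) for hits in queries]

fresh = FreshParser()
fresh_results = [fresh.parse(hits) for hits in queries]

print(reused_results[-1])
print(fresh_results[-1])
```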
>> > >> >> I do apologize for not being more specific about my changes. When I
>> > >> >> get a chance (next week?), I will package them up as a proper patch
>> > >> >> and file a bug. Perhaps what I have done so far will be of use until
>> > >> >> then.
>> > >> >>
>> > >> >> fyi, I have done all of my testing with Blast 2.2.13. 2.2.14 seems to
>> > >> >> not have separate <?xml> blocks within its output, requiring a
>> > >> >> different method of iteration.
>> > >> >>
>> > >> >> -Jacob
>> > >> >>
>> > >> >> Peter wrote:
>> > >> >> > Rohini Damle wrote:
>> > >> >> >> Hi,
>> > >> >> >> I have an XML file with 4 blast records (for proteins P1, P2, P3,
>> > >> >> >> P4).
>> > >> >> >> I am trying to extract alignment information for each of them.
>> > >> >> >> So I wrote the following code:
>> > >> >> >>
>> > >> >> >> for b_record in b_iterator:
>> > >> >> >>     E_VALUE_THRESH = 20
>> > >> >> >>     for alignment in b_record.alignments:
>> > >> >> >>         for hsp in alignment.hsps:
>> > >> >> >>             if hsp.expect < E_VALUE_THRESH:
>> > >> >> >>                 print '****Alignment****'
>> > >> >> >>                 print 'sequence:', alignment.title.split()[0]
>> > >> >> >>
>> > >> >> >> With this code, I am getting information for P1,
>> > >> >> >> then information for P1 + P2
>> > >> >> >> then for P1+P2 +P3
>> > >> >> >> and finally for P1+P2+P3+P4
>> > >> >> >> Why is this so?
>> > >> >> >> Is there something wrong with the looping?
>> > >> >> >
>> > >> >> > I'm aware of something funny with the XML parsing, Bug 1970, which
>> > >> >> > might well be the same issue:
>> > >> >> >
>> > >> >> > http://bugzilla.open-bio.org/show_bug.cgi?id=1970
>> > >> >> >
>> > >> >> > I confess I haven't looked into exactly what is going wrong here -
>> > >> >> > too many other demands on my time to learn about XML and how
>> > >> >> > BioPython parses it.
>> > >> >> >
>> > >> >> > Does the workaround on the bug report help? Depending on which
>> > >> >> > version of standalone blast you have installed, you might have
>> > >> >> > better luck with plain text output - the trouble is this is a
>> > >> >> > moving target and the NCBI keeps tweaking it.
>> > >> >> >
>> > >> >> > Peter
>> > >> >> >
>> > >> >> > _______________________________________________
>> > >> >> > BioPython mailing list - BioPython at lists.open-bio.org
>> > >> >> > http://lists.open-bio.org/mailman/listinfo/biopython
>> >
>>