[BioPython] import Standalone problems

Rohini Damle rohini.damle at gmail.com
Thu Jul 20 18:59:52 UTC 2006


Hi,
I have now switched to your updated Record.py, NCBIXML.py and NCBIStandalone.py
(all three updated), and I no longer get the previous error.
BUT I am still not getting the desired output ...
Here is my code:

from Bio.Blast import NCBIStandalone, NCBIXML

# Path re-joined from the line-wrapped original.
blast_out = open("C:/Documents and Settings/rdamle/My Documents/Rohini's Documents/Blast Parsing/onlymouse4proteinblastout.xml", "r")

b_parser = NCBIXML.BlastParser()
b_iterator = NCBIStandalone.Iterator(blast_out, b_parser)
E_VALUE_THRESH = 22

for b_record in b_iterator:
    for alignment in b_record.alignments:
        for hsp in alignment.hsps:
            # Report only HSPs whose e-value is below the threshold.
            if hsp.expect < E_VALUE_THRESH:
                print b_record.query.split()[0]
                print '****Alignment****'
                print 'sequence:', alignment.title.split()[0]


With this code I expected to get only the alignments with
hsp.expect < E_VALUE_THRESH.

BUT I AM GETTING ALL the alignments, not just the ones with an e-value < 22.
-Rohini.
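
A note on the threshold, offered as a possible explanation rather than something confirmed in this thread: standalone BLAST only reports hits up to its own expectation cutoff (the -e option of blastall, which defaults to 10), so with E_VALUE_THRESH = 22 every reported HSP may already pass the test. The sketch below uses hypothetical stand-in tuples (Hsp, Alignment) instead of real Biopython records, purely to show how the filter behaves at different thresholds. It reports once per alignment rather than once per HSP, and it is written for current Python rather than the Python 2.4 used above.

```python
from collections import namedtuple

# Hypothetical stand-ins for Biopython's record objects (illustration only).
Hsp = namedtuple("Hsp", ["expect"])
Alignment = namedtuple("Alignment", ["title", "hsps"])


def titles_below(alignments, threshold):
    """Return the first title word of each alignment with an HSP under threshold."""
    hits = []
    for alignment in alignments:
        if any(hsp.expect < threshold for hsp in alignment.hsps):
            hits.append(alignment.title.split()[0])
    return hits


alignments = [
    Alignment("gi|1 strong hit", [Hsp(1e-50)]),
    Alignment("gi|2 moderate hit", [Hsp(0.003)]),
    Alignment("gi|3 weak hit", [Hsp(8.5)]),
]

print(titles_below(alignments, 22))    # every hit passes a threshold of 22
print(titles_below(alignments, 0.01))  # only the two stronger hits remain
```

If this is what is happening, lowering E_VALUE_THRESH well below 10 should shrink the output.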





On 7/20/06, Jacob Joseph <jmjoseph at andrew.cmu.edu> wrote:
> Hi.  I suspect you are not using my updated Record.py.   You'll notice
> that, at least for the moment, I have changed _blast.gap_penalties to an
> array to allow assignment per item without worrying about the order of
> entries within the xml file.  There are other ways this could be
> accomplished while still using a tuple.
>
> -Jacob
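
The TypeError quoted below is what Python raises when code assigns to one element of a tuple. Here is a minimal sketch, using a hypothetical BlastRecord class in place of the real Record.Blast, of why a list allows per-item assignment and of one way a tuple could still be kept. The penalty values 11 and 1 are illustrative only.

```python
class BlastRecord:
    """Hypothetical stand-in for Record.Blast (illustration only)."""

    def __init__(self, gap_penalties):
        self.gap_penalties = gap_penalties


# With a tuple, item assignment fails, as in the reported traceback.
as_tuple = BlastRecord((None, None))
try:
    as_tuple.gap_penalties[0] = 11
    tuple_assignment_failed = False
except TypeError:
    tuple_assignment_failed = True

# With a list, gap-open and gap-extend can arrive in either order.
as_list = BlastRecord([None, None])
as_list.gap_penalties[1] = 1   # gap extend seen first
as_list.gap_penalties[0] = 11  # gap open seen second

# Keeping a tuple is still possible by rebuilding it on each update.
as_tuple.gap_penalties = (11, as_tuple.gap_penalties[1])
```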
>
> Rohini Damle wrote:
> > Hi,
> > When I tried your NCBIXML.py code instead of the original one, I got the
> > following error message:
> >
> > File "C:\Python24\lib\site-packages\Bio\Blast\NCBIXML.py", line 210,
> > in _end_Parameters_gap_open
> >    self._blast.gap_penalties[0] = int(self._value)
> > TypeError: object does not support item assignment
> >
> > In the original version
> > we don't have that " [0] " in self._blast.gap_penalties.
> >
> > What might be causing this error?
> > -Rohini
> >
> > On 7/19/06, Jacob Joseph <jmjoseph at andrew.cmu.edu> wrote:
> >> I do not believe the current version of the parser will work with
> >> multiple queries using recent versions of blast, regardless of the output
> >> format.  I do know that blastall 2.2.13 with XML output functions with the
> >> parser corrections previously attached.  I have attached a further
> >> updated NCBIXML.py, fixing the performance issues in parse() that I
> >> mentioned.
> >>
> >> -Jacob
> >>
> >> Rohini Damle wrote:
> >> > Hi,
> >> > Can someone tell me which version of Blast Biopython's
> >> > (text or xml) parser works with?
> >> > I will download that blast version locally so that I can use
> >> > Biopython's parser.
> >> > Thanks,
> >> > Rohini
> >> >
> >> > On 7/18/06, Jacob Joseph <jacob at jjoseph.org> wrote:
> >> >> Hi.
> >> >> I encountered similar difficulties over the past few days myself and
> >> >> have made some improvements to the XML parser.  Well, that is, it now
> >> >> functions with blastall, but I have made no effort to parse the other
> >> >> blast programs.  I do not expect I have done any harm to other
> >> >> parsing, however.
> >> >>
> >> >> Attached are Record.py, NCBIStandalone.py, and NCBIXML.py.  I have not
> >> >> yet spent significant time to clean up my changes.  Without getting
> >> >> into specific modifications, I have made an effort to make consistent
> >> >> the variables in Record and NCBIXML, focusing primarily on what I
> >> >> needed this week.
> >> >>
> >> >> One portion I am not settled on is the reinitialization of Record.Blast
> >> >> at every call to iterator.next(), and, by extension, BlastParser.parse().
> >> >> See NCBIXML.py, line 114.  Without re-initializing this class, we run
> >> >> the risk of retaining portions of a Record from previously parsed
> >> >> queries.   This causes bug 1970, mentioned below.  Unfortunately,
> >> >> this re-initialization exacts a significant performance penalty of at
> >> >> least a factor of 10 by some rough measures.  I would appreciate any
> >> >> suggestions for improvement here.
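
To make the trade-off concrete, here is a minimal sketch with hypothetical Record, ReusingParser and FreshParser classes (not the actual NCBIXML code). A parser that reuses one record keeps appending to the same alignment list, while one that creates a fresh record on each call does not.

```python
class Record:
    """Hypothetical stand-in for Record.Blast."""

    def __init__(self):
        self.alignments = []


class ReusingParser:
    """Keeps one Record for its whole lifetime, so results accumulate."""

    def __init__(self):
        self._blast = Record()

    def parse(self, hits):
        self._blast.alignments.extend(hits)
        return list(self._blast.alignments)


class FreshParser:
    """Creates a new Record on every parse() call."""

    def parse(self, hits):
        self._blast = Record()
        self._blast.alignments.extend(hits)
        return list(self._blast.alignments)


queries = [["P1"], ["P2"], ["P3"]]

reusing = ReusingParser()
print([reusing.parse(q) for q in queries])
# [['P1'], ['P1', 'P2'], ['P1', 'P2', 'P3']]

fresh = FreshParser()
print([fresh.parse(q) for q in queries])
# [['P1'], ['P2'], ['P3']]
```

The first output mirrors the P1, P1+P2, P1+P2+P3 pattern reported at the start of this thread. The sketch says nothing about the cost of re-initialization in the real class.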
> >> >>
> >> >> I do apologize for not being more specific about my changes.  When
> >> >> I get a chance (next week?), I will package them up as a proper patch
> >> >> and file a bug.  Perhaps what I have done so far will be of use until
> >> >> then.
> >> >>
> >> >> FYI, I have done all of my testing with Blast 2.2.13.  2.2.14 seems to
> >> >> not have separate <?xml> blocks within its output, requiring a
> >> >> different method of iteration.
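
As an illustration of what iterating over separate <?xml> blocks can look like, here is a sketch that cuts a stream into chunks wherever a line begins with an XML declaration. The function name split_xml_documents and the sample input are hypothetical, and this is not the NCBIStandalone.Iterator code.

```python
import io


def split_xml_documents(handle):
    """Yield one string per XML document found in a concatenated stream."""
    chunk = []
    for line in handle:
        # A new declaration marks the start of the next document.
        if line.startswith("<?xml") and chunk:
            yield "".join(chunk)
            chunk = []
        chunk.append(line)
    if chunk:
        yield "".join(chunk)


stream = io.StringIO(
    '<?xml version="1.0"?>\n<BlastOutput>query P1</BlastOutput>\n'
    '<?xml version="1.0"?>\n<BlastOutput>query P2</BlastOutput>\n'
)
documents = list(split_xml_documents(stream))
print(len(documents))  # 2
```

Output that carries only a single declaration comes back from this function as one chunk, which is why that case would need a different way of separating the queries.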
> >> >>
> >> >> -Jacob
> >> >>
> >> >> Peter wrote:
> >> >> > Rohini Damle wrote:
> >> >> >> Hi,
> >> >> >> I have an XML file with 4 blast records (for proteins P1, P2, P3,
> >> >> >> P4).
> >> >> >> I am trying to extract alignment information for each of them.
> >> >> >> So I wrote the following code:
> >> >> >>
> >> >> >>  for b_record in b_iterator:
> >> >> >>      E_VALUE_THRESH = 20
> >> >> >>      for alignment in b_record.alignments:
> >> >> >>          for hsp in alignment.hsps:
> >> >> >>              if hsp.expect < E_VALUE_THRESH:
> >> >> >>                  print '****Alignment****'
> >> >> >>                  print 'sequence:', alignment.title.split()[0]
> >> >> >>
> >> >> >> With this code, I am getting information for P1,
> >> >> >> then information for P1 + P2,
> >> >> >> then for P1 + P2 + P3,
> >> >> >> and finally for P1 + P2 + P3 + P4.
> >> >> >> Why is this so?
> >> >> >> Is there something wrong with the looping?
> >> >> >
> >> >> > I'm aware of something funny with the XML parsing, Bug 1970, which
> >> >> > might well be the same issue:
> >> >> >
> >> >> > http://bugzilla.open-bio.org/show_bug.cgi?id=1970
> >> >> >
> >> >> > I confess I haven't looked into exactly what is going wrong here -
> >> >> > too many other demands on my time to learn about XML and how
> >> >> > BioPython parses it.
> >> >> >
> >> >> > Does the workaround on the bug report help?  Depending on which
> >> >> > version of standalone blast you have installed, you might have
> >> >> > better luck with plain text output - the trouble is this is a moving
> >> >> > target and the NCBI keeps tweaking it.
> >> >> >
> >> >> > Peter
> >> >> >
> >> >> > _______________________________________________
> >> >> > BioPython mailing list  -  BioPython at lists.open-bio.org
> >> >> > http://lists.open-bio.org/mailman/listinfo/biopython