Wed, 14 Mar 2001 15:43:08 -0500 (EST)
With what you are describing, it would make sense to try to
decouple the object that contains the blast data (an analysis object) from
the parsing of the report. We have a factory for creating
SeqAnalysisParsers, which have a method next_feature that returns a
SeqFeature. This model isn't robust enough to handle a
concatenation of multiple blast reports in one file, because the
next_feature method can't tell you which report you are in. I
could imagine writing an analysis object that abstracts the notion of
similarity data (perhaps this is just a collection of SimilarityPairs) and
having SeqAnalysisParserI provide a next_analysis_object method;
next_feature would then be implemented in this Analysis object.
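To make the proposed split concrete, here is a rough sketch of the idea,
written in Python rather than Perl only to keep it short. Everything in it
is an assumption for illustration: the class names, the toy "QUERY"/"HIT"
line format, and the fields of SimilarityPair are made up and are not
existing bioperl code. The point is only that the parser hands out one
analysis object per report, and each analysis object iterates over its own
features.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class SimilarityPair:
    """Hypothetical stand-in for a SeqFeature describing one similarity."""
    query: str
    subject: str
    score: float


@dataclass
class Analysis:
    """Similarity data belonging to exactly one report."""
    query: str
    pairs: List[SimilarityPair] = field(default_factory=list)
    _cursor: int = field(default=0, repr=False)

    def next_feature(self) -> Optional[SimilarityPair]:
        """Return the next pair of this report, or None when exhausted."""
        if self._cursor >= len(self.pairs):
            return None
        pair = self.pairs[self._cursor]
        self._cursor += 1
        return pair


class ToyReportParser:
    """Parser for a made-up format: 'QUERY name' starts a new report,
    'HIT subject score' adds a pair to the current report."""

    def __init__(self, lines: Iterable[str]):
        self._lines = iter(lines)
        self._pending_query = None  # QUERY line seen while closing a report

    def next_analysis_object(self) -> Optional[Analysis]:
        """Return the next report as an Analysis, or None at end of input."""
        analysis = None
        if self._pending_query is not None:
            analysis = Analysis(query=self._pending_query)
            self._pending_query = None
        for line in self._lines:
            parts = line.split()
            if not parts:
                continue
            if parts[0] == "QUERY":
                if analysis is None:
                    analysis = Analysis(query=parts[1])
                else:
                    # The next report begins here; remember it and stop.
                    self._pending_query = parts[1]
                    return analysis
            elif parts[0] == "HIT" and analysis is not None:
                analysis.pairs.append(
                    SimilarityPair(analysis.query, parts[1], float(parts[2]))
                )
        return analysis
```

A caller would loop on next_analysis_object until it returns None, and
inside that loop call next_feature on each Analysis, so features from
concatenated reports never get mixed together.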
All that being said, it is definitely a worthwhile exercise for you to see how
you would modify BPlite to parse XML. As a side note, we probably need to
choose an XML parser that we are going to use consistently across
bioperl (if one will generally solve all the problems). Brad Marshall
chose XML::Parser and XML::Parser::PerlSAX, so I see no reason to choose a
different one, but XML-philes please correct me if there is reason to.
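For a feel of what event-driven (SAX-style) parsing of a blast XML report
might look like, here is a small sketch. It uses Python's xml.sax as a
stand-in for XML::Parser::PerlSAX and is not BPlite code. The element names
(Hit, Hit_def, Hsp, Hsp_bit-score, Hsp_evalue) are assumed to follow NCBI's
BlastOutput XML format, and the inline sample document is invented for the
example and heavily trimmed.

```python
import xml.sax


class HitHandler(xml.sax.ContentHandler):
    """Collects hits and their HSPs into plain dictionaries."""

    def __init__(self):
        super().__init__()
        self.hits = []
        self._text = []   # character data of the element being read
        self._hit = None  # hit currently being built
        self._hsp = None  # HSP currently being built

    def startElement(self, name, attrs):
        self._text = []
        if name == "Hit":
            self._hit = {"def": None, "hsps": []}
        elif name == "Hsp":
            self._hsp = {}

    def characters(self, content):
        self._text.append(content)

    def endElement(self, name):
        text = "".join(self._text).strip()
        self._text = []
        if name == "Hit_def" and self._hit is not None:
            self._hit["def"] = text
        elif name == "Hsp_bit-score" and self._hsp is not None:
            self._hsp["bits"] = float(text)
        elif name == "Hsp_evalue" and self._hsp is not None:
            self._hsp["evalue"] = float(text)
        elif name == "Hsp" and self._hit is not None:
            self._hit["hsps"].append(self._hsp)
            self._hsp = None
        elif name == "Hit":
            self.hits.append(self._hit)
            self._hit = None


# Invented, trimmed-down sample; a real report carries many more elements.
SAMPLE = b"""<BlastOutput><BlastOutput_iterations><Iteration><Iteration_hits>
<Hit><Hit_def>toy subject</Hit_def><Hit_hsps>
<Hsp><Hsp_bit-score>52.8</Hsp_bit-score><Hsp_evalue>1e-05</Hsp_evalue></Hsp>
</Hit_hsps></Hit>
</Iteration_hits></Iteration></BlastOutput_iterations></BlastOutput>"""


def parse_hits(xml_bytes):
    """Run the handler over an XML document and return the collected hits."""
    handler = HitHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.hits
```

The dictionaries built in endElement are where one would instead populate
the existing subject and HSP objects, which is roughly the first pass
described in the quoted message below.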
I think we have some work to do on structure definitions before we are
ready to scale to multi-report and multi-format blast parsing.
On Wed, 14 Mar 2001, Wiepert, Mathieu wrote:
> So, what you are saying is that it's not there, it would be good if it was
> there, but certainly don't replace what is already there (never would have
> done that anyway). Should this be done in the same object then? Sounds
> like no one is supporting Blast or BPLite. One thing about BPlite is that
> it seems to parse GCG output as well as blastall output just fine. Blast
> was not doing it as well, though I thought it would since GCG is mentioned
> in the docs (it may and I could have had a SUE).
> I think that if BPLite were passed an xml file it could recognize that. As
> a first pass it should not be tough to populate the same structures already
> in BPLite. That is not taking advantage of the full power of XML, but that
> can be addressed later. As an exercise for myself I am going to see about
> taking in the xml and populating the BPLite objects, and see how that goes.
> I am pretty new to all this, so it will take me a while, but if anything
> interesting comes of it I will submit it for review.
> Mathieu Wiepert
> Medical Information Resources
> Mayo Foundation
> (507) 266-2317
> -----Original Message-----
> From: Hilmar Lapp [mailto:email@example.com]
> Sent: Wednesday, March 14, 2001 11:51 AM
> To: Wiepert, Mathieu
> Cc: 'firstname.lastname@example.org'
> Subject: Re: [Bioperl-l] BPLite?
> "Wiepert, Mathieu" wrote:
> > I have a few questions about BPLite, and the Blast parsing process. Blast
> > can output the ASN.1 or XML, as I understand it. Do we have any parsers
> > that use that as a starting point, to then fill objects like the current
> > report, subject, and HSP objects, or the Blast object?
> We don't. I don't even know whether Ian's latest version of BPlite is
> capable of that. Parsing XML instead of text was discussed for a
> while, and I think the outcome basically was that it is very desirable
> to have that, but at the same time people will want to continue using
> text-format reports and be able to parse them, simply because they're
> very well human readable and people tend to look at the reports from
> time to time.
> <attention: don't read further if you don't like solicitations>
> Maybe I should also take the opportunity to re-iterate that at present we
> have no-one committed to keeping BPlite abreast of Ian's development
> (Peter S volunteered to fix bugs as they appear), and we'd be glad if
> someone is willing to step up ...
> Hilmar Lapp email: email@example.com
> GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
> Bioperl-l mailing list
Center for Human Genetics
Duke University Medical Center