[Bioperl-l] Re: Exonerate vulgar lines; SearchIO model
Ewan Birney
birney at ebi.ac.uk
Mon Sep 22 07:37:38 EDT 2003
On Sunday, September 21, 2003, at 02:07 pm, Jason Stajich wrote:
> On Sun, 21 Sep 2003, Ewan Birney wrote:
>
>>
>> I have added vulgar line parsing to the exonerate output. The model
>> is simpler than the cigar line parsing: only the M state durations
>> become HSPs.
>
> cool! Is there a switch in SearchIO::exonerate to look for '^cigar'
> versus '^vulgar' lines?
>
Yup. The same parser can deal with either ^cigar or ^vulgar lines. I
thought that was going to be more robust in the long term.
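
To illustrate the idea of one reader for both line types, here is a minimal
sketch in Python (not the SearchIO::exonerate code itself). It assumes the
exonerate layout of nine fixed fields after the prefix, followed by
(operation, length) pairs for cigar or (label, query_length, target_length)
triples for vulgar. The names read_alignment_line and HEADER_FIELDS are
made up for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical names for the nine fixed fields that follow the prefix.
HEADER_FIELDS = (
    "query_id", "query_start", "query_end", "query_strand",
    "target_id", "target_start", "target_end", "target_strand",
    "score",
)


def read_alignment_line(line: str) -> Tuple[str, Dict[str, str], List[tuple]]:
    """Split a 'cigar:' or 'vulgar:' line into (kind, header, operations).

    Header values are kept as strings; operation lengths are converted to int.
    The operations are only grouped here, not interpreted.
    """
    kind, sep, rest = line.partition(":")
    kind = kind.strip()
    if not sep or kind not in ("cigar", "vulgar"):
        raise ValueError("not a cigar or vulgar line: %r" % line)

    tokens = rest.split()
    if len(tokens) < len(HEADER_FIELDS):
        raise ValueError("truncated %s line: %r" % (kind, line))
    header = dict(zip(HEADER_FIELDS, tokens[:9]))

    # cigar groups tokens in pairs, vulgar in triples (assumed layout).
    ops = tokens[9:]
    width = 3 if kind == "vulgar" else 2
    if len(ops) % width:
        raise ValueError("incomplete operation list in %s line" % kind)
    grouped = [
        (ops[i], *map(int, ops[i + 1:i + width]))
        for i in range(0, len(ops), width)
    ]
    return kind, header, grouped
```

Dispatching on the prefix keeps the rest of the parser identical for both
formats; only the grouping width differs.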
>>
>> (this will get checked into the main trunk)
>>
>>
>> This brings up an issue - should HSPs in the SearchIO objects
>> be ungapped or gapped? It looks as if gapped cases are allowed
>> - is this the case? Should we flag ungapped vs gapped (or is
>> this done already somehow?)
>>
>
> Basically both are allowed. That flag would be the gap count in the HSP,
> I guess. In general they should be ungapped, but we handle 'small gaps'
> in the sense that FASTA or BLAST HSPs can contain gaps (whose location
> we do have access to by virtue of the gap character in the homology
> line).
>
Ok. I guess I should make this a flag in the parser (ungapped HSPs or
gapped HSPs...)
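
A rough sketch of what such a flag could look like, again in Python rather
than in the parser itself. It takes vulgar (label, query_length,
target_length) triples and either emits one segment per M run (ungapped) or
lets G runs be absorbed into the surrounding segment with a gap count
(gapped). Any other label (intron, splice site and so on) always closes the
segment. The names Segment and segments_from_vulgar are invented here, and
the sketch assumes forward-strand coordinates taken as exonerate reports
them.

```python
from typing import Iterable, List, NamedTuple, Optional, Tuple


class Segment(NamedTuple):
    """One HSP-like block (hypothetical structure for this sketch)."""
    query_start: int
    query_end: int
    target_start: int
    target_end: int
    gaps: int  # gap positions absorbed into the block (0 when ungapped)


def segments_from_vulgar(
    ops: Iterable[Tuple[str, int, int]],
    query_start: int,
    target_start: int,
    gapped: bool = False,
) -> List[Segment]:
    """Turn vulgar triples into segments; forward strand only (assumption)."""
    segments: List[Segment] = []
    q, t = query_start, target_start
    open_seg: Optional[list] = None  # [q_start, q_end, t_start, t_end, gaps]
    pending_gaps = 0  # gaps seen since the last M, not yet committed

    for label, q_len, t_len in ops:
        if label == "M":
            if open_seg is None:
                open_seg = [q, q, t, t, 0]
            # Extend the block to the end of this M run and commit any gaps
            # that sat between this run and the previous one.
            open_seg[1] = q + q_len
            open_seg[3] = t + t_len
            open_seg[4] += pending_gaps
            pending_gaps = 0
        elif label == "G" and gapped and open_seg is not None:
            # One of the two lengths is zero for a gap, so the sum is its size.
            pending_gaps += q_len + t_len
        else:
            # Anything else (or a gap in ungapped mode) ends the block.
            if open_seg is not None:
                segments.append(Segment(*open_seg))
            open_seg = None
            pending_gaps = 0
        q += q_len
        t += t_len

    if open_seg is not None:
        segments.append(Segment(*open_seg))
    return segments
```

With the triples M 10 10, G 0 2, M 8 8, I 0 50, M 12 12, the ungapped
setting gives three segments and the gapped setting gives two, the first
carrying a gap count of 2. The intron is never counted as a gap in either
mode.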
> In fact the HitI->gaps call only collects the count of all the gaps for
> the contained HSPs - a la exons on genomic DNA, we might count gaps in
> the (cDNA) exon alignment but not the overall (intron) gaps introduced
> by the alignment. What do you think should be done?
>
>>
>> I am starting to grok more why this event passing system is useful
>> (abstracts out parsing from object creation, more graceful about
>> partial information etc) but it does seem... quite a lot of
>> scaffolding...
>>
>
> I agree - now that it has sort of gotten out there and we at least have
> something to evaluate, I am game for looking at some refactoring. Had to
> make it work first...
>
>
>>
>> I guess we should make SeqIO work like this at some point, but
>> that's definitely not in my critical path at the moment.
>>
> Ditto.
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l