[Bioperl-l] Thoughts on Bio::Tools::Glimmer

Thu Apr 12 18:35:00 UTC 2007

I'm willing to do the coding and testing; I'm just not familiar with
the submission process yet (I see there's a HOWTO now, nice).  Let's
discuss first.

So to reiterate, I'm suggesting that the module also parse out the  
frame and score information from Glimmer output.  I take back my  
suggestion of overriding the source / primary tags through the module  
as this can easily be done post-parser.  Other annotations can also  
be edited post-parser easily enough.
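
As a rough sketch of what I mean by pulling out frame and score: the
snippet below is plain Perl, not the module's actual code.  It assumes
a Glimmer 3 prediction line has five whitespace-separated columns (orf
id, start, end, frame, score).  The sample line and the name
parse_prediction() are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split one Glimmer 3 style prediction line into
# its fields.  Assumed column order: id, start, end, frame, score.
sub parse_prediction {
    my ($line) = @_;
    return unless $line =~ /^\w+?(\d+)     # orf (numeric portion)
                            \s+(\d+)       # start
                            \s+(\d+)       # end
                            \s+([+-][1-3]) # reading frame
                            \s+(-?\d+(?:\.\d+)?) # score
                            \s*$/x;
    return {
        orf   => $1,
        start => $2,
        end   => $3,
        frame => $4,
        score => $5,
    };
}

# Invented sample line, not taken from real Glimmer output.
my $pred = parse_prediction('orf00001      412     1245  +2     9.87');
print join("\t", map { "$_=$pred->{$_}" } qw(orf start end frame score)), "\n";
```

Lines that don't fit the assumed layout just return nothing, so the
caller can skip headers and the like.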

Reasons for:  Parsing everything out of the output and letting the  
user determine what's useful or not.

Reasons against:  Extra information may not be relevant to the format  
of the generated feature type?

-Andrew

On Apr 12, 2007, at 1:26 PM, Mark Johnson wrote:

>    I'd call that a buggy regexp.  Sounds like a good (but minimal)
> fix.  Torsten, I don't have cvs write access, I think you do, can you
> fix that up?  Andrew, can you file that as a bug:
>
> http://bugzilla.bioperl.org/
>
>    Everything else sounds like enhancements.  I'm not necessarily
> opposed, but a little discussion is probably in order before putting
> any tickets in for any of that.  Also, I'm not sure when I'll be able
> to spare some time to work on the module.  It was easy to justify
> spending time from my day job getting the module up to where it is now,
> as I needed a BioPerl-ish glimmer2/glimmer3 parser.  It's working
> quite well for my purposes.  Again, I'm not opposed to further
> enhancements, but if I'm going to work on any of them, they'll have to
> fit into everything else I'm doing and it could be a while.  However,
> there's no reason somebody else can't do what I did.  Discuss the
> changes here, work out a plan, implement it, send along the diff(s)
> attached to a bug in bugzilla.  Next thing you know, your changes are
> in cvs.  8)
>
> On 4/11/07, Torsten Seemann <torsten.seemann at infotech.monash.edu.au> wrote:
>> Andrew,
>>
>> >                 # Glimmer 3.X prediction
>> >                 (/\w+(\d+)\s+       # orf (numeric portion)
>> > ...isn't picking up more than the last digit in the orf-number.  Not
>> > sure if that's intentional.  A sample of the feature output using
>> > ->gff_string shows up as ...
>>
>> I think that regexp should be \w+?(\d+)
>>
>> i.e. the \w+ should be non-greedy, otherwise it will swallow up all but
>> the last of the digits that \d+ should capture (as \d is a subset of \w)
>>
>> I've CC:ed this to Mark Johnson who made the recent changes to this module.
>>
>> Thanks for your feedback,
>>
>> --Torsten Seemann
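
For anyone who wants to see the greedy vs. non-greedy difference
Torsten describes, here is a minimal check in plain Perl.  The id
'orf00123' is a made-up example in the style of a Glimmer 3 orf name,
not real output.

```perl
use strict;
use warnings;

# Made-up orf id for illustration.
my $id = 'orf00123';

# Greedy \w+ grabs the whole id, then backs off by one character so
# that (\d+) can match at all: only the last digit is captured.
my ($greedy) = $id =~ /\w+(\d+)/;

# Non-greedy \w+? takes as few characters as possible, so (\d+) gets
# the full run of digits.
my ($lazy) = $id =~ /\w+?(\d+)/;

print "greedy: $greedy\n";   # greedy: 3
print "lazy:   $lazy\n";     # lazy:   00123
```

One caveat: \w+? still has to match at least one character, so an id
made up only of digits would lose its first digit to it.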

--
Andrew Stewart
Research Assistant, Genomics Team
Navy Medical Research Center (NMRC)
Biological Defense Research Directorate (BDRD)
BDRD Annex
12300 Washington Avenue, 2nd Floor
Rockville, MD 20852

email: stewarta at nmrc.navy.mil
phone: 301-231-6700 Ext 270