[Bioperl-l] Thoughts on Bio::Tools::Glimmer
Andrew Stewart
stewarta at nmrc.navy.mil
Thu Apr 12 18:35:00 UTC 2007
I'm willing to do the coding and testing, I'm just not familiar with
the submission process yet (I see there's a HOWTO now, nice). Let's
discuss first.
So to reiterate, I'm suggesting that the module also parse out the
frame and score information from Glimmer output. I take back my
suggestion of overriding the source / primary tags through the module
as this can easily be done post-parser. Other annotations can also
be edited post-parser easily enough.
Reasons for: Parsing everything out of the output and letting the
user determine what's useful or not.
Reasons against: Extra information may not be relevant to the format
of the generated feature type?
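
To make the frame/score idea concrete, here is a rough sketch of pulling every field out of a Glimmer 3 prediction line. It is written in Python rather than Perl only to keep it self-contained. The field layout (id, start, end, frame, score), the sample values and the name parse_prediction are my assumptions for illustration, not the module's actual code.

```python
import re

# Assumed layout of a Glimmer 3 prediction line: id, start, end, frame, score.
# This pattern is a hypothetical sketch, not the one used in Bio::Tools::Glimmer.
PREDICTION = re.compile(
    r"""^\w+?(\d+)\s+     # orf (numeric portion), with a non-greedy prefix
        (\d+)\s+          # start coordinate
        (\d+)\s+          # end coordinate
        ([+-][1-3])\s+    # reading frame, e.g. +1 or -3
        ([\d.]+)\s*$      # score
    """,
    re.VERBOSE,
)

def parse_prediction(line):
    """Return the parsed fields as a dict, or None if the line does not match."""
    m = PREDICTION.match(line)
    if m is None:
        return None
    orf, start, end, frame, score = m.groups()
    return {
        "orf": int(orf),
        "start": int(start),
        "end": int(end),
        "frame": int(frame),
        "score": float(score),
    }

# Hypothetical sample line, made up for this example.
print(parse_prediction("orf00012      337     2799  +1    12.83"))
```

With everything parsed out like this, the caller can attach frame and score to the feature or simply ignore them.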
-Andrew
On Apr 12, 2007, at 1:26 PM, Mark Johnson wrote:
> I'd call that a buggy regexp. Sounds like a good (but minimal)
> fix. Torsten, I don't have cvs write access, I think you do, can you
> fix that up? Andrew, can you file that as a bug:
>
> http://bugzilla.bioperl.org/
>
> Everything else sounds like enhancements. I'm not necessarily
> opposed, but a little discussion is probably in order before putting
> any tickets in for any of that. Also, I'm not sure when I'll be able
> to spare some time to work on the module. It was easy to justify
> spending time from my day job getting the module up to where it is now,
> as I needed a BioPerl-ish glimmer2/glimmer3 parser. It's working
> quite well for my purposes. Again, I'm not opposed to further
> enhancements, but if I'm going to work on any of them, they'll have to
> fit into everything else I'm doing and it could be a while. However,
> there's no reason somebody else can't do what I did. Discuss the
> changes here, work out a plan, implement it, send along the diff(s)
> attached to a bug in bugzilla. Next thing you know, your changes are
> in cvs. 8)
>
> On 4/11/07, Torsten Seemann
> <torsten.seemann at infotech.monash.edu.au> wrote:
>> Andrew,
>>
>> > # Glimmer 3.X prediction
>> > (/\w+(\d+)\s+ # orf (numeric portion)
>> > ...isn't picking up more than the last digit in the orf-number. Not
>> > sure if that's intentional. A sample of the feature output using
>> > ->gff_string shows up as ...
>>
>> I think that regexp should be \w+?(\d+)
>>
>> ie. the \w+ should be non-greedy, otherwise it will swallow up all but
>> one of the following \d+ (as \d is a subset of \w)
>>
>> I've CC:ed this to Mark Johnson who made the recent changes to
>> this module.
>>
>> Thanks for your feedback,
>>
>> --Torsten Seemann
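
The greedy versus non-greedy behaviour Torsten describes is easy to check in isolation. The snippet below is a minimal sketch in Python, whose re module treats these two quantifiers the same way Perl does for this pattern. The ORF identifier is a made-up example, not taken from real Glimmer output.

```python
import re

orf_id = "orf00012"  # hypothetical Glimmer 3 ORF identifier

# Greedy \w+ first consumes the whole string, then backtracks by a single
# character so that \d+ can match, leaving only the last digit in the group.
greedy = re.match(r"\w+(\d+)", orf_id)

# Non-greedy \w+? consumes as little as possible ("orf"), so the group
# captures the full run of digits.
lazy = re.match(r"\w+?(\d+)", orf_id)

print(greedy.group(1))  # 2
print(lazy.group(1))    # 00012
```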
--
Andrew Stewart
Research Assistant, Genomics Team
Navy Medical Research Center (NMRC)
Biological Defense Research Directorate (BDRD)
BDRD Annex
12300 Washington Avenue, 2nd Floor
Rockville, MD 20852
email: stewarta at nmrc.navy.mil
phone: 301-231-6700 Ext 270