[BioPerl] Re: [Bioperl-l] gff_string on an HSPI object is not Bio::DB::GFF friendly

Aaron J.Mackey amackey at virginia.edu
Mon Jan 12 11:32:35 EST 2004


Actually, all I really need is a relative-to-absolute coordinate
mapper, so that I can prepare input in relative coordinates and feed
it to a dbGFF database in absolute coordinates (which is what I was
hoping load_gff.pl might now be doing automatically, given all your
talk about GFF3).  I realize now, however, that relative coordinates are
not in the purview of the GFF3 spec, but are rather an application issue.
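
A minimal sketch of such a mapper follows, in Python purely for
illustration (nothing here is part of BioPerl, GBrowse or chado).  The
names Span and to_absolute are hypothetical.  It assumes 1-based
inclusive coordinates, that the parent feature is already in absolute
coordinates, and that a child on a minus-strand parent is counted from
the parent's end with its strand flipped.  That last point is only one
possible convention, and is exactly where the strand ambiguity
mentioned in this thread would have to be settled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A feature location: 1-based, inclusive, with start <= end."""
    start: int
    end: int
    strand: int  # +1 or -1


def to_absolute(parent: Span, child: Span) -> Span:
    """Map a child given relative to `parent` onto the parent's reference.

    Assumption (not from the GFF3 spec): relative position 1 is the first
    base of the parent as read along the parent's own strand.
    """
    parent_len = parent.end - parent.start + 1
    if not (1 <= child.start <= child.end <= parent_len):
        raise ValueError("child does not lie within its parent")
    if parent.strand >= 0:
        # Plus-strand parent: a plain offset from the parent's start.
        return Span(parent.start + child.start - 1,
                    parent.start + child.end - 1,
                    child.strand)
    # Minus-strand parent: count back from the parent's end, flip strand.
    return Span(parent.end - child.end + 1,
                parent.end - child.start + 1,
                -child.strand)


if __name__ == "__main__":
    gene = Span(1000, 2000, -1)   # absolute coordinates
    exon = Span(11, 20, +1)       # relative to the gene
    print(to_absolute(gene, exon))
```

A feature that is relative to another relative feature could be
resolved by applying to_absolute repeatedly, innermost parent first,
until a top-level reference is reached.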

Thanks for all your thoughts,

-Aaron

On Jan 12, 2004, at 11:18 AM, Scott Cain wrote:

> Aaron,
>
> I really doubt that the current release of GBrowse supports relative
> coordinates as described by both you and Allen.  I have to say I'm not
> sure, because I am in the process of developing a set of test data.
>
> As for chado, it should actually be fairly easy to adapt it to work
> with relative coordinates.  The main change (for me) would be in the
> gbrowse chado adaptor, which assumes that all features have as the
> 'srcfeature' the 'top' feature (i.e., all features are directly laid
> on the chromosome/arm/contig/whatever).  The reason it does that is
> because that is the way that the fruitfly people use it, and so that
> was the data I had to develop the adaptor for.
>
> If having relative coordinates is something that would be useful for
> you to use chado, let me know (and send me sample GFF3 data) and I
> will work on it.  Otherwise, it will go in the TODO file.
>
> Thanks,
> Scott
>
> On Fri, 2004-01-09 at 17:22, Allen Day wrote:
>> We don't support this in the chado load_gff3.pl script, but it
>> wouldn't be very difficult to add handling of simple cases.  I am
>> concerned, though, about difficulties handling potential ambiguity
>> wrt the strandedness of relative coordinates.
>>
>> I assume by relative coordinates here, you mean you're describing a
>> feature's position in terms of the position of another feature which
>> is itself described in absolute coordinates (or is relative to a
>> feature which is).
>>
>> -Allen
>>
>>
>>
>> On Fri, 9 Jan 2004, Aaron J.Mackey wrote:
>>
>>> Hi Scott,
>>>
>>> Thanks for the quick reply, but that wasn't exactly the nature of the
>>> question.  The question was: apart from Gap attributes, do gbrowse,
>>> BDGFF, and/or, specifically, the load_gff.pl variants know the rest
>>> of GFF3, namely, can they accept input GFF3 with features that aren't
>>> in absolute reference coordinates, but in relative coordinates?  And
>>> is that ability in release 1.58, or some CVS branch I can access
>>> (code that lives quietly in the depths of Lincoln's hard drive
>>> doesn't count)?
>>>
>>> Thanks,
>>>
>>> -Aaron
>>>
>>> On Jan 9, 2004, at 4:47 PM, Scott Cain wrote:
>>>
>>>> OK, I am going to answer this, but if I am wrong, I'm sure Lincoln
>>>> will correct me.  I don't think gbrowse or BDGFF knows how to deal
>>>> with cigar lines in Gap attributes yet.  For the time being, it is
>>>> safer to continue to put separate HSPs on separate GFF lines.
>>>>
>>>> Scott
>>>>
>>>>
>>>> On Fri, 2004-01-09 at 16:42, Aaron J.Mackey wrote:
>>>>> Forgive me for a stupid question, but does GBrowse (v1.58) now
>>>>> support GFF3?  Namely, can I have start/stops in sub-feature
>>>>> coordinates in my input GFF3 and expect bp_load_gff.pl to behave
>>>>> properly (i.e. generate "canonical" top-level coordinates for
>>>>> storage)?  I didn't see anything in the documentation, so I was
>>>>> surprised to see some of the words in these posts ...
>>>>>
>>>>> Thanks
>>>>>
>>>>> On Jan 9, 2004, at 4:09 PM, Mark Wilkinson wrote:
>>>>>
>>>>>> Cool.  I'm heavily into making the HSPs output proper GFF3 today
>>>>>> for some of the Gbrowse tools that I have been working on, so I
>>>>>> will jump in and do this over the next day or two.
>>>>>>
>>>>>> Cheers!
>>>>>>
>>>>>> Mark
>>>>>>
>>>>>> On Fri, 2004-01-09 at 14:49, Scott Cain wrote:
>>>>>>> I think everything you wrote below is correct.  As far as I know,
>>>>>>> only Allen and I have been working on BTGFF's GFF3 code, and we
>>>>>>> haven't touched the alignment portion, so I am not surprised that
>>>>>>> it is wrong.  I suppose fixing BTGFF may break some tools, but I
>>>>>>> know that the chado loader I wrote will handle it correctly :-)
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Scott
>>>>>>>
>>>>>>>
>>>>>>> On Fri, 2004-01-09 at 15:45, Mark Wilkinson wrote:
>>>>>>>> On Fri, 2004-01-09 at 11:22, Scott Cain wrote:
>>>>>>>>
>>>>>>>>>   - be sure to use a SO term for the type (i.e., match or one
>>>>>>>>>     of its children)
>>>>>>>>
>>>>>>>> So... actually the existing implementation of GFF3 in bioperl
>>>>>>>> from Bio::Tools::GFF->new(-gff_version => 3)
>>>>>>>> does not generate correctly formatted GFF3 for alignment
>>>>>>>> features, yeah?
>>>>>>>>
>>>>>>>> e.g. for column 9 of an alignment feature I get:
>>>>>>>>
>>>>>>>> 	Target=gi|2828774:54232..54206
>>>>>>>>
>>>>>>>> whereas I think I should be getting
>>>>>>>>
>>>>>>>> 	Target=gi|2828774+54232+54206
>>>>>>>>
>>>>>>>> In addition, it passes through all sorts of other tags that
>>>>>>>> begin with capital letters:
>>>>>>>>
>>>>>>>> 	Bits=46.1;FracId=0.962962962962963
>>>>>>>>
>>>>>>>> these should be
>>>>>>>>
>>>>>>>> 	bits=46.1;fracId=0.962962962962963
>>>>>>>>
>>>>>>>> if I am reading the spec correctly.
>>>>>>>>
>>>>>>>> Finally, the column-3 term that comes out is "similarity", but
>>>>>>>> it should be one of the *match terms.  Is that also correct?
>>>>>>>>
>>>>>>>> Please confirm that I am interpreting the GFF3 spec correctly
>>>>>>>> for these Alignment features and I would be happy to go in and
>>>>>>>> fix things (a.k.a. break everyone else's tools ;-) )
>>>>>>>>
>>>>>>>> Cheerio!
>>>>>>>>
>>>>>>>> Mark
>>>>>>>>
>>>>>> --   
>>>>>> Mark Wilkinson <markw at illuminae.com>
>>>>>> Illuminae
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioperl-l mailing list
>>>>>> Bioperl-l at portal.open-bio.org
>>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>>>
>>>> --   
>>>> -----------------------------------------------------------------------
>>>> Scott Cain, Ph. D.
>>>> cain at cshl.org
>>>> GMOD Coordinator (http://www.gmod.org/)
>>>> 216-392-3087
>>>> Cold Spring Harbor Laboratory
>>>>
>>>