[BioPerl] Re: [Bioperl-l] gff_string on an HSPI object is not Bio::DB::GFF friendly
Aaron J.Mackey
amackey at virginia.edu
Mon Jan 12 11:32:35 EST 2004
Actually, all I really need is a relative-to-absolute coordinate
mapper, so that I can prepare input in relative coordinates, and feed
it to a dbGFF database in absolute coordinates (which was what I was
hoping load_gff.pl might now be doing automatically, given all your
talk about GFF3). I realize now, however, that relative coordinates are
not in the purview of the GFF3 spec, but are rather an application issue.
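
A minimal sketch of such a relative-to-absolute mapper, written in Python
for brevity. The Feature record and the to_absolute function are
hypothetical names for illustration only; they are not part of BioPerl,
Bio::DB::GFF, or any load_gff script. It assumes 1-based inclusive
coordinates, and it assumes that for a parent on the minus strand,
relative position 1 is the parent's end. That second assumption is
exactly the strandedness ambiguity raised below, so treat it as one
possible convention rather than the defined behaviour.

```python
from typing import Dict, NamedTuple, Optional


class Feature(NamedTuple):
    """Hypothetical feature record (not a BioPerl class)."""
    parent: Optional[str]  # ID of the feature the coordinates are relative to; None means absolute
    start: int             # 1-based, inclusive
    end: int               # 1-based, inclusive
    strand: int            # +1 or -1, expressed relative to the parent


def to_absolute(fid: str, features: Dict[str, Feature]) -> Feature:
    """Walk up the parent chain until the coordinates are top-level."""
    feat = features[fid]
    seen = {fid}
    while feat.parent is not None:
        if feat.parent in seen:
            raise ValueError(f"cyclic parent chain at {feat.parent!r}")
        seen.add(feat.parent)
        parent = features[feat.parent]
        if parent.strand >= 0:
            # Plus-strand parent: offset from the parent's start.
            start = parent.start + feat.start - 1
            end = parent.start + feat.end - 1
        else:
            # ASSUMPTION: on a minus-strand parent, position 1 is the parent's end.
            start = parent.end - feat.end + 1
            end = parent.end - feat.start + 1
        # The result is now expressed in the frame the parent was expressed in.
        feat = Feature(parent.parent, start, end, feat.strand * parent.strand)
    return feat


features = {
    "contig1": Feature(None, 1000, 2000, +1),      # absolute
    "gene1": Feature("contig1", 101, 200, -1),     # relative to contig1
    "exon1": Feature("gene1", 1, 50, +1),          # relative to gene1
}
print(to_absolute("exon1", features))
# → Feature(parent=None, start=1150, end=1199, strand=-1)
```

Resolving each feature this way before loading would let the input stay
in relative coordinates while the database only ever sees absolute ones.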
Thanks for all your thoughts,
-Aaron
On Jan 12, 2004, at 11:18 AM, Scott Cain wrote:
> Aaron,
>
> I really doubt that the current release of GBrowse supports relative
> coordinates as described by both you and Allen. I have to say I'm not
> sure, because I am in the process of developing a set of test data.
>
> As for chado, it should actually be fairly easy to adapt it to work with
> relative coordinates. The main change (for me) would be in the gbrowse
> chado adaptor, which assumes that all features have as the 'srcfeature'
> the 'top' feature (ie, all features are directly laid on the
> chromosome/arm/contig/whatever). The reason it does that is because
> that is the way that the fruitfly people use it, and so that was the
> data I had to develop the adaptor for.
>
> If having relative coordinates is something that would be useful for you
> to use chado, let me know (and send me sample GFF3 data) and I will work
> on it. Otherwise, it will go in the TODO file.
>
> Thanks,
> Scott
>
> On Fri, 2004-01-09 at 17:22, Allen Day wrote:
>> We don't support this in the chado load_gff3.pl script, but it wouldn't be
>> very difficult to add handling of simple cases. I am concerned though
>> about difficulties handling potential ambiguity wrt the strandedness of
>> relative coordinates.
>>
>> I assume by relative coordinates here, you mean you're describing a
>> feature's position in terms of the position of another feature which is
>> itself described in absolute coordinates (or is relative to a feature
>> which is).
>>
>> -Allen
>>
>>
>>
>> On Fri, 9 Jan 2004, Aaron J.Mackey wrote:
>>
>>> Hi Scott,
>>>
>>> Thanks for the quick reply, but that wasn't exactly the nature of the
>>> question; the question was whether (apart from Gap attributes)
>>> gbrowse, BDGFF, and/or, specifically, the load_gff.pl variants know the
>>> rest of GFF3, namely whether they accept input GFF3 with features
>>> that aren't in absolute reference coordinates, but in relative
>>> coordinates. And is that ability in release 1.58, or some CVS branch I
>>> can access (code that lives quietly in the depths of Lincoln's hard
>>> drive doesn't count)?
>>>
>>> Thanks,
>>>
>>> -Aaron
>>>
>>> On Jan 9, 2004, at 4:47 PM, Scott Cain wrote:
>>>
>>>> OK, I am going to answer this, but if I am wrong, I'm sure Lincoln will
>>>> correct me. I don't think gbrowse or BDGFF knows how to deal with cigar
>>>> lines in Gap attributes yet. It is safer for the moment to continue to
>>>> put separate HSPs on separate GFF lines.
>>>>
>>>> Scott
>>>>
>>>>
>>>> On Fri, 2004-01-09 at 16:42, Aaron J.Mackey wrote:
>>>>> Forgive me for a stupid question, but does GBrowse (v1.58) now support
>>>>> GFF3? Namely, can I have start/stops in sub-feature coordinates in my
>>>>> input GFF3 and expect bp_load_gff.pl to behave properly (i.e. generate
>>>>> "canonical" top-level coordinates for storage)? I didn't see anything
>>>>> in the documentation, so I was surprised to see some of the words in
>>>>> these posts ...
>>>>>
>>>>> Thanks
>>>>>
>>>>> On Jan 9, 2004, at 4:09 PM, Mark Wilkinson wrote:
>>>>>
>>>>>> Cool. I'm heavily into making the HSPs output proper GFF3 today for
>>>>>> some of the GBrowse tools that I have been working on, so I will jump
>>>>>> in and do this over the next day or two.
>>>>>>
>>>>>> Cheers!
>>>>>>
>>>>>> Mark
>>>>>>
>>>>>> On Fri, 2004-01-09 at 14:49, Scott Cain wrote:
>>>>>>> I think everything you wrote below is correct. As far as I know, only
>>>>>>> Allen and I have been working on BTGFF's GFF3 code, and we haven't
>>>>>>> touched the alignment portion, so I am not surprised that it is wrong.
>>>>>>> I suppose fixing BTGFF may break some tools, but I know that the chado
>>>>>>> loader I wrote will handle it correctly :-)
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Scott
>>>>>>>
>>>>>>>
>>>>>>> On Fri, 2004-01-09 at 15:45, Mark Wilkinson wrote:
>>>>>>>> On Fri, 2004-01-09 at 11:22, Scott Cain wrote:
>>>>>>>>
>>>>>>>>> - be sure to use a SO term for the type (ie, match or one of its
>>>>>>>>> children)
>>>>>>>>
>>>>>>>> So... actually the existing implementation of GFF3 in bioperl
>>>>>>>> from Bio::Tools::GFF->new(-gff_version => 3)
>>>>>>>> does not generate correctly formatted GFF3 for alignment features,
>>>>>>>> yeah?
>>>>>>>>
>>>>>>>> e.g. for column 9 of an alignment feature I get:
>>>>>>>>
>>>>>>>> Target=gi|2828774:54232..54206
>>>>>>>>
>>>>>>>> whereas I think I should be getting
>>>>>>>>
>>>>>>>> Target=gi|2828774+54232+54206
>>>>>>>>
>>>>>>>> In addition, it passes through all sorts of other tags that begin
>>>>>>>> with capital letters:
>>>>>>>>
>>>>>>>> Bits=46.1;FracId=0.962962962962963
>>>>>>>>
>>>>>>>> these should be
>>>>>>>>
>>>>>>>> bits=46.1;fracId=0.962962962962963
>>>>>>>>
>>>>>>>> if I am reading the spec correctly.
>>>>>>>>
>>>>>>>> Finally, the column-3 term that comes out is "similarity", but it
>>>>>>>> should be one of the *match terms. Is that also correct?
>>>>>>>>
>>>>>>>> Please confirm that I am interpreting the GFF3 spec correctly for
>>>>>>>> these Alignment features and I would be happy to go in and fix things
>>>>>>>> (a.k.a. break everyone else's tools ;-) )
>>>>>>>>
>>>>>>>> Cheerio!
>>>>>>>>
>>>>>>>> Mark
>>>>>>>>
>>>>>> --
>>>>>> Mark Wilkinson <markw at illuminae.com>
>>>>>> Illuminae
>>>>>>
>>>>>> _______________________________________________
>>>>>> Bioperl-l mailing list
>>>>>> Bioperl-l at portal.open-bio.org
>>>>>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>>>>>
>>>> --
>>>> -----------------------------------------------------------------------
>>>> Scott Cain, Ph. D.
>>>> cain at cshl.org
>>>> GMOD Coordinator (http://www.gmod.org/)
>>>> 216-392-3087
>>>> Cold Spring Harbor Laboratory
>>>>
>>>
> --
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.
> cain at cshl.org
> GMOD Coordinator (http://www.gmod.org/)
> 216-392-3087
> Cold Spring Harbor Laboratory
>