[Biojava-l] Rooted trees in nexus files

Richard Holland holland at eaglegenomics.com
Wed Nov 4 12:46:01 UTC 2009


>
> You have to change that method signature if you want to use the same
> method. The only relationship between JGraphT's UndirectedGraph and
> its DirectedGraph counterpart is that they both extend the Graph
> interface, but a DirectedGraph is not an UndirectedGraph. Switching
> to DirectedGraph definitely breaks the current API! I don't know
> how you usually handle such situations in BioJava, but this clearly
> breaks compatibility. Maybe it would be better to introduce a new
> method that returns directed graphs?

Whether or not to break the API depends on two things. First, how
old and widely adopted is the code? Second, is the existing API
illogical or just plain wrong? Weighing the two tells you how
confidently the API can be changed.

In this instance, the code is fairly new, not widely adopted, and the
existing API is clearly wrong in forcing all JGraphT graphs to be
undirected.

To keep everyone happy, I would introduce a new method with a new name
that takes a boolean or enum option indicating what type of graph the
user wants (undirected, directed, whatever). I would then deprecate the
existing method, move its contents into the undirected branch of the
new method, and replace the old method's body with a call to the new
method with the option set to undirected.
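
As a rough illustration of that pattern only, here is a minimal sketch.
It assumes nothing about the real BioJava or JGraphT classes: the class
TreeGraphSketch, the enum GraphKind, the method names getEdges and
getEdgesAs, and the hard-coded tree are all hypothetical, and plain
strings stand in for graph edges so the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the tree-to-graph conversion discussed above.
// None of these names come from BioJava or JGraphT.
class TreeGraphSketch {

    /** Hypothetical option selecting the kind of graph the caller wants. */
    enum GraphKind { UNDIRECTED, DIRECTED }

    /** Parent/child pairs standing in for the parsed tree (A,(B,C)). */
    private static final String[][] PARENT_CHILD = {
        {"root", "A"}, {"root", "inner"}, {"inner", "B"}, {"inner", "C"}
    };

    /** Old entry point: kept so existing callers compile, but it only delegates. */
    @Deprecated
    static List<String> getEdges() {
        return getEdgesAs(GraphKind.UNDIRECTED);
    }

    /** New entry point: the old method's logic now lives in the UNDIRECTED branch. */
    static List<String> getEdgesAs(GraphKind kind) {
        List<String> edges = new ArrayList<>();
        for (String[] pc : PARENT_CHILD) {
            switch (kind) {
                case DIRECTED:
                    // Direction runs from parent to child.
                    edges.add(pc[0] + "->" + pc[1]);
                    break;
                case UNDIRECTED:
                default:
                    // What the old method used to produce.
                    edges.add(pc[0] + "--" + pc[1]);
                    break;
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        System.out.println(getEdges());                      // old callers, unchanged behaviour
        System.out.println(getEdgesAs(GraphKind.DIRECTED));  // new callers choose the kind
    }
}
```

Existing callers keep compiling and get the same result as before (plus
a deprecation warning), while new callers choose the graph type
explicitly.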

cheers,
Richard

> cheers,
> -thasso
>
>>
>> Richard.
>>
>> On 3 Nov 2009, at 18:55, Tiago Antão wrote:
>>
>>> But the point is that the class interface changes for the outside
>>> user:
>>> 1. How does one report the root back to the user?
>>> 2. Regarding the prefix stuff, should the user be allowed to
>>> specify a preferred prefix?
>>>
>>> Both of these things imply interface changes visible to users.
>>> If you still need volunteers to make the change, I can do it. But I
>>> need to know what changes to the user interface are to be made.
>>> For 1, maybe a method getRoot, returning a string with the name of
>>> the root node?
>>> For 2, maybe an extended version of the parse function with a prefix
>>> as an input parameter?
>>>
>>> 2009/11/3 Richard Holland <holland at eaglegenomics.com>:
>>>>> 1. Lack of knowledge of root node
>>>>
>>>> The Newick tree string is read as-is and is not parsed. It only
>>>> gets parsed at the point of conversion to an Undirected or
>>>> WeightedGraph inside the TreeBlocks.java source code (inside the
>>>> two types of get-As-JGraphT methods). It's at this point the string
>>>> is parsed and it's here that root node determination should take
>>>> place. It's already known whether &R or &U has been specified
>>>> here, which should help the code work out what to do.
>>>>
>>>>> 2. The p* stuff.
>>>>
>>>> Exactly the same part of the code as described above. Wherever it
>>>> pushes values to the stack but prepends them with 'p' first, you'll
>>>> need to change the 'p' to some instance variable and provide a
>>>> getter/setter to change it, with 'p' being the default setting.
>>>>
>>>> cheers,
>>>> Richard
>>>>
>>>>>
>>>>> Tiago
>>>>> 2009/11/3 Richard Holland <holland at eaglegenomics.com>:
>>>>>>
>>>>>> Agreed that there is a bug. Now all we need is someone to go in
>>>>>> and fix it! :)
>>>>>>
>>>>>> cheers,
>>>>>> Richard
>>>>>>
>>>>>> On 3 Nov 2009, at 18:16, Tiago Antão wrote:
>>>>>>
>>>>>>> 2009/11/3 Thasso Griebel <thasso.griebel at uni-jena.de>:
>>>>>>>>
>>>>>>>> There is a way to uniquely get a root from a Newick string.
>>>>>>>> Usually a rooted Newick string is surrounded by brackets, which
>>>>>>>> indicate the root as the highest node in the tree. For example:
>>>>>>>>
>>>>>>>> (A, (B,C))
>>>>>>>>
>>>>>>>
>>>>>>> Agreed, it is quite easy to get the root of the tree from the
>>>>>>> Newick representation. But it should be done during parsing and
>>>>>>> returned in some way by the parsing system. If the user has to
>>>>>>> do it again, it means that the user has to parse the string
>>>>>>> again just to know the root node.
>>>>>>>
>>>>>>>> I would also suggest generally parsing trees as rooted trees
>>>>>>>> (maybe just for the initial internal model). Creating an
>>>>>>>> unrooted tree from a rooted one is easy: remove the root and
>>>>>>>> forget about directions. The other way might be hard and
>>>>>>>> ambiguous.
>>>>>>>
>>>>>>> 100% agree.
>>>>>>> The Newick _representation_ always has a root by virtue of the
>>>>>>> way it is written. Whether that root has meaning or not depends.
>>>>>>> Doing as you suggest seems the most reasonable idea.
>>>>>>> I would add that even if it is an unrooted tree, the topology
>>>>>>> might be of interest. In my case I am doing a comparative
>>>>>>> visualizer and it might be nice for the user to be able to
>>>>>>> visualize the topology as specified. It has no biological
>>>>>>> meaning, but in practice, for many users, it helps.
>>>>>>> I note that PhyloXML (even by virtue of being an XML format)
>>>>>>> always represents the phylogenies as trees (not weighted DAGs).
>>>>>>> There is an attribute, rooted, which can be true or false.
>>>>>>>
>>>>>>> But, anyway. Even assuming a very conservative view on this, the
>>>>>>> current parser, for rooted trees, does not allow one to determine
>>>>>>> where the root is. I think there would be a consensus that
>>>>>>> that is a bug?
>>>>>>>
>>>>>>> Tiago
>>>>>>
>>>>>> --
>>>>>> Richard Holland, BSc MBCS
>>>>>> Operations and Delivery Director, Eagle Genomics Ltd
>>>>>> T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
>>>>>> http://www.eaglegenomics.com/
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> "The hottest places in hell are reserved for those who, in times  
>>>>> of
>>>>> moral crisis, maintain a neutrality." - Dante
>>>>
>>>
>>>
>>>
>>
>
> --
> Dipl. Inf. Thasso Griebel-------------------Lehrstuhl fuer Bioinformatik
> Office 3426--http://bio.informatik.uni-jena.de--Institut fuer Informatik
> Phone +49 (0)3641 9-46454-----------Friedrich-Schiller-Universitaet Jena
> Fax +49 (0)3641 9-46452----------Ernst-Abbe-Platz 2, 07743 Jena, Germany
>
>
>

--
Richard Holland, BSc MBCS
Operations and Delivery Director, Eagle Genomics Ltd
T: +44 (0)1223 654481 ext 3 | E: holland at eaglegenomics.com
http://www.eaglegenomics.com/




