[MOBY-dev] Cleaning the registry

Andreas Groscurth groscurt at mpiz-koeln.mpg.de
Wed Jan 7 07:03:24 UTC 2009


> My understanding is that data type is the heart of biomoby. If the
> data type is not carefully managed, the beauty of biomoby only works in
> theory.

That's of course a much broader topic you are talking about: this is
not only cleaning unused datatypes but curating existing ones. Although
I agree with you, this comes with some problems. The main problem would
be who decides which datatype is the correct one, or what the truth is
in general...
And of course what to do with the numerous datatypes which might be the
children of a "wrong" datatype, and with all the services which use
them... and so on.
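
To make the size of that problem concrete, here is a minimal sketch of how one could list everything that hangs below a questionable datatype, plus the services that would be touched. It assumes the ontology is available as a simple child-to-parent mapping and that each service is reduced to the set of datatypes it uses; the names WrongType, ChildA, ChildB, Grandchild, Unrelated and service_a to service_c are invented for illustration and do not come from the registry.

```python
from collections import defaultdict


def descendants(parent_of, root):
    """Return every datatype that inherits, directly or indirectly, from root."""
    # Invert the child -> parent mapping into parent -> children.
    children = defaultdict(set)
    for child, parent in parent_of.items():
        children[parent].add(child)

    # Walk the tree below root with an explicit stack.
    found, stack = set(), [root]
    while stack:
        for child in children[stack.pop()]:
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def affected_services(service_types, parent_of, root):
    """Return the names of services using root or any of its descendants."""
    tainted = descendants(parent_of, root) | {root}
    return sorted(name for name, types in service_types.items() if types & tainted)


# Hypothetical ontology fragment (child -> parent); not real registry content.
parent_of = {
    "WrongType": "Object",
    "ChildA": "WrongType",
    "ChildB": "WrongType",
    "Grandchild": "ChildA",
    "Unrelated": "Object",
}

# Hypothetical services, each mapped to the datatypes it consumes or produces.
service_types = {
    "service_a": {"Grandchild"},
    "service_b": {"Unrelated"},
    "service_c": {"ChildB", "Unrelated"},
}

print(sorted(descendants(parent_of, "WrongType")))
print(affected_services(service_types, parent_of, "WrongType"))
```

Even in this toy case, changing one datatype reaches two of the three services, which is why the curation question is harder than simply deleting unused entries.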

> Actually, I don't. Or: I agree only partially. The data types can be part of
> a domain model (as they are for the GCP project) - even without existing
> services, the model is still useful (and, for example, with Moses generated
> data types can be at once used in various Java programs).

Hmm, sorry, I don't get that point. Why are datatypes registered at the
official production central of BioMoby if they are not used in BioMoby?
Just because they might be relevant for various Java programs? Is the
central then a collection of all domain models which are relevant for
everything? This sounds like abuse to me...
I guess I don't get that one, but for me the central repository of
BioMoby should still only contain elements which are usable and relevant
for the BioMoby world. (This excludes those datatypes and services which
are not reachable by the public.)
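
The "unused" figures quoted further down in this thread amount to a set difference: everything registered minus everything that appears as input or output of at least one service. The following is not the original script, only a minimal sketch under the assumption that the registry contents have already been fetched into plain Python collections; the service names are invented, and the datatype names are the ones mentioned in this thread.

```python
def unused(registered, services):
    """Return registered entries that no service uses as input or output."""
    used = set()
    for io_types in services.values():
        used |= io_types
    return set(registered) - used


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Toy registry using datatype names mentioned in this thread.
registered = ["text_plain", "text-plain", "FASTA_NA", "NucleotideSequence"]

# Hypothetical services mapped to the datatypes in their input/output.
services = {
    "service_a": {"text_plain"},
    "service_b": {"NucleotideSequence", "text_plain"},
}

print(sorted(unused(registered, services)))

# The counts reported in the thread, as percentages.
print(round(percent(388, 721), 1))  # unused datatypes: about 53.8
print(round(percent(232, 459), 1))  # unused namespaces: about 50.5
print(round(percent(388 + 232, 721 + 459), 1))  # both together: about 52.5
```

The same function works for namespaces if the service map holds namespaces instead of datatypes.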

Cheers
Andreas

jason wrote:
> Hi, all
>
> I think some manual cleaning is needed, too.
> For example, there are two plain-text data types: text_plain and
> text-plain. I guess there is no difference between these two types.
> Another example is a FASTA nucleotide sequence. What type is a FASTA
> nucleotide sequence? It can be FASTA_NA or NucleotideSequence. Here
> a piece of data can be represented by two types and there is no
> connection between these two types. If the user thinks of his data as
> FASTA_NA and searches for a service, he will miss the services for
> NucleotideSequence. A service producing FASTA_NA cannot be chained to
> a service accepting NucleotideSequence as input.
> This brings up another question: whether uncurated data type
> management works in reality or not.
>
>
> -jason
>
> Andreas Groscurth wrote:
>> Hi all,
>>
>> I wrote a short script which basically fetches all namespaces and all 
>> datatypes registered at the Moby central in Canada. Both are then 
>> compared to all datatypes and namespaces of all registered services 
>> used for the input and the output definition.
>>
>> Assuming the retrieval methods in jmoby work correctly and my script
>> does too, we have the following numbers:
>>
>> Registered Data Types: 721
>> Unused Data Types: 388
>>
>> Registered Namespaces: 459
>> Unused Namespaces: 232
>>
>> This means 53% of all registered datatypes are not used, and 50% of
>> all namespaces!!!
>>
>> What do you think about cleaning the registry once in a while and
>> erasing unused datatypes and namespaces?
>>
>> Of course they might be usable for a service provider someday, but
>> for the sake of clarity I would suggest doing that. Cleaning both
>> would reduce the number of entries by more than half, which would
>> result in smaller, more compact, more understandable and more easily
>> browsable ontologies.
>>
>> What do you think?
>>
>> Cheers
>> Andreas
>> _______________________________________________
>> MOBY-dev mailing list
>> MOBY-dev at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/moby-dev


-- 
/***************************************************
  Dipl. Bioinf. Andreas Groscurth
  Software developer
  Plant Computational Biology group
  Max-Planck Institute for plant breeding research
  Carl-von-Linne Weg 10
  50829 Cologne
  Germany
  +49(0) 221 5062449
***************************************************/



