[BioRuby] GSOC: phyloXML for BioRuby: Mapping sequence

Diana Jaunzeikare rozziite at gmail.com
Sat May 30 21:27:52 UTC 2009


Hi all,

I looked more carefully at the sequence element of phyloXML, and it
contains information that cannot be mapped to a Bio::Sequence object. I
suggest having a sequence class that closely resembles the phyloXML structure,
plus a method that extracts the relevant elements and returns a Bio::Sequence
object. What do you think?

Below, the phyloXML sequence tag elements are listed on the left; after the
arrow (->) is the possible corresponding attribute of Bio::Sequence:
* type
** rna, dna  -> Bio::Sequence::NA -> molecule type
** aa -> Bio::Sequence::AA
* id_source (string ?) -> id_namespace
* id_ref (string ) -> entry_id
* symbol (string ?)
* accession
** source (example: "UniProtKB") ->
** id (example: "P17304") ->  primary_accession
* name (string )
* location (string ? )
* mol_seq (string) -> seq / Bio::Sequence::NA/AA
* uri
** desc (string)
** type (string )
** uri

* annotation []
** ref
** source
** evidence
** type
** desc
** confidence
** property []
** uri

* domain_architecture
** length
** domain []
*** from
*** to
*** confidence
*** id
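
To make the proposal more concrete, here is a minimal sketch of what such a
class could look like. Everything in it is an assumption for illustration:
the module name PhyloXMLSketch, the Accession struct, and the methods
biosequence_attributes and sequence_class_name are hypothetical and not part
of BioRuby. It does not load BioRuby; it only collects the values under the
Bio::Sequence attribute names from the list above. Passing the raw type
string through as the molecule type is also just a guess.

```ruby
# Hypothetical sketch; none of these names exist in BioRuby.
module PhyloXMLSketch
  # Mirrors the <accession> element: source (e.g. "UniProtKB") and id.
  Accession = Struct.new(:source, :id)

  # Mirrors the phyloXML <sequence> element as closely as possible.
  class Sequence
    attr_accessor :type, :id_source, :id_ref, :symbol, :accession,
                  :name, :location, :mol_seq, :uri,
                  :annotations, :domain_architecture

    def initialize
      @annotations = [] # annotation can occur multiple times
    end

    # Returns a Hash keyed by the Bio::Sequence attribute names that the
    # list above maps to. Elements without a mapping are left out, and
    # unset values are dropped.
    def biosequence_attributes
      attrs = {
        molecule_type: na_molecule_type,
        id_namespace: id_source,
        entry_id: id_ref,
        primary_accession: accession && accession.id,
        seq: mol_seq
      }
      attrs.reject { |_key, value| value.nil? }
    end

    # Name of the BioRuby class that would hold mol_seq, or nil if the
    # type is not set or not recognised.
    def sequence_class_name
      case type
      when 'rna', 'dna' then 'Bio::Sequence::NA'
      when 'aa' then 'Bio::Sequence::AA'
      end
    end

    private

    # Assumption: the raw phyloXML type string is used as the molecule
    # type, and only for nucleic acids.
    def na_molecule_type
      %w[rna dna].include?(type) ? type : nil
    end
  end
end
```

With BioRuby loaded, the real extraction method could copy these values onto
a new Bio::Sequence object instead of returning a Hash. The elements without
an arrow (symbol, name, location, uri, annotation, domain_architecture) would
stay only on the phyloXML-specific class.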

Diana


