[Bioperl-l] SwissProt/UniProt GN line format changed

Ewan Birney birney at ebi.ac.uk
Sat Jul 31 14:05:41 EDT 2004



On Sat, 31 Jul 2004, Hilmar Lapp wrote:

> I added the capability to parse this and a test to the main trunk.
>
> The parser will write out the GN line(s) in the old format though. Does
> anybody have the requirement at this point that the parser write the
> new format?
>
> The problem with writing the new format is that it would require some
> additions to the object model, because it needs context for the
> individual annotation values ('Name', 'Synonym', 'OrderedLocusName', or
> 'ORFName'). Presently, annotation values do not have context in Bioperl.
>

Or rather, I think we would have to make a new Annotation type,
something like 'MultiTaggedValue', which would have tags 'Name', 'Synonym',
etc. The 'gene_name' annotation key would then give a list of
'MultiTaggedValues', presumably with some magic to detect "old style"
SimpleValue tags as well.
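
To make the idea concrete, here is a rough sketch in Python (not Bioperl
code) of what such a tagged value could look like. The class name
MultiTaggedValue comes from the paragraph above; the methods, the
parse_gn helper, and the rule that an old-style line maps its first name
to 'Name' and the rest to 'Synonyms' are all assumptions made for
illustration. Multi-gene lines are not handled.

```python
import re
from dataclasses import dataclass, field


@dataclass
class MultiTaggedValue:
    """Hypothetical container: one gene, with its values grouped by tag."""
    tags: dict = field(default_factory=dict)

    def add(self, tag, value):
        # Append a value under the given tag, keeping insertion order.
        self.tags.setdefault(tag, []).append(value)

    def get(self, tag):
        # Return all values stored under a tag (empty list if none).
        return self.tags.get(tag, [])


def parse_gn(content):
    """Parse the text following 'GN   ' in either style.

    New style: 'Name=RCHY1; Synonyms=ZNF363, CHIMP, ARNIP;'
    Old style: 'ZNF36 OR KOX18 OR ZNF139.'
    """
    gene = MultiTaggedValue()
    content = content.strip()
    if "=" in content:
        # New style: semicolon-separated 'Tag=v1, v2' groups.
        for part in content.split(";"):
            tag, sep, values = part.partition("=")
            if not sep:
                continue  # skip the empty piece after the last ';'
            for value in values.split(","):
                if value.strip():
                    gene.add(tag.strip(), value.strip())
    else:
        # Old style carries no tags. Assumption: treat the first name
        # as 'Name' and any further names as 'Synonyms'.
        names = re.split(r"\s+OR\s+", content.rstrip("."))
        for i, name in enumerate(n.strip() for n in names if n.strip()):
            gene.add("Name" if i == 0 else "Synonyms", name)
    return gene
```

With this, parse_gn("Name=RCHY1; Synonyms=ZNF363, CHIMP, ARNIP;").get("Synonyms")
gives the three synonyms as a list, and an old-style line goes through
the same interface.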




I might need this in the ensembl-bioperl bridge, so I'll keep this in
mind.


> 	-hilmar
>
> On Wednesday, July 21, 2004, at 05:12  PM, Boris Lenhard wrote:
>
> > I do not know if it has been discussed yet, but the GN (gene name) line
> > format in recent versions of SwissProt files has changed:
> >
> > e.g. old:
> >
> > GN   ZNF36 OR KOX18 OR ZNF139.
> >
> > new:
> >
> > GN   Name=RCHY1; Synonyms=ZNF363, CHIMP, ARNIP;
> >
> >
> > This renders Bio::SeqIO::swiss unable to parse the GN line; as a
> > consequence, the resulting annotation object lacks the 'gene_name' key.
> >
> > Boris
> >
> > --
> >
> > ==========================================
> >  Boris Lenhard, Ph.D.
> >  Group Leader, Applied Genome Informatics
> >  Center for Genomics and Bioinformatics
> >  Karolinska Institutet
> >  Berzelius väg 35, B326b
> >  171 77 Stockholm, SWEDEN
> >  Phone: +46 (0)8 5248 6391
> >  FAX: +46 (0)8 32 48 26
> >  E-mail: Boris.Lenhard at cgb.ki.se
> > ==========================================
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l at portal.open-bio.org
> > http://portal.open-bio.org/mailman/listinfo/bioperl-l
> >
> >
> --
> -------------------------------------------------------------
> Hilmar Lapp                            email: lapp at gnf.org
> GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
> -------------------------------------------------------------
>
>
>
>
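
For reference, the behaviour described in the thread (read the new GN
format, write the old one) can be sketched in a few lines of Python. This
is an illustration only, not the code committed to the trunk; the
function name gn_new_to_old is made up, and it assumes a single-gene GN
line like the one in Boris's example.

```python
def gn_new_to_old(line):
    """Flatten a new-style GN line into the old 'A OR B OR C.' form.

    The tags (Name, Synonyms, ...) are dropped, which is exactly the
    loss of context discussed in the thread.
    """
    # Strip the 'GN   ' line code if present.
    content = line[5:] if line.startswith("GN   ") else line
    names = []
    for part in content.strip().split(";"):
        _tag, sep, values = part.partition("=")
        if not sep:
            continue  # empty piece after the trailing ';'
        names.extend(v.strip() for v in values.split(",") if v.strip())
    if not names:
        raise ValueError("no gene names found in GN line: %r" % line)
    return "GN   " + " OR ".join(names) + "."


print(gn_new_to_old("GN   Name=RCHY1; Synonyms=ZNF363, CHIMP, ARNIP;"))
# prints: GN   RCHY1 OR ZNF363 OR CHIMP OR ARNIP.
```

Going the other way (old to new) is where the tag information would be
needed, since a flat list of names does not say which one is the name and
which are synonyms.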



