[Biopython-dev] [Bug 2547] Translation of ambiguous codons like NNN and TAN

bugzilla-daemon at portal.open-bio.org
Sun Jul 20 18:30:17 UTC 2008


http://bugzilla.open-bio.org/show_bug.cgi?id=2547


mmokrejs at ribosome.natur.cuni.cz changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mmokrejs at ribosome.natur.cuni.cz




------- Comment #1 from mmokrejs at ribosome.natur.cuni.cz  2008-07-20 14:30 EST -------
Regarding the selenocysteine issue, expect "inconsistencies" between data files
released by NCBI. I haven't checked recently, but in 2002 I had the following
communication with NCBI staff:

GenBank format requires the official IUPAC amino acid code, which doesn't
include selenocysteine, and therefore it uses 'X'.  FASTA format uses the NCBI
extended amino acid code, which does include selenocysteine as 'U'.

> >gi_2983532 formate dehydrogenase alpha subunit [Aquifex aeolicus]
> MNYMDISRRGFLKLSVGSVGAGILGGLGFDLTPAYARVRDLKITKAKVTKSICPYCSVSCGILAYSLSDG
> AMNVKERIIHVEGNPDDPINRGTLCPKGATLRDFVNAPDRLTKPLYRPAGSTEWKEISWDEAIEKFARWV
> KDTRDRTFIHKDKAGRVVNRCDSIVWAVGSPLGNEEGWLMVKIGIALGLSARETQATIUHAPTVASLAPT
>                                   ------------------------^
[cut]
> 
> It seems there's a buggy version in
> ftp://ftp.ncbi.nih.gov/genbank/genomes/Bacteria/Aquifex_aeolicus/AE000657.faa
> although the .gbk flatfile says "X" where the .faa file has "U".
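
To make the 'U' versus 'X' discrepancy concrete, here is a minimal Python
sketch (plain string handling, no Biopython calls) that compares a protein
sequence taken from a FASTA file with the translation from the matching
GenBank record. The function names are hypothetical, and the GenBank fragment
is only a stand-in built by swapping 'U' for 'X', not data copied from the
.gbk file.

```python
def selenocysteine_mismatches(fasta_seq, genbank_seq):
    """Return 0-based positions where FASTA has 'U' but GenBank has 'X'."""
    if len(fasta_seq) != len(genbank_seq):
        raise ValueError("sequences differ in length")
    return [
        i
        for i, (f, g) in enumerate(zip(fasta_seq, genbank_seq))
        if f == "U" and g == "X"
    ]


def same_protein(fasta_seq, genbank_seq):
    """True if the sequences agree once 'U' (FASTA) vs 'X' (GenBank) is tolerated."""
    if len(fasta_seq) != len(genbank_seq):
        return False
    return all(
        f == g or (f == "U" and g == "X")
        for f, g in zip(fasta_seq, genbank_seq)
    )


# Short fragment around the selenocysteine in the FASTA record quoted above.
fasta_fragment = "ETQATIUHAPTV"
# Assumption: stand-in for the GenBank translation, which reports 'X' there.
genbank_fragment = fasta_fragment.replace("U", "X")

print(selenocysteine_mismatches(fasta_fragment, genbank_fragment))
print(same_protein(fasta_fragment, genbank_fragment))
```

A strict string comparison would report these two records as different
proteins, so any cross-check between the .faa and .gbk files probably needs a
tolerance of this kind.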

