[Biopython] Is there any Biopython tool to degenerate a nucleotide sequence

Jeremy Jeremy.molbio at gmail.com
Fri Jul 31 05:39:00 UTC 2015


Carlos Pena <mycalesis <at> gmail.com> writes:

> 
> Dear Biopython members,
> 
> I want to take a nucleotide string and degenerate those bases that can
> undergo synonymous change.
> 
> For example, a string of just one codon.
> 
> * Input:  AAC
> * Output:  AAY
> 
> Since both AAC and AAT are translated to Asparagine (N) we can
> degenerate this codon to AAY (because the third position could produce a
> synonymous change).
> 
> This is already solved in the Perl library Degen
> http://www.phylotools.com/ptdegendocumentation.htm
> 
> I could use some glue to execute this Perl code from Python but
> I cannot include this library in my project because they are using the
> GPL license while I use BSD.
> 
> So I thought I would ask around before writing a Python script to do this
> for me.
> 
> thanks for any pointers,
> 
> carlos



Hi Carlos,

I hacked up something that should return the same output as the Degen 1.4
Perl web tool.

The gist can be found here:  

https://gist.github.com/biojerm/6242381eb4ad3ef18ac6

I am pretty new to both Python and Biopython, so please let me know if
you have any feedback on form, style, or function.

I know the method is currently quite fragile. Below are a few thoughts on
the method's weaknesses:

1) The method does not handle sequences whose length is not evenly
divisible by 3.

2) I think the method would be a lot more useful if you could call it on a
single FASTA or GenBank file, or on a set of them. But I have not learned
how to program that yet.

3) I probably should return the degenerate sequences as Seq objects, but at
the moment they are simple strings.

4) Tests... I need to figure those out too.
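
To make points 1 and 4 a bit more concrete, here is a minimal sketch of one
way the core step could look. It is not the code from the gist and it does
not reproduce the full Degen scheme: as a simplifying assumption it only
degenerates the third codon position, using the standard genetic code, and
it leaves first-position synonymous changes (as in Leu and Arg) alone. The
function name degenerate_third_position is made up for this example.
Trailing bases that do not fill a codon, and codons containing anything
other than A, C, G, T, are passed through unchanged.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC nucleotide ambiguity codes, keyed by the set of bases they stand for.
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
BASES_TO_CODE = {frozenset(bases): code for code, bases in IUPAC_CODES.items()}


def degenerate_third_position(seq):
    """Replace each codon's third base with the IUPAC code covering every
    third base that keeps the same amino acid (assumption: third position
    only, standard genetic code)."""
    seq = seq.upper()
    full_length = len(seq) - len(seq) % 3  # last index covered by whole codons
    pieces = []
    for start in range(0, full_length, 3):
        codon = seq[start:start + 3]
        amino_acid = CODON_TABLE.get(codon)
        if amino_acid is None:
            # Ambiguous or non-DNA codon: leave it as it is.
            pieces.append(codon)
            continue
        synonymous = frozenset(
            base for base in BASES if CODON_TABLE[codon[:2] + base] == amino_acid
        )
        pieces.append(codon[:2] + BASES_TO_CODE[synonymous])
    pieces.append(seq[full_length:])  # incomplete trailing codon, unchanged
    return "".join(pieces)


# A few checks that could be the start of a test suite.
assert degenerate_third_position("AAC") == "AAY"      # Asn: AAC or AAT
assert degenerate_third_position("ATGCTG") == "ATGCTN"  # Met is fixed, Leu CTN
assert degenerate_third_position("AACGG") == "AAYGG"  # trailing GG kept
```

The function works on plain strings, so a record read with Bio.SeqIO.parse
could be passed in as str(record.seq), and the result could be wrapped in
Bio.Seq.Seq if a Seq object is preferred.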


Please let me know if you find this useful, and whether there are any
must-have features for your purposes.

Thanks,
Jeremy




