[Biopython-dev] Additions to SeqUtils

Yair Benita Y.Benita at pharm.uu.nl
Tue Aug 5 03:20:02 EDT 2003


on 5/8/03 3:16, Mark Yeager at yeagerm at comcast.net wrote:

>> As promised a few days ago I submit code to be added to the SeqUtils module.
>> The modules include:
>> Codon adaptation index -> for DNA sequence
>> Protein analysis methods such as isoelectric point, molecular weight and
>> more. Take a look.
> 
> Hello Yair- I am just starting out (one week now) with learning Python and
> BioPython for small bioinformatics utilities. I am an old Fortran programmer
> so it is very new to me. I started to learn Perl and BioPerl but I could never
> make useful sense of code examples and I decided to go with Python instead.
> 
> To start with, I'd like to script adding additional info to flat file
> databases of proteins of interest. Your example of CAI would be a perfect
> starting point.
> 
> Is there a small example program to orient me to actually do something useful-
> to specify an accession number and lookup the sequence in a fasta file and
> then calculate the CAI? A working example I can play with but actually do
> something useful.
> 
> I am continuing to read through the tutorials but I have yet to make it to
> BioPython. There is probably something already there along these lines-
> perhaps you can point me to that?
> 
> Thanks very much for your contributions, best regards,
> 
> Mark Yeager

Hi Mark,
You can look in the test files for more examples. Here are a few lines that
fetch a gene from GenBank and compute its CAI.

from Bio.WWW import NCBI
from Bio import Fasta
import CodonUsage  # make sure the module is on your Python path

# fetch a gene from GenBank
aGene = NCBI.efetch('nucleotide', id='23113', seq_start=373,
                    seq_stop=1113, rettype='fasta')

# set up the fasta parser to read it.
parser = Fasta.RecordParser()
iterator = Fasta.Iterator(aGene, parser)
record = iterator.next()

# create an instance of CodonAdaptationIndex
aGeneCai = CodonUsage.CodonAdaptationIndex()

# print the gene in fasta format
print record

# print the CAI for the gene using the cai_for_gene method.
# Note that the default Sharp & Li E. coli index is used when
# you don't specify a different index.
# See test_CodonUsage for an example of building your own index.
print ("\nCodon adaptation index for the above gene: %.2f"
       % aGeneCai.cai_for_gene(record.sequence))
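
If you would rather start from a local fasta file than from GenBank, the
same idea can be sketched without any Biopython calls. This is only a rough
illustration: it is written for Python 3 with the standard library, the
helper names (read_fasta, cai), the two example records and the TOY_INDEX
weights are all made up for the example, and the weights are NOT the
Sharp & Li values. It treats the CAI as the geometric mean of the relative
adaptiveness (w) of each codon and, as a simplifying assumption, skips any
codon that is missing from the index.

```python
import math

# Hypothetical toy index: relative adaptiveness (w) per codon.
# These numbers are invented for illustration, NOT the Sharp & Li values.
TOY_INDEX = {"GCT": 1.0, "GCC": 0.5, "AAA": 1.0, "AAG": 0.25}


def read_fasta(text):
    """Return {identifier: sequence} parsed from FASTA-formatted text."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The identifier is the first word after '>'.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(chunks) for key, chunks in records.items()}


def cai(sequence, index):
    """Geometric mean of w over the codons of sequence found in index."""
    logs = []
    usable = len(sequence) - len(sequence) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = sequence[i:i + 3]
        if codon in index:  # assumption: silently skip codons not in the index
            logs.append(math.log(index[codon]))
    if not logs:
        raise ValueError("no codons from the index found in the sequence")
    return math.exp(sum(logs) / len(logs))


# Made-up records standing in for the contents of a fasta file.
EXAMPLE = """>gene1 made-up example
GCTGCC
AAAAAG
>gene2 another made-up example
GCTAAA
"""

records = read_fasta(EXAMPLE)
print("CAI for gene1: %.2f" % cai(records["gene1"], TOY_INDEX))
```

To use it on a real file, pass open(filename).read() to read_fasta and
look up the record by the identifier on its '>' line. For real work you
would want a proper index built from a reference set of highly expressed
genes rather than the toy one.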

-- 
Yair Benita
Pharmaceutical Proteomics
Faculty of Pharmacy
Utrecht University




