[Biopython-dev] get CDS

Brad Chapman chapmanb at arches.uga.edu
Thu Feb 15 13:33:05 EST 2001


Hi Thomas;

> Is there an easy way to retrieve the coordinates for all CDS's (not the
> translations) in a given DNA sequence ? 

At least in my mind, finding putative CDSs is a hard job, so I don't
know if I can think of any easy ways :-). For your NextOrf function,
are you thinking of finding an ORF within a cDNA, where you don't
have to worry about introns and the like, or are you focusing mostly
on bacterial sequences? Just curious; knowing that would help me
understand exactly what you are trying to do. Coming from a
eukaryotic world, I guess I just see the problem of finding CDSs as
"pretty damn hard."

> Actually it is for the nextorf script - I'd like to include the bad-orf
> retrieving function where the user can choose if the ORF should be limited
> by start and stop codons (for good sequences) or if she wants to retrieve
> the longest possible coding regions within stop codons (as usually found in
> sequencing projects with raw sequence data).
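
To make the two modes described above concrete, here is a minimal sketch
in plain Python. It is not the actual nextorf code: the names find_orfs,
first_start, require_start and min_len are made up for illustration, it
assumes the standard stop codons TAA/TAG/TGA and ATG as the only start,
and it scans the forward strand only. Coordinates are 0-based, half-open,
and exclude the stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code
START_CODON = "ATG"                  # assumption: ATG is the only start


def first_start(seq, begin, end):
    """Return the position of the first in-frame ATG in [begin, end), or None."""
    for pos in range(begin, end, 3):
        if seq[pos:pos + 3] == START_CODON:
            return pos
    return None


def find_orfs(seq, require_start=True, min_len=0):
    """Return sorted (start, end) pairs for the three forward frames.

    require_start=True  -> regions running from an ATG to the next stop
    require_start=False -> longest stop-free regions, including regions
                           that run off either end of the sequence
    """
    seq = seq.upper()
    found = []
    for frame in range(3):
        codon_starts = range(frame, len(seq) - 2, 3)
        region_start = frame  # first codon after the previous stop
        for pos in codon_starts:
            if seq[pos:pos + 3] not in STOP_CODONS:
                continue
            # A stop at pos closes the region [region_start, pos).
            start = region_start
            if require_start:
                start = first_start(seq, region_start, pos)
            if start is not None and pos - start >= max(min_len, 3):
                found.append((start, pos))
            region_start = pos + 3
        if not require_start:
            # Keep the trailing open region, up to the last complete codon.
            end = frame + 3 * len(codon_starts)
            if end - region_start >= max(min_len, 3):
                found.append((region_start, end))
    return sorted(found)


if __name__ == "__main__":
    raw = "CCATGAAATAGCC"
    print(find_orfs(raw))                       # strict: start to stop
    print(find_orfs(raw, require_start=False))  # relaxed: stop to stop
```

The reverse strand could be handled by running the same scan on the
reverse complement and mapping the coordinates back.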

In terms of locations, have you thought about using the new Location
model I put into CVS (it's in Bio/SeqFeature.py) to help deal with
GenBank and BioCorba locations? I'd love to get more feedback on it,
so people can decide whether it does a good job of handling
locations.
 
BTW, thanks for the new example code!

Brad



