[BioPython] Random sequence

pan at uchicago.edu
Wed Jun 16 21:58:31 EDT 2004


You can make a random sequence with one line of Python code:

>>> import random

>>> ''.join([random.choice('AGTC') for x in range(10)]) 
'GGTTTCGGTA'

>>> ''.join([random.choice('AGTC') for x in range(10)]) 
'GCGGGTCCGT'

>>> ''.join([random.choice('AGTC') for x in range(10)]) 
'AAAAGCACTG'

Isn't it beautiful? 
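
If you need this more than once, the one-liner wraps easily into a small function. What follows is only a sketch: the name random_dna and its parameters (length, alphabet, rng) are made up for illustration and are not part of Biopython.

```python
import random


def random_dna(length, alphabet="AGTC", rng=random):
    """Return `length` letters drawn uniformly at random from `alphabet`.

    `rng` is anything with a choice() method: the random module itself
    (the default) or a random.Random instance. This is a hypothetical
    helper, not a Biopython function.
    """
    return "".join(rng.choice(alphabet) for _ in range(length))


# Passing a seeded generator makes the result repeatable,
# which is handy when the sequence feeds a test.
seeded = random.Random(42)
print(random_dna(10, rng=seeded))

# A different alphabet works the same way, e.g. RNA.
print(random_dna(10, alphabet="AGUC"))
```

Taking the generator as an argument, rather than calling random.seed() globally, leaves the rest of the program's random state alone.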

pan

Quoting ashleigh smythe <absmythe at ucdavis.edu>:

> On Wed, 2004-06-16 at 07:45, Sebastian Bassi wrote:
> > Is there a way to generate a random DNA sequence with biopython?
> > If not, I could submit a function to do it, but before doing it, I'd 
> > want to see if it's not already done.
> 
> Hi Sebastian.  I wasn't able to find a random sequence generator in the
> biopython modules so I wrote a simple little one of my own a few months
> ago- it only uses biopython modules to add the sequence to a
> biopython-parsed file.  It is quite ugly and brute force as I'm a
> beginner - I'd be curious to see what you come up with.  In case you are
> curious, here it is:
> 
> #This is designed to generate random DNA sequence data and add
> #it to the end of a biopython-parsed sequence record
> #in fasta format.
> #Modified 2-20 to just make random seq. data for a taxon,
> #rather than adding it onto the existing sequence.
> 
> import random
> import string
>
>
> def generate(n):                  #generate the dna sequence of n length
>     bases=['A', 'T', 'G', 'C']
>     dna_in_list=[]
>
>     while n > 0:
>         abase=random.choice(bases)
>         dna_in_list.append(abase)
>         n=n-1
>
>     dnastring=str(dna_in_list)     #format the list into a string.  
>     better_dnastring=string.join(string.split(dnastring),"") #Take
>     better2_dnastring=string.strip(better_dnastring)         #out
>     better3_dnastring=better2_dnastring.replace(',','')      #unwanted
>     better4_dnastring=better3_dnastring.replace(']','')      #characters
>     better5_dnastring=better4_dnastring.replace('[','')
>     better6_dnastring=better5_dnastring.replace("'",'')
> 
>     return better6_dnastring
>    
>  
> def add_seq(file_to_add_to, n): #this is how to start
>     import sys                 #the program:seqgen.add_seq(file, n).
>     from Bio import Fasta                                     
>     parser=Fasta.RecordParser()
>     afile=open(file_to_add_to, 'r')
>     iterator=Fasta.Iterator(afile, parser)
>  
>     out_file=open('randomadded.nex', 'w')
>  
>     while 1:                   #loop through each record and add the new
>         seq_to_add=generate(n) #sequence
>         cur_record=iterator.next()
>         if cur_record is None:
>             break
>         title_and_seq=string.split(cur_record.title)
>         title='>' + title_and_seq[0] + '\n'
>         new_record=title + 'N' + seq_to_add
>         out_file.write(new_record)
>         out_file.write('\n')
> 
> 
> Ashleigh
> 
> 
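
For comparison, the quoted add_seq loop (take the first word of each FASTA title, write it back with a fresh random sequence prefixed by 'N') can be sketched with the standard library alone. The names randomize_fasta and random_dna and the parameters prefix and rng are assumptions made for this illustration. The sketch reads header lines directly instead of using the Bio.Fasta parser from the quoted code.

```python
import random


def random_dna(length, rng=random):
    """Return a random string of `length` bases (hypothetical helper)."""
    return "".join(rng.choice("ATGC") for _ in range(length))


def randomize_fasta(lines, n, prefix="N", rng=random):
    """Yield one FASTA record per header line found in `lines`.

    Each record keeps only the first word of the original title and
    carries `prefix` followed by `n` random bases. Existing sequence
    lines are skipped, so the original data is replaced, not extended.
    """
    for line in lines:
        if line.startswith(">"):
            words = line[1:].split()
            name = words[0] if words else ""
            yield ">" + name + "\n" + prefix + random_dna(n, rng) + "\n"


# Any iterable of lines works, including an open file object.
example = [">taxon1 some description\n", "ACGT\n", ">taxon2\n", "GGCC\n"]
print("".join(randomize_fasta(example, 8)), end="")
```

To mirror the quoted program, open the input file for reading, open 'randomadded.nex' for writing, and pass the result of randomize_fasta(input_file, n) to the output file's writelines().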



