[BioPython] Big GenBank files
aurelie.bornot at free.fr
Mon Apr 25 05:42:29 EDT 2005
Hi !
I am trying to write a program that automatically BLASTs a set of sequences against the GenBank sequences, and I would like to retrieve (also automatically) the most interesting GenBank files, so that I can keep information about them in my database.
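For the BLAST step, what I have in mind is roughly the following (only a sketch: blastn against the nr database is just my guess at sensible parameters, and 'query.fasta' / 'query_blast.xml' are placeholder file names):

from Bio.Blast import NCBIWWW

# Read one query sequence as plain FASTA text ('query.fasta' is a placeholder).
query = open('query.fasta').read()

# Run a remote BLAST at NCBI; blastn against nr is only my assumption
# about reasonable parameters.
result_handle = NCBIWWW.qblast('blastn', 'nr', query)

# Save the XML results so I can look at the hits later.
out = open('query_blast.xml', 'w')
out.write(result_handle.read())
out.close()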
But I've got a problem (again... sorry! :'( ):
I have 2*512 MB of RAM, but it seems that my computer can't deal with 'big' GenBank files like 'BA000028.3' (7 MB) or 'AP008212' (37 MB).
For example:
from Bio import GenBank

fichier = open('AP008212.fasta', "w")
record_parser = GenBank.RecordParser()
ncbi_dict = GenBank.NCBIDictionary('nucleotide', 'genbank', parser=record_parser)
gb_record = ncbi_dict['AP008212']   # this is the step that never finishes
fichier.write(str(gb_record))       # what I want to do with the record afterwards
fichier.close()
...never ends...
I suppose it is because the files are too big for the parsing step that transforms the downloaded data into a record (the registry machinery)...
For 'AP008212' (37 MB):
ncbi_dict = GenBank.NCBIDictionary('nucleotide', 'fasta')
doesn't work either...
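So I was wondering whether I should skip the record-building step completely and just fetch the raw GenBank file from NCBI, streaming it straight to disk. Something like this is what I had in mind (only a sketch, assuming the standard efetch URL parameters db/id/rettype):

import urllib.request

# Fetch the raw GenBank flat file for AP008212 from NCBI efetch and copy it
# to disk in small chunks, so the whole 37 MB never sits in memory at once.
url = ("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
       "?db=nucleotide&id=AP008212&rettype=gb&retmode=text")

response = urllib.request.urlopen(url)
out = open("AP008212.gb", "wb")
while True:
    chunk = response.read(64 * 1024)   # 64 kB at a time
    if not chunk:
        break
    out.write(chunk)
out.close()
response.close()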
I tried to understand how all this works so that I could look at the header of the connection (maybe there is a way to give up downloading these big files...), but I am not very used to Python or to anything concerning network connections...
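Ideally I would like to check the response headers first and give up if the record is too large, something like the sketch below (I am not sure the server always sends a Content-Length, so this is only the idea):

import urllib.request

# Same efetch URL as in the sketch above.
url = ("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
       "?db=nucleotide&id=AP008212&rettype=gb&retmode=text")

MAX_BYTES = 10 * 1024 * 1024   # arbitrary 10 MB threshold

response = urllib.request.urlopen(url)
size = response.headers.get("Content-Length")   # may be absent
if size is not None and int(size) > MAX_BYTES:
    print("Record is %s bytes, giving up the download" % size)
else:
    out = open("AP008212.gb", "wb")
    out.write(response.read())
    out.close()
response.close()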
I have been stuck on this problem for 3 days and I am lost...
I don't know what to do...
Could someone help me?!
Thanks !
Aurelie