[BioPython] script to extract records from nucleotide database

Christof Winter winter at biotec.tu-dresden.de
Tue Nov 13 20:25:45 UTC 2007


Matthew Abravanel wrote:
> Hi Christof,
> 
> I tried out the code you sent me just to see if it would work, but I get an
> AttributeError. Here is the traceback:
> 
> 
> Traceback (most recent call last):
>   File "./run", line 3, in ?
>     from Bio import GenBank
>   File "/usr/pkg/lib/python2.4/site-packages/Bio/GenBank/__init__.py", line 47, in ?
>   File "/usr/pkg/lib/python2.4/site-packages/Bio/GenBank/Scanner.py", line 20, in ?
>     from Bio.SeqRecord import SeqRecord
>   File "/usr/pkg/lib/python2.4/site-packages/Bio/SeqRecord.py", line 11, in ?
>   File "/usr/pkg/lib/python2.4/site-packages/Bio/FormatIO.py", line 55, in __init__
> AttributeError: 'module' object has no attribute 'formats'

Hi Matthew,

The import of the GenBank module itself fails, before any of the actual script
runs. Most likely your BioPython installation is broken. Could you try to
re-install it?

In a Python (2.4) shell, this should work without any error:

Python 2.4.4 (#2, Apr  5 2007, 20:11:18)
[GCC 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
 >>> import Bio
 >>> import Bio.GenBank
 >>>
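
If the plain imports fail, it can help to see which submodule breaks and which
files Python actually loads. The following is only a rough sketch, not part of
BioPython: check_import is a hypothetical helper name, and the list of module
names is an assumption based on the traceback above. It uses only the standard
library and is written for a current Python; it has not been tried on 2.4.

```python
import sys
import traceback


def check_import(module_name):
    """Try to import module_name and return (ok, detail).

    detail is the file the module was loaded from on success,
    or the formatted traceback on failure.
    """
    try:
        __import__(module_name)
    except Exception:
        return False, traceback.format_exc()
    module = sys.modules[module_name]
    return True, getattr(module, "__file__", "(built-in)")


# Module names taken from the traceback above (an assumption about
# where the installation is damaged, not a complete list).
for name in ("Bio", "Bio.SeqRecord", "Bio.GenBank"):
    ok, detail = check_import(name)
    if ok:
        print("%s: OK, loaded from %s" % (name, detail))
    else:
        print("%s: FAILED" % name)
        print(detail)
```

If the reported paths point to more than one site-packages directory, or the
first failure is always in the same file, that would suggest leftovers from an
older installation mixed with a newer one.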

HTH,
Christof

> 
> 
> Here is the code I have used:
> 
> 
> 
> #!/usr/pkg/bin/python2.4
> 
> from Bio import GenBank
> 
> 
> featureParser = GenBank.FeatureParser()
> ncbiDict = GenBank.NCBIDictionary("nucleotide", "genbank", parser=featureParser)
> 
> accessionNumbers=["BC063166", "NM_028459"]
> 
> 
> for accessionNo in accessionNumbers:
>     giList = GenBank.search_for(accessionNo)
>     for gi in giList:
>         record = ncbiDict[gi]
>         for feature in record.features:
>             if feature.type =="CDS":
>                 codingStart = feature.location._start.position
>                 codingEnd = feature.location._end.position
>                 completeSequence = record.seq.tostring()
>                 fiveUTRSequence = completeSequence[:codingStart]
>                 codingSequence = completeSequence[codingStart:codingEnd]
>                 threeUTRSequence = completeSequence[codingEnd:]
>             if feature.type=="gene":
>                 geneName=feature.qualifiers['gene'][0]
> 
>         print "Found",gi,geneName,len(completeSequence)
> 
> 
> I do not know whether this is due to a difference in the Python 2.4 version.
> Any help would be appreciated, thanks.
> 
> Matthew
> _______________________________________________
> BioPython mailing list  -  BioPython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython


