[Biopython] SeqIO parser error

Peter biopython at maubp.freeserve.co.uk
Fri Sep 25 15:49:01 UTC 2009


On Fri, Sep 25, 2009 at 4:25 PM, Michael S. Koeris
<michael.koeris at gmail.com> wrote:
> Hi,
>
> I'm just getting acquainted with Sequence Objects and records and so forth.
> I tried some very basic code from the tutorial and I get an error when I run
> this:
>
> from Bio import Entrez, SeqIO
>
> gi_list = ['224589821', '224514694', '164698032', '157812089', '157734174']
> gi_str = ",".join(gi_list)
> handle = Entrez.efetch(db="nuccore", id=gi_str, rettype="gb")
>
> records = SeqIO.parse(handle, "gb")
>
> for record in records:
>    print "%s, length %i, with %i features" \
>          %(record.name, len(record), len(record.features))
>
> Traceback (most recent call last):
> ...
>    feature_start = cur_feature.sub_features[0].location.start
> AttributeError: 'PositionGap' object has no attribute 'start'
>
> Any help is most appreciated.

Hi Mike,

You have found Bug 2745. Do you fancy testing the proposed fix?
http://bugzilla.open-bio.org/show_bug.cgi?id=2745

As a workaround, you can ask the NCBI for full GenBank records rather
than CONTIG records by passing rettype="gbwithparts" to Entrez.efetch
in place of rettype="gb". However, these are very large files (whole
chromosomes), so it might be better to download the whole human genome
via FTP instead.
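
For reference, here is a minimal sketch of that workaround. It is not
tested against the NCBI servers here. The helper names
build_efetch_kwargs and iter_full_records are made up for illustration,
retmode="text" is an extra argument not in your script, and the email
address is something you would supply yourself.

```python
def build_efetch_kwargs(gi_list):
    """Build the Entrez.efetch arguments for full GenBank records.

    Hypothetical helper: only rettype differs from the original script
    ("gbwithparts" instead of "gb"); retmode="text" is an added assumption.
    """
    return {
        "db": "nuccore",
        "id": ",".join(gi_list),
        "rettype": "gbwithparts",
        "retmode": "text",
    }


def iter_full_records(gi_list, email):
    """Yield SeqRecord objects for the given GI numbers (needs network access)."""
    # Imported here so build_efetch_kwargs works without Biopython installed.
    from Bio import Entrez, SeqIO

    Entrez.email = email  # lets the NCBI contact you about heavy usage
    handle = Entrez.efetch(**build_efetch_kwargs(gi_list))
    try:
        for record in SeqIO.parse(handle, "gb"):
            yield record
    finally:
        handle.close()
```

You would then loop over iter_full_records(gi_list, "you@example.org")
just as you loop over records in your script. Bear in mind that each
record is a whole chromosome, so the download will be slow and memory
use will be high.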

Peter



