[Biopython] How to read certain GEO files with Bio.Geo?

Sean Davis sdavis2 at mail.nih.gov
Thu Nov 14 21:06:25 UTC 2013


On Thu, Nov 14, 2013 at 3:27 PM, Ilya Flyamer <flyamer at gmail.com> wrote:

> Hello everyone!
>
> I recently posted a question on Stack Overflow
> (http://stackoverflow.com/questions/19961582/how-to-read-certain-geo-files-with-bio-geo),
> but I am not getting any answers there.
>
> I have a problem parsing a particular GEO file (accession number GSE40603).
> I do it according to the tutorial in this way:
>
> from Bio import Geo
> handle = open('GSE40603_combined_L1_L2.txt')
>

This file is a so-called "supplemental file" from GEO. It was supplied by
the original submitter in the submitter's own format, so tools that read
GEO formats (such as Bio.Geo) will not work with it. In this particular
case (NGS data), your best bet is to simply parse the downloaded file
with standard Python tools.
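
A minimal sketch of that approach, assuming the file is tab-delimited
with one record per line (an assumption based on the head of the file
quoted below, not on anything GEO documents). The helper name read_rows
is hypothetical, the column meanings are not known here, and the two
sample lines are shortened from the quoted head so the example runs
without the real file:

```python
import csv
import io


def read_rows(handle, delimiter="\t"):
    """Yield each non-empty line of a delimited text file as a list of strings.

    Assumption: columns are separated by tabs; pass a different
    delimiter if the real file turns out to use something else.
    """
    reader = csv.reader(handle, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    for row in reader:
        if row:  # skip blank lines
            yield row


# Two shortened sample lines standing in for the real file (Python 3).
sample = io.StringIO(
    "0\t0\t63\tNC_000913\t0\t152\tNC_000913\t0\t152\t|neigh_up NC_000913-start\n"
    "0\t1\t81\tNC_000913\t0\t152\tNC_000913\t153\t599\t|gene gene= thrL\n"
)
rows = list(read_rows(sample))
for row in rows:
    print(row[:9])  # the leading numeric and accession columns

# With the downloaded file, the same helper would be used like this:
# with open("GSE40603_combined_L1_L2.txt", newline="") as handle:
#     for row in read_rows(handle):
#         ...
```

Every field comes back as a string, so numeric columns still need an
explicit int() once you know which columns you care about.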

Sean


> records = Geo.parse(handle)
> for record in records:
>     print record
>
> But I get an error:
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/lib/python2.7/dist-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 585, in runfile
>     execfile(filename, namespace)
>   File "/home/ilya/Документы/biology/E coli GCC/GEOanalyzer.py", line 11, in <module>
>     for record in records:
>   File "/usr/local/lib/python2.7/dist-packages/Bio/Geo/__init__.py", line 60, in parse
>     record.table_rows.append(row)
> AttributeError: 'NoneType' object has no attribute 'table_rows'
>
> Here is the head of that file:
>
> 0   0   63  NC_000913   0   152 NC_000913   0   152 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL
> 0   1   81  NC_000913   0   152 NC_000913   153 599 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL |gene gene= thrL  |CDS(+,190,255) gene= thrL  |gene gene= thrA |CDS(+,337,2799) gene= thrA  note= bifunctional: aspartokinase I (N-terminal);
> 0   2   1   NC_000913   0   152 NC_000913   600 698 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL    |gene gene= thrA  |CDS[fcd=-312](+,337,2799) gene= thrA  note= bifunctional: aspartokinase I (N-terminal);
> 0   3   1   NC_000913   0   152 NC_000913   699 755 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL    |gene gene= thrA |CDS[fcd=-390](+,337,2799) gene= thrA  note= bifunctional: aspartokinase I (N-terminal);
> 0   4   1   NC_000913   0   152 NC_000913   756 757 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL    |gene gene= thrA |CDS[fcd=-419](+,337,2799) gene= thrA  note= bifunctional: aspartokinase I (N-terminal);
> 0   2620    1   NC_000913   0   152 NC_000913   352429  352483  |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL    |gene gene= prpE |CDS[fcd=-526](+,351930,353816) gene= prpE  note= putative propionyl-CoA synthetase
> 0   18818   1   NC_000913   0   152 NC_000913   2560323 2560384 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL    |misc_feature note= cryptic prophage Eut/CPZ-55  |gene gene= yffO |CDS[fcd=-220](+,2560133,2560549) gene= yffO
> 0   2617    1   NC_000913   0   152 NC_000913   352326  352375  |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL |gene gene= prpE  |CDS[fcd=-420](+,351930,353816) gene= prpE  note= putative propionyl-CoA synthetase
> 0   18817   1   NC_000913   0   152 NC_000913   2560275 2560322 |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL    |misc_feature note= cryptic prophage Eut/CPZ-55  |gene gene= yffO |CDS[fcd=-165](+,2560133,2560549) gene= yffO
> 0   912 1   NC_000913   0   152 NC_000913   113055  113082  |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL    |gene gene= coaE |CDS[fcd=151](-,112599,113219) gene= coaE  note= putative DNA repair protein
>
> Am I doing something wrong? How do I read such files?
>
> Thank you in advance!
> Best,
>
> Ilya Flyamer
>
> _______________________________________________
> Biopython mailing list  -  Biopython at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/biopython
>



