[Biopython] How to read certain GEO files with Bio.Geo?

Ilya Flyamer flyamer at gmail.com
Fri Nov 15 17:20:10 UTC 2013


Thank you, Sean!

This is very helpful!

Best wishes,
Ilya


2013/11/15 Sean Davis <sdavis2 at mail.nih.gov>

>
>
>
> On Thu, Nov 14, 2013 at 3:27 PM, Ilya Flyamer <flyamer at gmail.com> wrote:
>
>> Hello everyone!
>>
>> I have just recently posted a question on Stack Overflow here
>> (http://stackoverflow.com/questions/19961582/how-to-read-certain-geo-files-with-bio-geo),
>> but I am not getting any answers there.
>>
>> I have a problem parsing a particular GEO file (accession number
>> GSE40603).
>> I do it according to the tutorial in this way:
>>
>> from Bio import Geo
>> handle = open('GSE40603_combined_L1_L2.txt')
>>
>
> This file is a so-called "supplemental file" from GEO. It was supplied by
> the original submitter, so tools to read GEO formats will not work with it.
> In this particular case (NGS data), your best bet is to simply parse your
> downloaded file with standard python tools.
>
> Sean
>
>
>> records = Geo.parse(handle)
>> for record in records:
>>     print record
>>
>> But I get an error:
>>
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>>   File "/usr/lib/python2.7/dist-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 585, in runfile
>>     execfile(filename, namespace)
>>   File "/home/ilya/Документы/biology/E coli GCC/GEOanalyzer.py", line 11, in <module>
>>     for record in records:
>>   File "/usr/local/lib/python2.7/dist-packages/Bio/Geo/__init__.py", line 60, in parse
>>     record.table_rows.append(row)
>> AttributeError: 'NoneType' object has no attribute 'table_rows'
>>
>> Here is the head of that file:
>>
>> 0   0   63  NC_000913   0   152 NC_000913   0   152 |neigh_up
>> NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL
>> |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene=
>> thrL  0   1   81  NC_000913   0   152 NC_000913   153 599 |neigh_up
>> NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL
>> |gene gene= thrL  |CDS(+,190,255) gene= thrL  |gene gene= thrA
>> |CDS(+,337,2799) gene= thrA  note= bifunctional: aspartokinase I
>> (N-terminal); 0   2   1   NC_000913   0   152 NC_000913   600 698
>> |neigh_up NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene=
>> thrL    |gene gene= thrA  |CDS[fcd=-312](+,337,2799) gene= thrA  note=
>> bifunctional: aspartokinase I (N-terminal); 0   3   1   NC_000913   0
>>  152 NC_000913   699 755 |neigh_up NC_000913-start |neigh_down
>> CDS[fcd=114](+,190,255) gene= thrL    |gene gene= thrA
>> |CDS[fcd=-390](+,337,2799) gene= thrA  note= bifunctional:
>> aspartokinase I (N-terminal); 0   4   1   NC_000913   0   152
>> NC_000913   756 757 |neigh_up NC_000913-start |neigh_down
>> CDS[fcd=114](+,190,255) gene= thrL    |gene gene= thrA
>> |CDS[fcd=-419](+,337,2799) gene= thrA  note= bifunctional:
>> aspartokinase I (N-terminal); 0   2620    1   NC_000913   0   152
>> NC_000913   352429  352483  |neigh_up NC_000913-start |neigh_down
>> CDS[fcd=114](+,190,255) gene= thrL    |gene gene= prpE
>> |CDS[fcd=-526](+,351930,353816) gene= prpE  note= putative
>> propionyl-CoA synthetase  0   18818   1   NC_000913   0   152
>> NC_000913   2560323 2560384 |neigh_up NC_000913-start |neigh_down
>> CDS[fcd=114](+,190,255) gene= thrL    |misc_feature note= cryptic
>> prophage Eut/CPZ-55  |gene gene= yffO
>> |CDS[fcd=-220](+,2560133,2560549) gene= yffO  0   2617    1
>> NC_000913   0   152 NC_000913   352326  352375  |neigh_up
>> NC_000913-start |neigh_down CDS[fcd=114](+,190,255) gene= thrL
>> |gene gene= prpE  |CDS[fcd=-420](+,351930,353816) gene= prpE  note=
>> putative propionyl-CoA synthetase  0   18817   1   NC_000913   0   152
>> NC_000913   2560275 2560322 |neigh_up NC_000913-start |neigh_down
>> CDS[fcd=114](+,190,255) gene= thrL    |misc_feature note= cryptic
>> prophage Eut/CPZ-55  |gene gene= yffO
>> |CDS[fcd=-165](+,2560133,2560549) gene= yffO  0   912 1   NC_000913
>> 0   152 NC_000913   113055  113082  |neigh_up NC_000913-start
>> |neigh_down CDS[fcd=114](+,190,255) gene= thrL    |gene gene= coaE
>> |CDS[fcd=151](-,112599,113219) gene= coaE  note= putative DNA repair
>> protein
>>
>> Am I doing something wrong? How do I read such files?
>>
>> Thank you in advance!
>> Best,
>>
>> Ilya Flyamer
>>
>> _______________________________________________
>> Biopython mailing list  -  Biopython at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/biopython
>>
>
>
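
Sean's suggestion above, parsing the supplemental file with standard Python
tools, might look roughly like the sketch below. It is only an illustration
and rests on assumptions not confirmed in the thread: that the file is
tab-delimited (the wide gaps in the pasted head suggest tabs) and that each
record sits on one line. The helper name read_rows is made up for this
example, and the two sample lines reuse only the first nine fields from the
head of the file shown above.

```python
import csv
import io


def read_rows(handle, delimiter="\t"):
    """Yield each non-empty line of a delimited text file as a list of strings.

    Assumption: fields are separated by tabs and are not quoted.
    """
    reader = csv.reader(handle, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    for row in reader:
        if row:  # csv.reader returns [] for blank lines; skip them
            yield row


# Stand-in for the real file: the first nine fields of the first two
# records from the pasted head, joined with tabs (assumed delimiter).
sample = io.StringIO(
    "0\t0\t63\tNC_000913\t0\t152\tNC_000913\t0\t152\n"
    "\n"
    "0\t1\t81\tNC_000913\t0\t152\tNC_000913\t153\t599\n"
)

rows = list(read_rows(sample))
for row in rows:
    # Fields are plain strings; convert to int where numbers are needed.
    print(row[3], int(row[4]), int(row[5]), row[6], int(row[7]), int(row[8]))
```

For the actual download, the StringIO object would be replaced by something
like open('GSE40603_combined_L1_L2.txt', newline=''). What each column means
is not stated in the thread, so the indices used above are placeholders
rather than a description of the file's layout.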



