[Bioperl-l] parsing entrezgene file (lost data)

Carnë Draug carandraug+dev at gmail.com
Tue Jul 5 22:38:08 UTC 2011


Well, I updated the bug report with what you found, thank you.

2011/7/5 Smithies, Russell <Russell.Smithies at agresearch.co.nz>:
> Bio::ASN1::EntrezGene is not the easiest to work with but you can access everything if you try hard enough.
> I used it last year for transforming ASN.1 gene records from NCBI into fully annotated Wiki pages and it was very successful, though I got sick of typing so many curly brackets ;-)

You mean I should access the data "manually" rather than using
methods? That will have to do for now, although it's kind of the
opposite of what objects are meant for (I think; I'm no programmer).
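
To make "manually" concrete, here is a minimal sketch of what I understand by it: walking the nested arrays and hashes directly instead of calling accessor methods. The record below is a hand-built stand-in, and the key names (locus, products, type, accession) and the accessions are assumptions for illustration only; the real layout should be checked by dumping an actual parsed record with Data::Dumper.

```perl
use strict;
use warnings;

# Hypothetical stand-in for one parsed record. The key names and the
# accessions are made up for illustration; they are NOT taken from a
# real Bio::ASN1::EntrezGene result.
my $record = {
    locus => [
        {
            products => [
                {
                    type      => 'mRNA',
                    accession => 'NM_EXAMPLE1',
                    products  => [
                        { type => 'peptide', accession => 'NP_EXAMPLE1' },
                    ],
                },
            ],
        },
    ],
};

# "Manual" access: dereference the nested arrays and hashes by hand and
# collect [ transcript, protein, ... ] lists for every mRNA product.
sub transcripts_and_proteins {
    my ($rec) = @_;
    my @pairs;
    for my $locus (@{ $rec->{locus} || [] }) {
        for my $rna (@{ $locus->{products} || [] }) {
            next unless ($rna->{type} // '') eq 'mRNA';
            my @prot = map  { $_->{accession} }
                       grep { ($_->{type} // '') eq 'peptide' }
                       @{ $rna->{products} || [] };
            push @pairs, [ $rna->{accession}, @prot ];
        }
    }
    return @pairs;
}

for my $pair (transcripts_and_proteins($record)) {
    my ($rna, @prot) = @$pair;
    print "$rna -> @prot\n";
}
```

Keeping all the dereferencing inside one small function at least means only that function has to change once proper methods are available.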

My plan is to make an application that other people can reuse, so I'm
trying to do it in a nice, maintainable way without too many hacks.
That's also why I can't just parse the gene2refseq file.

Since what I want is to get the transcripts and proteins given a gene
UID, I can see two options.
  1 - parse the ASN.1 file and access the data 'manually' until this is
fixed (and then change the code to use the methods)
  2 - use elink from EUtilities. But since it fails around half the
time, I'd have to check first whether it's a pseudogene. If it's not, it
should link to at least one record in the nucleotide database, so I'd
keep retrying the connection in an eval block until an ID is returned.
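
The retry idea in option 2 could look roughly like the sketch below. It is only an outline under stated assumptions: fetch_with_retry and $flaky_elink are hypothetical names, and the code ref stands in for the real elink request, which is not shown here. The number of attempts is capped so that a gene with no nucleotide link at all cannot cause an endless loop.

```perl
use strict;
use warnings;

# Call $fetch up to $max_tries times inside eval and return the first
# non-empty list of ids. $fetch is any code ref that returns ids or dies.
sub fetch_with_retry {
    my ($fetch, $max_tries, $delay) = @_;
    for my $try (1 .. $max_tries) {
        my @ids = eval { $fetch->() };
        return @ids if !$@ && @ids;
        sleep $delay if $delay && $try < $max_tries;
    }
    return;    # nothing came back; the caller decides what to do
}

# Hypothetical stand-in for the elink request: fails twice, then succeeds.
my $calls = 0;
my $flaky_elink = sub {
    die "connection failed\n" if ++$calls < 3;
    return ('EXAMPLE_ID');
};

my @ids = fetch_with_retry($flaky_elink, 5, 0);
print @ids ? "linked ids: @ids\n" : "no link found\n";
```

In a real script the delay should be non-zero so the server is not hit in a tight loop.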

I think I'll go for the first option but opinions are welcome.

Carnë



