[Bioperl-l] parsing entrezgene file (lost data)

Chris Fields cjfields at illinois.edu
Tue Jul 5 14:57:48 UTC 2011


Carnë,

Using the latest Bio::ASN1::EntrezGene parser and bioperl-live from GitHub, and changing the dumper module to Data::Dumper, I can more or less reproduce this. I see the accession only as part of a URL, although it appears in several places in the ASN.1 output. The Bio::Seq is otherwise populated with data, which makes me think these tags are either not parsed for some reason or not caught by the SeqIO parser.

Can you file this as a bug and attach the problematic EntrezGene data? I'm not sure whether the problem is in the ASN.1 parser itself or in the BioPerl SeqIO parser.
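
For anyone who wants to repeat the check, here is a minimal sketch, not a diagnosis. It uses Data::Dumper, which exports Dumper() (Data::Dump does not). It also looks at the two extra values that, going by the synopsis in the Bio::SeqIO::entrezgene POD, next_seq returns in list context. The helper name dump_mentions is made up for illustration, and the accessions are the two from the original report.

```perl
use strict;
use warnings;
use Data::Dumper;    # exports Dumper(); Data::Dump exports dd/pp/dump instead

# Hypothetical helper: true if the dumped form of $data contains $accession.
sub dump_mentions {
    my ($data, $accession) = @_;
    local $Data::Dumper::Sortkeys = 1;    # stable output, easier to diff
    return index(Dumper($data), $accession) >= 0 ? 1 : 0;
}

if (@ARGV) {
    # Loaded lazily so the helper above can be used without BioPerl installed.
    require Bio::SeqIO;

    my $seqio = Bio::SeqIO->new(-file => $ARGV[0], -format => 'entrezgene');

    # Assumption (from the module's POD synopsis): list context yields the
    # gene, a gene-structure object, and an array ref of uncaptured data.
    my ($gene, $genestructure, $uncaptured) = $seqio->next_seq;

    my %parts = (
        gene          => $gene,
        genestructure => $genestructure,
        uncaptured    => $uncaptured,
    );

    for my $acc (qw(NM_002105 NP_002096)) {
        for my $name (sort keys %parts) {
            printf "%-10s %-14s %s\n", $acc, $name,
                dump_mentions($parts{$name}, $acc) ? 'found' : 'missing';
        }
    }
}
```

Run it with the EntrezGene file as the only argument. If an accession shows up under genestructure or uncaptured but not under gene, that would suggest the data is parsed but stored outside the Bio::Seq object.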

chris

On Jul 3, 2011, at 9:38 PM, Carnë Draug wrote:

> Hi
> 
> I've been trying to get some data from an ASN.1 entrezgene file.
> However, I can't seem to access some of the data in the file.  I've
> read the Feature-annotations page on the wiki (even fixed a bug in
> there) but still nothing. So I used Data::Dumper to look at the Seq
> and Annotation objects and couldn't see it in there at all, although
> it's in the original file (attached).
> 
> The data I want from the sequence are the IDs "NM_002105" and
> "NP_002096", which show up several times in the file. However, when I
> do this:
> 
> use Data::Dump;
> use Bio::SeqIO;
> my $file = $ARGV[0];
> my $seqio_object = Bio::SeqIO->new(-file => $file, -format => 'entrezgene');
> my $seq_object = $seqio_object->next_seq;
> print Dumper($seq_object);
> 
> I can't find 002105 or 002096 anywhere in the output.
> 
> Am I doing something wrong? How can I solve this?
> 
> Thanks in advance,
> Carnë Draug
> <entrezgene>




