[Bioperl-l] SeqIO issue? EUtilities Cookbook
Phillip San Miguel
pmiguel at purdue.edu
Fri Mar 26 15:52:17 UTC 2010
Could someone tell me what I am doing wrong? This seems simple, but I
have not been able to get it to work.
I am trying to use the code provided at:
http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook#Retrieve_raw_data_records_from_GenBank.2C_save_raw_data_to_file.2C_then_parse_via_Bio::SeqIO
and modified to request GI 228534658.
Bio::DB::EUtilities downloads a record from GenBank, and Bio::SeqIO appears
to parse it, but it does not seem to return anything.
Nothing is printed when I run the following script on a Solaris box
running Perl 5.10.0 and BioPerl 1.6.1:
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;
use Bio::DB::EUtilities;
my @ids;
push @ids, '228534658';
my $factory = Bio::DB::EUtilities->new(
    -eutil   => 'efetch',
    -db      => 'nucleotide',
    -rettype => 'genbank',
    -id      => \@ids);
my $file = 'myseqs.gb';
# dump HTTP::Response content to a file (not retained in memory)
$factory->get_Response(-file => $file);
my $seqin = Bio::SeqIO->new(-file   => $file,
                            -format => 'genbank');
while (my $seq = $seqin->next_seq) {
    print "I see a sequence\n";
    print $seq->species();
}
"myseqs.gb" does have content:
Seq-entry ::= seq {
id {
general {
db "gpid:36555" ,
tag
str "contig49313" } ,
genbank {
accession "EZ113652" ,
version 1 } ,
gi 228534658 } ,
descr {
title "TSA: Zea mays contig49313, mRNA sequence." ,
source {
genome genomic ,
org {
taxname "Zea mays" ,
db {
{
db "taxon" ,
tag
id 4577 } } ,
orgname {
name
binomial {
genus "Zea" ,
species "mays" } ,
lineage "Eukaryota; Viridiplantae; Streptophyta; Embryophyta;
Tracheophyta; Spermatophyta; Magnoliophyta; Liliopsida; Poales; Poaceae;
PACCAD clade; Panicoideae; Andropogoneae; Zea" ,
gcode 1 ,
mgcode 1 ,
div "PLN" } } } ,
molinfo {
biomol mRNA ,
tech tsa } ,
pub {
pub {
article {
title {
name "Deep sampling of the Palomero maize transcriptome by a high throughput strategy of pyrosequencing." } ,
authors {
names
std {
{
name
name {
last "Vega-Arreguin" ,
initials "J.C." } } ,
{
name
name {
last "Ibarra-Laclette" ,
initials "E." } } ,
{
name
name {
last "Jimenez-Moraila" ,
initials "B." } } ,
{
name
name {
last "Martinez" ,
initials "O." } } ,
{
name
name {
last "Vielle-Calzada" ,
initials "J.P." } } ,
{
name
name {
last "Herrera-Estrella" ,
initials "L." } } ,
{
name
name {
last "Herrera-Estrella" ,
initials "A." } } } } ,
from
journal {
title {
iso-jta "BMC Genomics" ,
ml-jta "BMC Genomics" ,
issn "1471-2164" ,
name "BMC genomics" } ,
imp {
date
std {
year 2009 ,
month 7 ,
day 6 } ,
volume "10" ,
issue "1" ,
pages "299" ,
language "ENG" ,
pubstatus aheadofprint ,
history {
{
pubstatus received ,
date
std {
year 2008 ,
month 12 ,
day 2 } } ,
{
pubstatus accepted ,
date
std {
year 2009 ,
month 7 ,
day 6 } } ,
{
pubstatus aheadofprint ,
date
std {
year 2009 ,
month 7 ,
day 6 } } ,
{
pubstatus other ,
date
std {
year 2009 ,
month 7 ,
day 8 ,
hour 9 ,
minute 0 } } ,
{
pubstatus pubmed ,
date
std {
year 2009 ,
month 7 ,
day 8 ,
hour 9 ,
minute 0 } } ,
{
pubstatus medline ,
date
std {
year 2009 ,
month 7 ,
day 8 ,
hour 9 ,
minute 0 } } } } } ,
ids {
pii "1471-2164-10-299" ,
doi "10.1186/1471-2164-10-299" ,
pubmed 19580677 } } ,
pmid 19580677 } } ,
pub {
pub {
sub {
authors {
names
std {
{
name
name {
last "Vega-Arreguin" ,
first "Julio" ,
initials "J.C." } } ,
{
name
name {
last "Ibarra-Laclette" ,
first "Enrique" ,
initials "E." } } ,
{
name
name {
last "Jimenez-Moraila" ,
first "Beatriz" ,
initials "B." } } ,
{
name
name {
last "Martinez" ,
first "Octavio" ,
initials "O." } } ,
{
name
name {
last "Vielle-Calzada" ,
first "Jean" ,
initials "J.Philippe." } } ,
{
name
name {
last "Herrera-Estrella" ,
first "Luis" ,
initials "L." } } ,
{
name
name {
last "Herrera-Estrella" ,
first "Alfredo" ,
initials "A." } } } ,
affil
std {
affil "Laboratorio Nacional de Genomica para la Biodiversidad" ,
div "Cinvestav Campus Guanajuato" ,
city "Irapuato" ,
sub "Guanajuato" ,
country "Mexico" ,
street "Km 9.6 Libramiento Norte, Carretera Irapuato-Leon" ,
postal-code "36821" } } ,
medium other ,
date
std {
year 2009 ,
month 3 ,
day 23 } } } } ,
user {
type
str "GenomeProjectsDB" ,
data {
{
label
str "ProjectID" ,
data
int 36555 } ,
{
label
str "ParentID" ,
data
int 0 } } } ,
create-date
std {
year 2009 ,
month 5 ,
day 5 } ,
update-date
std {
year 2009 ,
month 7 ,
day 14 } } ,
inst {
repr raw ,
mol rna ,
length 450 ,
seq-data
ncbi2na
'77499DA7905DD417DCB7F1D538536238E08229108D89A87E2CDA6282DA3AD02
0524AE9C0D4154576794E0420BFA8E351A9ED347A504D3B6FE927E94E475EB17A52427227B820A
A21086117F7597EFB837ED2FB463AEF9F9E774052FD00FA0C1C803A521131212AFFB00D11CDD63
760CFF0'H } }
Maybe I am requesting the wrong format? This looks more like ASN.1 than
GenBank flat-file format to me.
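One way to confirm that suspicion before handing the file to Bio::SeqIO is to
look at its first non-blank line: a GenBank flat file starts with a LOCUS
line, whereas the file I received starts with "Seq-entry ::=". Below is a
minimal sketch in core Perl. Note that guess_format is a hypothetical helper
written for this check, not a BioPerl function, and the two patterns are
assumptions based only on those two first lines.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): guess a file's format
# from its first non-blank line.
sub guess_format {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    while (my $line = <$fh>) {
        next unless $line =~ /\S/;    # skip blank lines
        # ASN.1 text dumps open with a type assignment, e.g. "Seq-entry ::= seq {"
        return 'asn1'    if $line =~ /^\s*[\w-]+\s*::=/;
        # GenBank flat files open with a LOCUS header line
        return 'genbank' if $line =~ /^LOCUS\s/;
        return 'unknown';
    }
    return 'empty';                   # no non-blank line found
}

# Check the file named on the command line, defaulting to the one from the script.
my $file = shift @ARGV // 'myseqs.gb';
if (-e $file) {
    print "$file looks like: ", guess_format($file), "\n";
}
```

If this reports asn1, the genbank parser would have no LOCUS line to start
from, which might explain why next_seq returns nothing. NCBI's efetch
documentation lists 'gb' as the rettype for GenBank flat files, so that value
may be worth trying in place of 'genbank'; I have not verified that here.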
Phillip