[Bioperl-l] problem with batch access to GenBank

Brian Osborne brian_osborne at cognia.com
Tue Jun 3 14:44:24 EDT 2003


Anne-Marie,

> So I wonder if there might be some obscure difference in implementation
> between "get_Seq_by_acc" and "get_Stream_by_acc".

Without seeing your code it's hard to know for certain, but you may be doing
something odd with your stream. Note that the get_Stream_* methods return a
SeqIO object rather than a Seq object. From the bptutorial:

  $gb = new Bio::DB::GenBank();
  # this returns a Seq object :
  $seq1 = $gb->get_Seq_by_id('MUSIGHBA1');
  # this returns a Seq object :
  $seq2 = $gb->get_Seq_by_acc('AF303112');
  # this returns a SeqIO object :
  $seqio = $gb->get_Stream_by_id(["J00522","AF303112","2981014"]);

If this is not helpful then you might consider showing us the offending
code.
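
For comparison, here is a minimal sketch of how a batch request is usually
read, assuming BioPerl is installed and NCBI is reachable. The helper names
summarize_stream and fetch_batch are made up for this illustration, and the
array reference passed to get_Stream_by_acc simply mirrors the bracketed
list in the tutorial call above. It is not a diagnosis of your script, only
something to check it against:

```perl
use strict;
use warnings;

# Walk any stream object that offers next_seq() (for example a Bio::SeqIO)
# and return one summary line per sequence. (Hypothetical helper name.)
sub summarize_stream {
    my ($seqio) = @_;
    my @lines;
    while ( my $seq = $seqio->next_seq ) {
        push @lines, sprintf "%s\t%d bp", $seq->accession_number, $seq->length;
    }
    return @lines;
}

# Fetch a batch of accessions from GenBank. Needs BioPerl and network access.
# (Hypothetical helper name.)
sub fetch_batch {
    my (@accs) = @_;
    require Bio::DB::GenBank;
    my $gb = Bio::DB::GenBank->new();
    # Assumption: pass an array reference, as the tutorial does for
    # get_Stream_by_id; the result is a stream, not a list of Seq objects.
    my $seqio = $gb->get_Stream_by_acc( \@accs );
    return summarize_stream($seqio);
}

# Only contact NCBI when accessions are given on the command line.
if (@ARGV) {
    my @lines = fetch_batch(@ARGV);
    warn "No sequences came back for: @ARGV\n" unless @lines;
    print "$_\n" for @lines;
}
```

Saved under any file name and run with a few accession numbers as arguments
(say J00522 and AF303112), it should print one line per sequence. If the
warning appears instead, the stream really was empty, which narrows the
problem down to the request rather than the loop.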

Brian O.


-----Original Message-----
From: bioperl-l-bounces at portal.open-bio.org
[mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Anne-Marie Ternes
Sent: Tuesday, June 03, 2003 1:11 PM
To: bioperl-l at bioperl.org
Cc: Anne-Marie Ternes
Subject: [Bioperl-l] problem with batch access to GenBank

Hello,

I have a problem submitting a batch of gene accession numbers to
GenBank.

I simply use the Bio::DB::GenBank package to create a new GenBank
instance, and then call the "get_Stream_by_acc" method, passing it an
array of accession numbers as its parameter.
Unfortunately, I don't get any results from this query. It seems to run
OK, as I get no connection error messages, but my subsequent "while"
loop, which looks for the next sequence in the result set, is never
entered.

Although I'm a Perl newbie, I'm not a network dummy, so I'm quite sure
I'm neither behind a proxy nor behind a firewall. I have unset any
environment variables related to proxies.
I'm also sure that the array really does contain accession numbers in
the correct format.

Another bit of code, which differs from the problematic one in only one
respect (it uses "get_Seq_by_acc" with a single accession number as
parameter), runs perfectly:
the information on enzyme numbers is correctly retrieved and later
passed to a KEGG query.

So I wonder if there might be some obscure difference in implementation
between "get_Seq_by_acc" and "get_Stream_by_acc".

By the way, I have also checked with "netstat" that connections to NCBI
are properly established, and they are.

If anyone has an idea what might be going on, I can't say how glad I'd
be! I'm in fact a bioinformatics MSc student, and ironically, my code
runs perfectly on my pal's computers, but not on mine ;-)

Thanks a lot in advance,

Anne-Marie Ternes
amternes at pt dot lu

_______________________________________________
Bioperl-l mailing list
Bioperl-l at portal.open-bio.org
http://portal.open-bio.org/mailman/listinfo/bioperl-l



