[Bioperl-l] Help on retrieving NT contigs with Bio::DB::GenBank.

Wiepert, Mathieu Wiepert.Mathieu@mayo.edu
Sat, 1 Dec 2001 09:35:30 -0600


Hi,
Jason is quite right, it was a total punt.  I ended up using a local copy of
the refseqs here; we get a mirror daily(ish).  Interestingly, I was using
sfetch, from the hmmer package, to pull the seqs, yet another of the too
many ways to do things.  I do think long term it would be great to be able
to get refseqs from NCBI as well as from local dbs.  The refseq accession
numbers are returning HTML rather than sequences; it's annoying, but that is
the way it is for now.  I actually need the primer on setting up local dbs
for use with hmmer, genewise, and any other tools, following the recent
threads on mysql dbs.
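
For the local route, the idea behind an indexed fetch is simple enough to
sketch in plain Perl.  This is only an illustration, not how sfetch is
implemented: index_fasta and fetch_fasta are made-up names, and it assumes
the ID is the first whitespace-delimited token on each '>' header line.

```perl
use strict;
use warnings;

# Build an in-memory index: sequence ID -> byte offset of its '>' header line.
sub index_fasta {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my %offset;
    while (1) {
        my $pos  = tell $fh;          # offset of the line about to be read
        my $line = <$fh>;
        last unless defined $line;
        $offset{$1} = $pos if $line =~ /^>(\S+)/;
    }
    close $fh;
    return \%offset;
}

# Seek straight to one record and return its residues as a single string.
# Returns undef when the ID is not in the index.
sub fetch_fasta {
    my ($path, $index, $id) = @_;
    return unless exists $index->{$id};
    open my $fh, '<', $path or die "Cannot open $path: $!";
    seek $fh, $index->{$id}, 0;
    my $header = <$fh>;               # skip the header line
    my $seq = '';
    while (my $line = <$fh>) {
        last if $line =~ /^>/;        # the next record starts here
        $line =~ s/\s+//g;            # drop newlines and stray whitespace
        $seq .= $line;
    }
    close $fh;
    return $seq;
}
```

BioPerl's Bio::Index::Fasta covers the same ground with an on-disk index, so
that is probably the better fit for a full refseq mirror.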

Anyway, if there is something I can do to help, I should try to do it, since
I know I will need it.  Sorry for the ramble.

-Mat

 -----Original Message-----
From: 	Jason Eric Stajich [mailto:jason@cgt.mc.duke.edu] 
Sent:	Friday, November 30, 2001 5:41 PM
To:	Kun Zhang
Cc:	bioperl-l@bioperl.org
Subject:	Re: [Bioperl-l] Help on retrieving NT contigs with Bio::DB::GenBank.

Kun-

This has to do with how NCBI serves RefSeq contigs (NT_*).  NCBI does not
implement retrieval for these in the same way as for 'regular' accession
numbers.

This is discussed in previous messages in the list and should make its way
into a wiki FAQ at some point if someone wants to jump on it.

We're currently rethinking how to best provide this functionality as we
are essentially limited by NCBI not providing a single CGI-BIN which maps
to our simple Bio::DB::RandomAccessI interface for all valid accession
numbers.  Ideas and volunteers to take this on are welcome.  Mathieu Wiepert
at Mayo started to look at it.  I suspect he got to the same point I did,
which meant parsing HTML and doing a 2-step retrieval, at which
point I balked.
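
Until the retrieval itself is sorted out, a script can at least survive the
failure.  A minimal sketch, assuming only that get_Seq_by_acc dies on a bad
response (as the EXCEPTION trace quoted below suggests); try_get_seq is a
hypothetical helper name, not part of BioPerl:

```perl
use strict;
use warnings;

# Hypothetical helper: call get_Seq_by_acc on any database handle and trap
# the exception instead of letting it kill the script.
# Returns ($seq, undef) on success or (undef, $error_text) on failure.
sub try_get_seq {
    my ($db, $acc) = @_;
    my $seq = eval { $db->get_Seq_by_acc($acc) };
    return (undef, $@ || "no sequence returned for $acc")
        unless defined $seq;
    return ($seq, undef);
}
```

With a real handle this would be called as
try_get_seq(Bio::DB::GenBank->new, 'NT_001035'), so a batch job can log the
NT_* failures and carry on with the accessions that do work.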

-jason

On Fri, 30 Nov 2001, Kun Zhang wrote:

> Hello!
>
> I got an error message (attached below) when trying to retrieve some NT
> contigs from GenBank with the Bio::DB::GenBank module. It looks like the
> problem occurs only with NT sequences, because the getGenBank.pl that came
> with the bioperl-0.9.0 distribution works fine. And my perl script works
> when I replace "NT_001035" with "AF303112". Can anyone help me out? Thanks!
>
> Kun Zhang
> Human Genetics Center
> University of Texas-Houston
>
> ------------------------My code-----------------------------
> use Bio::DB::GenBank;
> my $gb = Bio::DB::GenBank->new;
> $gb->request_format('fasta');
> # Throws the exception below for NT_* contigs; 'AF303112' works.
> my $contigSeq = $gb->get_Seq_by_acc('NT_001035');
>
>
>
> ==============================ERROR MESSAGE==================================
> -------------------- EXCEPTION --------------------
> MSG: Attempting to set the sequence to [<HTML] which does not look healthy
> STACK Bio::PrimarySeq::seq
> /usr/local/lib/perl5/site_perl/5.6.1/Bio/PrimarySeq.pm:251
> STACK Bio::PrimarySeq::new
> /usr/local/lib/perl5/site_perl/5.6.1/Bio/PrimarySeq.pm:226
> STACK Bio::Seq::new /usr/local/lib/perl5/site_perl/5.6.1/Bio/Seq.pm:132
> STACK Bio::SeqIO::fasta::next_primary_seq
> /usr/local/lib/perl5/site_perl/5.6.1/Bio/SeqIO/fasta.pm:130
> STACK Bio::SeqIO::fasta::next_seq
> /usr/local/lib/perl5/site_perl/5.6.1/Bio/SeqIO/fasta.pm:85
> STACK Bio::DB::WebDBSeqI::get_Seq_by_acc
> /usr/local/lib/perl5/site_perl/5.6.1/Bio/DB/WebDBSeqI.pm:159
> STACK toplevel ./splitSeq.pl:26
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l@bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-l
>

-- 
Jason Stajich
Duke University
jason@cgt.mc.duke.edu
