[Bioperl-l] Re: [Bioperl-guts-l] Notification: incoming/931

Jason Stajich jason@chg.mc.duke.edu
Wed, 21 Mar 2001 12:07:08 -0500 (EST)


Most NT contigs do not contain any sequence; they are just an
annotation with references to clones.  So if you look at the record on
NCBI there is no sequence, and bioperl is really not sure what to do with
it.  Admittedly it should not balk, but that is the reason it is not
working for the NT accessions you list.  If it really did work in 0.6.2,
it is probably because we were using a different CGI script to query.
I don't really know what all the appropriate web query entry points are
for Entrez, so I only followed the instructions on the NCBI site, and
that is the info you are getting back.  You might try querying in
batch mode and see if it does anything different.
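
Until bioperl handles this gracefully, a calling script can guard against
it.  The following is only a minimal sketch, not a fix in bioperl itself:
describe_length is a hypothetical helper name, and the sketch assumes that
get_Seq_by_acc either throws or returns undef on failure, and that a
sequence-less record yields an empty or undefined string from seq().

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of bioperl): summarise what came back from
# a database lookup without assuming that a sequence is present.
sub describe_length {
    my ($accession, $seq) = @_;

    # Nothing was returned at all (lookup failed or record not parsed).
    return "$accession: no record retrieved" unless defined $seq;

    # A record came back but carries no residues, as with an NT contig
    # that only references its component clones.
    my $residues = $seq->seq();
    return "$accession: record retrieved, but it contains no sequence"
        unless defined $residues && length $residues;

    return sprintf "%s has %d base pairs", $accession, length $residues;
}

# Run the lookup only when invoked as a script with an accession argument.
if (!caller() && @ARGV) {
    require Bio::DB::GenBank;
    my $accession = shift @ARGV;
    my $db = Bio::DB::GenBank->new();

    # Assumption: a failed lookup may throw or return undef; trap both.
    my $seq = eval { $db->get_Seq_by_acc($accession) };
    warn "lookup failed: $@" if $@;

    print describe_length($accession, $seq), "\n";
}
```

Run as e.g. "perl get_nt_length.pl NT_004705", this should report the
missing sequence instead of dying on an undefined value.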

<<jason wishing there was a simple ncbi corba db model that we could just
query>>

-Jason



On Wed, 21 Mar 2001 bioperl-bugs@bioperl.org wrote:

> JitterBug notification
>
> new message incoming/931
>
> Message summary for PR#931
> 	From: Joe Ryan <jfryan@nhgri.nih.gov>
> 	Subject: bug in entrez retrieval
> 	Date: Wed, 21 Mar 2001 11:58:33 -0500
> 	0 replies 	0 followups
>
> ====> ORIGINAL MESSAGE FOLLOWS <====
>
>
> Dear bioperl developers,
>
> I have recently started having problems with some code which uses
> bioperl to retrieve sequences from Entrez.  Some (probably most)
> accessions work fine, but some valid accessions are not being retrieved.
>
> Before upgrading from version 0.6.2 of bioperl we were having problems
> with using Accession numbers with versions.   (e.g. asking for
> NT_004705.1 would return NT_004705.2 which was the latest version
> of the sequence).
>
> After upgrading to version 0.7.0 we now have a bunch of accessions
> that fail completely.
>
> The following is some code which shows the problem.
>
> NAME: get_nt_length.pl
> ---------------------------------------------------------------------------
> #!/usr/local/bin/perl -w
>
> use strict;
> use Bio::DB::GenBank;
> use Bio::SeqIO;
>
> my $accession = shift @ARGV;
> my $out = Bio::SeqIO->new('-fh' => \*STDOUT, '-format' => 'Fasta');
> my $dbobj = Bio::DB::GenBank->new();
> my $seq = $dbobj->get_Seq_by_acc($accession);
> my $length = length $seq->seq();
> print "$accession has $length base pairs\n";
>
> ---------------------------------------------------------------------------
>
> The following accessions work: AF284033, NM_002739, BG370814
> The following fail: NT_004705, NT_019547
>
> Could someone let me know if this is a known bug and, if so, when it
> might be fixed, or whether I am doing something wrong on my end?
> I may be able to delve into the code a bit, if
> it looks like none of you will be able to get to it soon.  If someone
> wants to point me to the module that I should check that would save
> me some time.  I was also considering using "idfetch" from the NCBI
> toolkit as an alternative.
>
> Thanks,
> Joe
> --
> Joseph Ryan <jfryan@nhgri.nih.gov>
> National Human Genome Research Institute
>

Jason Stajich
jason@chg.mc.duke.edu
Center for Human Genetics
Duke University Medical Center
http://www.chg.duke.edu/