[Bioperl-l] Still trouble with remote joined records
Warren Gallin
wgallin at ualberta.ca
Thu Oct 18 01:42:54 UTC 2007
I must be missing something, but I can not get the procedure outlined
in FAQ 5.5 to do what I think it should (maybe my expectations are
incorrect.
I think I have an up-to-date release:
GallinPowerbook:~ wgallin$ perl -MBio::Root::Version -e 'print
$Bio::Root::Version::VERSION,"\n"'
1.005002102
If I run the following script (not that gi7648673 is a protein record):
___________________________________________________________
#! /usr/bin/perl
#071017 - Small script to test the method for getting the ORF sequence
#using dbh passing into the sequence method.
use strict;
use warnings;
use LWP::Simple;
use Bio::DB::GenPept;
use Bio::DB::GenBank;
use Bio::Seq;
#Create handles to GenBank and GenPept
my $gbh = Bio::DB::GenBank -> new;
my $gph = Bio::DB::GenPept -> new;
#This is the gi number of the entry being tested
my $test_gi = 7648673;
#Get the protein sequence file from GENBANK
my $test_na_record = $gph -> get_Seq_by_gi ($test_gi);
#Go through the features - when the CDS feature is found splice
together the sequence and translate.
for my $feat ( $test_na_record->get_SeqFeatures ) {
if ( $feat->primary_tag eq 'CDS' ) {
my $cds = $feat->spliced_seq(-db => $gbh, -nosort => 1);
print $cds->translate->seq, "\n";
}
}
exit;
_______________________________________________________________________
I get an error that a protein sequence can not be translated. I
thought that by feeding the handle to GenBank into the spliced_seq
method that it would retrieve the necessary nucleic acid sequence
records and splice together the specified ranges.
So I tried using the corresponding nucleic acid record, gi7648671,
which holds the 5' end of the CDS ( I used $gbh on the get_Seq step).
That yielded the correct amino acid sequence for the first half the
protein, encoded by the sequence in the record itself, but it did not
retrieve the other nucleic acid record that is specified to contain
the 3' end of the sequence.
So, as far as I can see, passing the DB handle isn't causing the
spliced_seq method to go elsewhere for the nucleic acid sequence data.
I thought that was the purpose.
Can anyone enlighten me, or is this a bug?
Warren Gallin
More information about the Bioperl-l
mailing list