[Bioperl-l] retrieving coding sequences from swissprot protein
accessions
Michael Bradley
mebradley at chem.ufl.edu
Tue Jun 1 11:43:56 EDT 2004
Hello all,
I would like to get the coding sequence for a protein, given its
Swiss-Prot accession. I have done this with GenBank files in the past
using the following code. Does anyone know how to do this with Swiss-Prot?
use Bio::DB::GenPept;
use Bio::DB::GenBank;
use Bio::Factory::FTLocationFactory;
use Bio::SeqFeature::Generic;

my $gp          = Bio::DB::GenPept->new;
my $gb          = Bio::DB::GenBank->new;
my $loc_factory = Bio::Factory::FTLocationFactory->new;

# $protein_gi (the protein identifier) is set elsewhere in the script
my $prot_stream = $gp->get_Stream_by_acc($protein_gi);
while ( my $prot_seq = $prot_stream->next_seq() ) {
    my $found_cds = 0;
    foreach my $feat ( $prot_seq->top_SeqFeatures ) {
        next unless $feat->primary_tag eq 'CDS';
        $found_cds = 1;
        # example: 'coded_by="U05729.1:1..122"'
        my @coded_by = $feat->each_tag_value('coded_by');
        my ($nuc_acc, $loc_str) = split /:/, $coded_by[0];
        my $nuc_obj = $gb->get_Seq_by_acc($nuc_acc);
        # create Bio::Location object from a string
        my $loc_object = $loc_factory->from_string($loc_str);
        # create a Feature object by using a Location
        my $feat_obj = Bio::SeqFeature::Generic->new(-location => $loc_object);
        # associate the Feature object with the nucleotide Seq object
        $nuc_obj->add_SeqFeature($feat_obj);
        my $cds_obj = $feat_obj->spliced_seq;
        print "CDS sequence is ", $cds_obj->seq, "\n\n";
    }
    # report once per protein, not once per non-CDS feature
    print "No CDS for ", $prot_seq->id, "\n\n" unless $found_cds;
}
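One caveat about the split on ':' in the code above: it assumes a simple coded_by value such as 'U05729.1:1..122'. GenPept records can also carry values wrapped in complement(...) or join(...), where the accession sits inside the parentheses and the split returns the wrong pieces. Below is a minimal sketch of a more tolerant parser in plain Perl, with no BioPerl dependency. The helper name parse_coded_by is hypothetical (it is not part of BioPerl), and the sketch assumes that all segments refer to a single nucleotide record.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: split a GenPept 'coded_by'
# value into the nucleotide accession and a location string with the
# "ACCESSION:" prefixes removed.
sub parse_coded_by {
    my ($coded_by) = @_;

    # Collect every distinct "ACCESSION:" prefix, e.g. "U05729.1:".
    my %seen;
    my @accs = grep { !$seen{$_}++ } $coded_by =~ /([A-Za-z0-9_.]+):/g;

    # Assumption: exactly one nucleotide record is referenced.
    # Values spanning several records are not handled by this sketch.
    return if @accs != 1;

    # Drop the prefixes; operators such as join() and complement() stay.
    ( my $loc_str = $coded_by ) =~ s/[A-Za-z0-9_.]+://g;

    return ( $accs[0], $loc_str );
}

for my $value (
    'U05729.1:1..122',
    'complement(U05729.1:1..122)',
    'join(U05729.1:1..50,U05729.1:80..122)',
  )
{
    my ( $acc, $loc ) = parse_coded_by($value);
    print "$value => accession $acc, location $loc\n";
}
```

The returned location string (for example 'join(1..50,80..122)') could then be passed to $loc_factory->from_string in place of $loc_str in the loop above.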
Thanks,
Michael Bradley