[Bioperl-l] Get spliced sequences from a DB::Seqfeature::Store database
Darwin Sorento Dichmann
dichmann at berkeley.edu
Sun Jun 30 23:01:23 UTC 2013
Greetings,
I wish to extract the sequences of all mRNAs in a Bio::DB::SeqFeature::Store database, but I get the entire genomic region covered by a given transcript rather than the spliced sequence. I tried using the spliced_seq method, but it is not supported, and selecting -type => 'CDS' yields the individual exons rather than the full transcript.
I assume that I am missing something obvious, and any pointers on how to solve this would be greatly appreciated. Eventually I would like to get the CDS and the translated amino acid sequence of each transcript, so comments on how to do that most elegantly would also be very helpful.
The GFF3 files describing the features should be OK, since they have been used in a GBrowse database that draws the transcripts correctly.
Best wishes,
Darwin
Code:
#!/usr/bin/perl
# ==================================================================
# = extract transcript sequences from Bio::DB::SeqFeature database =
# ==================================================================
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqIO;
use Bio::SeqFeatureI;
use Bio::DB::SeqFeature::Store;
my $db = Bio::DB::SeqFeature::Store->new(
    -adaptor => 'DBI::mysql',
    -dsn     => 'DB_NAME',
    -user    => 'USER',
    -pass    => 'PASSWORD',
);
my $seq_stream = $db->get_seq_stream(
    -type => 'mRNA',
);    # Get all mRNAs in the genome.
while (my $seq = $seq_stream->next_seq) {
    my $name = $seq->name;
    print "This is the name: $name\n";
    my $sequence = $seq->dna;
    print "Sequence: $sequence\n";    # This prints the entire genomic region covered by the transcript.
}
exit;
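For illustration only, the desired result (the exon or CDS pieces of a transcript joined in order, reverse-complemented for minus-strand features) can be sketched in plain Perl without any BioPerl calls. The helper name splice_exons, the toy genomic string, and the exon coordinates are all made up for this example; it assumes 1-based inclusive coordinates as in GFF3, and it does not show how to pull the sub-features out of the database.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl method): join exon or CDS pieces of a
# genomic sequence into the spliced transcript.
# Coordinates are assumed to be 1-based and inclusive, as in GFF3.
sub splice_exons {
    my ($genomic, $strand, @exons) = @_;

    # Sort by start so the pieces are joined in genomic order.
    my @sorted = sort { $a->[0] <=> $b->[0] } @exons;

    my $spliced = join '',
        map { substr($genomic, $_->[0] - 1, $_->[1] - $_->[0] + 1) } @sorted;

    # A minus-strand transcript is the reverse complement of the joined pieces.
    if ($strand < 0) {
        $spliced = reverse $spliced;
        $spliced =~ tr/ACGTacgt/TGCAtgca/;
    }
    return $spliced;
}

# Made-up example: two exons separated by a lowercase "intron".
my $genomic = 'ATGGCCgtaagtcagTTTTAA';
my @exons   = ([1, 6], [16, 21]);

print splice_exons($genomic, +1, @exons), "\n";    # prints ATGGCCTTTTAA
```

The same logic would apply to the CDS segments of a transcript to obtain the coding sequence, which could then be translated.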
-----------------------------------------------
Darwin Sorento Dichmann, M.S., PhD
Associate Specialist
Harland Lab
University of California, Berkeley
Molecular and Cell Biology
571 Life Sciences Addition
Berkeley, CA 94720
Phone# (510) 643-7830
E-mail: dichmann at berkeley.edu