[Bioperl-l] Get spliced sequences from a DB::Seqfeature::Store database
Fields, Christopher J
cjfields at illinois.edu
Tue Jul 2 22:12:32 UTC 2013
I believe this was discussed on-list at one point; the problem, IIRC, with spliced_seq() is that the current API doesn't indicate how to splice the sequence together when sub-features of different types are present (e.g. exons, UTRs, CDS). It could be implemented, but probably not as spliced_seq(), since that method's API doesn't take any arguments for this purpose.
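To make the ambiguity concrete, here is a minimal pure-Perl sketch on toy data in which the caller has to name the sub-feature type to join. It is not BioPerl API: the helpers splice_by_type() and revcomp(), the toy genome, and the coordinates are all made up for illustration. With a real store the sub-features would presumably come from something like $feature->get_SeqFeatures('CDS'), but please check that against your BioPerl version.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Reverse-complement a DNA string.
sub revcomp {
    my ($dna) = @_;
    my $rc = reverse $dna;
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Join the sub-features of ONE chosen type into a spliced sequence.
#   $genome   : reference sequence as a plain string
#   $strand   : +1 or -1 for the parent transcript
#   $type     : sub-feature type to splice (e.g. 'exon' or 'CDS')
#   $subfeats : array ref of { type, start, end }, 1-based inclusive
sub splice_by_type {
    my ($genome, $strand, $type, $subfeats) = @_;

    # Keep only the requested type, ordered by genomic start.
    my @parts = sort { $a->{start} <=> $b->{start} }
                grep { $_->{type} eq $type } @$subfeats;

    # Concatenate the pieces in genomic order.
    my $spliced = join '',
        map { substr($genome, $_->{start} - 1, $_->{end} - $_->{start} + 1) }
        @parts;

    # Minus-strand transcripts are reverse-complemented after joining.
    return $strand < 0 ? revcomp($spliced) : $spliced;
}

# Hypothetical toy transcript: two exons around a lower-case intron,
# with a CDS that covers only part of each exon.
my $genome   = 'GGATGAAACC' . 'gtaagtag' . 'TTAAGGCCGG';
my @subfeats = (
    { type => 'exon', start => 1,  end => 10 },
    { type => 'exon', start => 19, end => 28 },
    { type => 'CDS',  start => 3,  end => 10 },
    { type => 'CDS',  start => 19, end => 22 },
);

# The same parent feature gives different "spliced" sequences by type.
print "exon: ", splice_by_type($genome, 1, 'exon', \@subfeats), "\n";
print "CDS:  ", splice_by_type($genome, 1, 'CDS',  \@subfeats), "\n";
```

The point is only that the type has to be passed in somewhere, which is why an argument-free spliced_seq() can't decide between the exon and CDS versions on its own.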
chris
On Jun 30, 2013, at 6:01 PM, Darwin Sorento Dichmann <dichmann at berkeley.edu> wrote:
> Greetings,
>
> I wish to extract the sequences of all mRNAs in a Bio::DB::SeqFeature::Store database, but I get the entire genomic region covered by a given transcript rather than the spliced sequence. I tried using the method spliced_seq, but it is not supported, and selecting -type=>'CDS' yields the individual exons rather than the full transcript.
>
> I assume that I am missing something obvious, and any pointers on how to solve this are greatly appreciated. Eventually I would like to get the CDS and the translated AA sequence of these transcripts, and comments on how to do that most elegantly would also be very helpful.
>
> The GFF3 files describing the features should be OK, since they have been used in a GBrowse database that draws the transcripts correctly.
>
> Best wishes,
> Darwin
>
> Code:
>
> #!/usr/bin/perl
> # ==================================================================
> # = extract transcript sequences from Bio::DB::SeqFeature database =
> # ==================================================================
> use strict;
> use warnings;
> use Bio::Seq;
> use Bio::SeqIO;
> use Bio::SeqFeatureI;
> use Bio::DB::SeqFeature::Store;
>
> # Connect to the MySQL-backed feature store (placeholder credentials).
> my $db = Bio::DB::SeqFeature::Store->new(
>     -adaptor => 'DBI::mysql',
>     -dsn     => 'DB_NAME',
>     -user    => 'USER',
>     -pass    => 'PASSWORD',
> );
>
> # Get all mRNAs in the genome.
> my $seq_stream = $db->get_seq_stream(
>     -type => 'mRNA',
> );
>
> while (my $seq = $seq_stream->next_seq) {
>     my $name = $seq->name;
>     print "This is the name: $name\n";
>     my $sequence = $seq->dna;
>     # This prints the entire genomic region covered by the transcript.
>     print "Sequence: $sequence\n";
> }
> exit;
>
>
> -----------------------------------------------
> Darwin Sorento Dichmann, M.S., PhD
> Associate Specialist
> Harland Lab
> University of California, Berkeley
> Molecular and Cell Biology
> 571 Life Sciences Addition
> Berkeley, CA 94720
> Phone# (510) 643-7830
> E-mail: dichmann at berkeley.edu
>
>
>