[Bioperl-l] Extracting sequences from MySQL DB with Bio::DB::GFF

Fri Oct 15 17:34:02 UTC 2010

It's a beauty to behold!
Thanks Scott!

Best,
Darwin

On Oct 14, 2010, at 7:01 PM, Scott Cain wrote:

> Hi Darwin,
> 
> The "seq" method returns a Bio::PrimarySeq object rather than a string
> (I forget why this is; I think it's to fulfill the contract with the
> Bio::SeqFeatureI interface), which is why printing it shows
> Bio::PrimarySeq=HASH(0x...) instead of the sequence.  To get the
> sequence string, you can do one of two things: 1) call the seq method
> on the Bio::PrimarySeq object, like this:
> 
>  my $dna = $segment->seq->seq;
> 
> or 2) use the dna method:
> 
>  my $dna = $segment->dna;
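> 
> To illustrate the difference without a database, here is a minimal
> sketch that builds a Bio::PrimarySeq by hand and prints both the object
> and its string.  The id and sequence are made-up values, not taken from
> the frog2 database, and it assumes BioPerl is installed:
> 
> ```perl
> use strict;
> use warnings;
> use Bio::PrimarySeq;
> 
> # Hypothetical stand-in for the object that $segment->seq returns.
> my $seq_obj = Bio::PrimarySeq->new(
>     -id  => 'example',
>     -seq => 'ATGCCGTA',
> );
> 
> # Printing the object itself shows its class and memory address,
> # something like Bio::PrimarySeq=HASH(0x...).
> print $seq_obj, "\n";
> 
> # Calling seq on the object returns the plain sequence string.
> my $dna = $seq_obj->seq;
> print $dna, "\n";
> ```
> 
> The second print should show the sequence itself, which is what
> $segment->seq->seq does in one step.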
> 
> The perldoc section from Bio::DB::GFF::Segment is below.
> 
> Scott
> 
>       seq
> 
>        Title   : seq
>        Usage   : $s->seq
>        Function: get the sequence string for this segment
>        Returns : a Bio::PrimarySeq
>        Args    : none
>        Status  : Public
> 
>       Returns the sequence for this segment as a Bio::PrimarySeq.  (-) strand
>       segments are automatically reverse complemented.
> 
>       The dna() method returns the data as a simple sequence string.
> 
>       dna
> 
>        Title   : dna
>        Usage   : $s->dna
>        Function: get the DNA string for this segment
>        Returns : a string
>        Args    : none
>        Status  : Public
> 
>       Returns the sequence for this segment as a simple string. (-) strand
>       segments are automatically reverse complemented.
> 
>       This method is also available under the name protein().
> 
> 
> On Thu, Oct 14, 2010 at 9:16 PM, Darwin Sorento Dichmann
> <dichmann at berkeley.edu> wrote:
>> Greetings,
>> 
>> I am trying to extract sequences from a MySQL DB with Bio::DB::GFF. The database is the same one I use for my gbrowse2 installation, where it works fine. I can also extract all sorts of features from it, just not the sequence.
>> 
>> When I run the script below, I expected to get sequence output, but apparently I get a memory address instead.
>> 
>> I have very little experience with Perl or any other programming, but I am trying to follow the directions in the manual. I suspect that I am overlooking something very basic, and any help would be greatly appreciated.
>> 
>> Thanks,
>> Darwin
>> 
>> 
>> 
>> The script:
>> ------------
>> #!/usr/bin/perl -w
>> # Script to test whether Bio::DB::GFF can be used to extract sequences from the frog2 database.
>> 
>> use strict;
>> use Bio::Seq;
>> use Bio::SeqIO;
>> use Bio::DB::GFF;
>> 
>> # Open database
>> my $db = Bio::DB::GFF->new(
>>     -adaptor  => 'DBI::mysql',
>>     -dsn      => 'frog2',
>>     -user     => 'darwin',
>>     -password => '****',
>> );
>> 
>> # fetch scaffold_1 (1-100000)
>> my $segment = $db->segment('scaffold_1', 1 => 100000) or die;
>> 
>> # get its DNA
>> my $dna = $segment->seq or die;
>> print $segment, "\n";
>> print $dna,"\n";
>> 
>> # get an iterator on all features of type 'mRNA' in the segment
>> # and print each one (type and transcript name) on screen
>> my $iterator = $segment->get_seq_stream(-type => ['mRNA']);
>> while (my $s = $iterator->next_seq) {
>>     print $s, "\n";
>> }
>> 
>> exit;
>> ---------------
>> 
>> The output:
>> Macintosh:perlscripts darwin$ perl frog2_parser.pl
>> scaffold_1:1,100000
>> Bio::PrimarySeq=HASH(0x100bb4d68)
>> mRNA:pick(xt42f011730m)
>> mRNA:pick(xt42f014902m)
>> mRNA:pick(xt42f016160m)
>> mRNA:pick(xt42f017353m)
>> mRNA:pick(xt42f029332m)
>> ---------------
>> 
> 
> 
> 
> -- 
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.                                   scott at scottcain dot net
> GMOD Coordinator (http://gmod.org/)                     216-392-3087
> Ontario Institute for Cancer Research