[Bioperl-l] Extracting sequences from MySQL DB with Bio::DB::GFF

Fri Oct 15 02:01:24 UTC 2010

Hi Darwin,

The "seq" method returns a Bio::PrimarySeq object (I forget why this
is--I think it's to fulfill the contract with the Bio::SeqFeatureI
interface).  To get the sequence, you can do a few things: 1) call the
seq method on the Bio::PrimarySeq object, like this:

  my $dna = $segment->seq->seq;

or 2) use the dna method:

  my $dna = $segment->dna;
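
To see the difference without touching the database, here is a minimal sketch that builds a Bio::PrimarySeq by hand. The id 'demo' and the sequence 'ATGCATGC' are made up for illustration; with your data the object would come from $segment->seq instead.

```perl
use strict;
use warnings;
use Bio::PrimarySeq;

# Stand-in for what $segment->seq returns; the id and sequence are invented.
my $obj = Bio::PrimarySeq->new(
    -id       => 'demo',
    -seq      => 'ATGCATGC',
    -alphabet => 'dna',
);

# Interpolating the object itself gives Perl's default reference string,
# something like "Bio::PrimarySeq=HASH(0x...)".
my $as_string = "$obj";
print $as_string, "\n";

# Calling seq() on the object returns the actual sequence string.
my $dna = $obj->seq;
print $dna, "\n";
```

The first print should show the same kind of output you saw in your script; the second prints the bases themselves.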

The perldoc section from Bio::DB::GFF::Segment is below.

Scott

       seq

        Title   : seq
        Usage   : $s->seq
        Function: get the sequence string for this segment
        Returns : a Bio::PrimarySeq
        Args    : none
        Status  : Public

       Returns the sequence for this segment as a Bio::PrimarySeq.  (-) strand
       segments are automatically reverse complemented

       Call the dna() method to return the data as a simple sequence string.

       dna

        Title   : dna
        Usage   : $s->dna
        Function: get the DNA string for this segment
        Returns : a string
        Args    : none
        Status  : Public

       Returns the sequence for this segment as a simple string. (-) strand
       segments are automatically reverse complemented

       The method is also called protein().

On Thu, Oct 14, 2010 at 9:16 PM, Darwin Sorento Dichmann
<dichmann at berkeley.edu> wrote:
> Greetings,
>
> I am trying to extract sequences from a MySQL DB with Bio::DB::GFF. The database is the same one I use for gbrowse2, where it works fine. I can also extract all sorts of features, but not sequence.
>
> When I run this script I thought I'd get a sequence output but apparently I get the memory address instead.
>
> I have very little experience with perl or other programming, but I try to follow the directions in the manual. I suspect that I overlook something very basic and any help would be greatly appreciated.
>
> Thanks,
> Darwin
>
>
>
> The script:
> ------------
> #!/usr/bin/perl -w
> # module to test if Bio::DB::GFF can be used to extract sequences from frog2 database.
>
> use strict;
> use Bio::Seq;
> use Bio::SeqIO;
> use Bio::DB::GFF;
>
> # Open database
> my $db = Bio::DB::GFF->new(
>     -adaptor  => 'DBI::mysql',
>     -dsn      => 'frog2',
>     -user     => 'darwin',
>     -password => '****',
> );
>
> # fetch scaffold_1 (1-100000)
> my $segment = $db->segment('scaffold_1', 1 => 100000) or die;
>
> # get its DNA
>  my $dna = $segment->seq or die;
> print $segment, "\n";
> print $dna,"\n";
>
> # get an iterator on all features of type 'mRNA' in the segment
> # this prints each mRNA feature with its transcript name on screen
> my $iterator = $segment->get_seq_stream(-type => ['mRNA']);
> while (my $s = $iterator->next_seq) {
>     print $s, "\n";
> }
>
> exit;
> ---------------
>
> The output:
> Macintosh:perlscripts darwin$ perl frog2_parser.pl
> scaffold_1:1,100000
> Bio::PrimarySeq=HASH(0x100bb4d68)
> mRNA:pick(xt42f011730m)
> mRNA:pick(xt42f014902m)
> mRNA:pick(xt42f016160m)
> mRNA:pick(xt42f017353m)
> mRNA:pick(xt42f029332m)
> ---------------

-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                                   scott at scottcain dot net
GMOD Coordinator (http://gmod.org/)                     216-392-3087
Ontario Institute for Cancer Research