[Bioperl-l] question of seq() method of Bio::DB::GFF and Bio::PrimarySeq
Scott Cain
scott at scottcain.net
Thu Jan 27 13:37:19 UTC 2011
Hi Shuai,
The seq method does not return a sequence string, but a sequence
object (Bio::Seq), on which you can perform many methods, including
just getting the sequence string, but you can also doing things like
translating and getting the reverse complement. To get the DNA
string, call the seq() method on your Bio::Seq object:
my $upstream->seq->seq;
The first invocation of seq is calling the seq method in Bio::DB::GFF,
which returns the Bio:Seq object, and the second call of seq calls the
seq method in Bio::Seq, which returns a string. The dna method in
Bio::DB::GFF is a short cut for doing exactly this (I think--it's been
a while since I've used that).
Scott
2011/1/27 Zhan, Shuai <Shuai.Zhan at umassmed.edu>:
> Dear BIOPERL developers,
>
> I'm a post doc in UMASS Medical School.
>
> Recently I carried out some genome-wide annotation work, but faced some difficulty with using Bio::DB::GFF to fetch DNA sequences, whose seq() method is required by another critical softwares.
>
> I found the key issue is that the method of seq() of Bio::DB::GFF (inherited from Bio::PrimarySeq?) can't exactly return the actual sequence string but something like reference. This question made me upset for several days and stoped my progress, but I have not found any related help from google or my friends. So I'm assuming you could give me some valuable information.
>
> My perl is 5.8.8 and system is CentOS 5.3.
> I loaded gff file with annotation information both of CDS and reference by bp_load_gff.pl. The raw fasta sequence file was also loaded together.
>
> I wrote a test script just like your instruction on website.
> [code]
> use Bio::DB::GFF;
> my $db = Bio::DB::GFF->new(-dsn => 'dbi:mysql:brenneri', -user => 'me', -password => '123',);
> my $contig0 = $db->segment(-class => 'Sequence', -name => 'Contig0');
> my $subcontig = $contig0->subseq(1,100);
> print "returned by seq(): ", $upstream->seq, "\n";
> print "returned by dna(): ", $upstream->dna, "\n\n";
> my $transcripts = $db->segment(-class => 'Transcript', -name => 'fgp17221.t1');
> my @exons = $transcripts->features('CDS');
> print "returned by seq(): ", $exons[0]->seq "\n";
> print "returned by seq(): ", $exons[0]->dna "\n\n";
> my $upstream = $exons[0]->subseq(-20,0);
> print "returned by seq(): ", $upstream->seq "\n";
> print "returned by seq(): ", $upstream->dna "\n\n";
> [/code]
>
> The result as follows:
> $perl test.pl
> returned by seq(): Bio::PrimarySeq=HASH(0x1afa40d0)
> returned by dna(): GTAGATCACTTTTTATTCTCGAAGAATAATTTTTGAGCTATGTAAAAGCGTGTATACCTCCCAGTTTTCCGGTTAAAAGTCCAGGATCGCATGTCTCATG
>
> returned by seq(): Bio::PrimarySeq=HASH(0x1afa3ab0)
> returned by dna(): AAGGAGCTCAACTTCAAACACCAAACACTTTGAACCAACATTTTTCTGGAGAAAACGTAAGAATTGAAAAGTCAAGAATGAATCCACATGTGAAAATTGAACAGAGAT
> CAATGGCAATGGGTGGAGGAGGAGGGGATGAACAAATGAAAAACTTTACGGAAATGACGAATGAAGAACTCCGTGAGCGGTTAATGAAAATGCAAATGGATATGCAGAATCTTC
> AAATGGCAATGG
>
> returned by seq(): Bio::PrimarySeq=HASH(0x1afa3fb0)
> returned by dna(): GGAGTTTGGACCGTGTTTCAG
>
> Because the method of dna() could exactly return the actual sequence, so I think my loaded database should work. But I have no idea of why the method seq() missed the sequence.
>
> I'd greatly appreciate any help.
>
> Sincerly,
> Shuai
> Jan 27 2010
>
> 364 Plantation Str.
> MA 01605
> UMASS MED
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
--
------------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research
More information about the Bioperl-l
mailing list