[Bioperl-l] Navigating a genbank file
Adlai Burman
adlai at refenestration.com
Fri Feb 24 19:24:18 UTC 2012
Thanks, Jason.
Your code looks elegant and should do the trick. Unfortunately it is a little over my quasi (perl) newbie head. I'm sure I will learn a lot from trying to figure out how to implement it. Who would have guessed that grabbing intergenic regions could be such a nosebleed (for perl flat-foots that is)?
In the meantime if you, or anybody else, can help me out with this in a more neophyte digestible way it would be excruciatingly appreciated.
Regards,
Adlai
On Feb 24, 2012, at 7:49 AM, Jason Stajich wrote:
> You just need a $last_CDS variable.
> Here's code that does this for genes retrieved from Bio::DB::SeqFeature but the concept is the same.
>
> https://github.com/hyphaltip/genome-scripts/blob/master/seqfeature/get_intergenic_seq.pl
>
> On Feb 23, 2012, at 4:43 PM, Adlai Burman wrote:
>
>> I am struggling with Bioperl to do something which I know is simple (I stumbled on it before, but it was just by dumb luck and I forgot how it is done). Basically what I am trying to do is, given a target gene symbol extract the intergenic region between that CDS and the next one 3'. One ugly way I could do this would be:
>> (1) Make an array of all of the CDSs in each gb file
>> (2) Make a second run through the file using either the symbol prior or post the target symbol (depending on the strand of the target symbol).
>>
>> This is, of course, cumbersome and unnecessary but I can't figure out how it should be done.
>> Here is a skeletal version of what I understand about getting feature info with bioperl.
>> Can anyone help me figure out how to access the CDS features on either side?
>> Please?
>> Thank you,
>>
>> Adlai
>>
>> #!/usr/bin/perl
>> use strict;
>> use warnings;
>> use IO::String;
>> use Bio::Perl;
>> use Bio::SeqIO;
>> use IO::String;
>>
>>
>> my $target_sym = shift;
>> my $file = "../Dropbox/local_gb/*";
>>
>> my $seqio = Bio::SeqIO-> new(
>> -file => $file,
>> -format => 'GenBank',
>> );
>>
>> my $seq = $seqio->next_seq;
>> for my $feats ($seq->get_SeqFeatures){
>> if ($feats->primary_tag eq "CDS"){
>> my $start = $feats->location->start;
>> my $end = $feats->location->end;
>> my $strand = $feats->strand;
>> if ($feats->has_tag('gene')) {
>> for my $val ($feats->get_tag_values('gene')){
>> if ($val eq $target_sym){
>> print $start."\n";
>> print "$val\n";
>> }
>>
>> }
>> }
>> }
>> }
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
> Jason Stajich
> jason.stajich at gmail.com
> jason at bioperl.org
>
More information about the Bioperl-l
mailing list