[Bioperl-l] Genbank files

Roy Chaudhuri roy.chaudhuri at gmail.com
Tue Dec 13 11:52:05 UTC 2011


Hi Brian,

Just to check that I have understood you: you want to read through a 
genbank file and add extra tags to those features whose locus tags are 
listed in a tab-delimited file?

Your code is along the right lines, but it would be much more efficient 
to read your tab-delimited locus_tags into a hash and check each one 
using "exists", rather than ploughing through the (potentially very 
long) list of locus tags every time. Also, be careful with newlines in 
your tab file (you can safely get rid of them using "chomp"). You can 
leave out the "has_tag" check by using "get_tagset_values" instead of 
"get_tag_values", since the former does not complain if the tag is not 
present. Once you have modified your sequence object, you need to write 
it out to a new file (or STDOUT) using Bio::SeqIO.
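
To see the hash lookup and the effect of "chomp" on their own, here is 
a minimal sketch that needs only core Perl. The locus tags, the 
descriptions and the %listed name are invented for illustration, and 
the list is read from an in-memory string rather than from your real 
file. Note the second line has only one column, which is the case 
where a missing "chomp" would leave a newline stuck to the key.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the tab-delimited file (made-up locus tags).
my $tab_data = "ABC_0001\tputative kinase\nABC_0007\n";

# Open the string as an in-memory filehandle so no file on disk is needed.
open(my $list, '<', \$tab_data) or die "Cannot open in-memory list: $!";

my %listed;
while (my $line = <$list>) {
    chomp $line;                          # strip the trailing newline
    next unless length $line;             # skip blank lines
    my ($locus_tag) = split /\t/, $line;  # keep the first column only
    $listed{$locus_tag} = 1;
}
close $list;

# One hash lookup per tag, instead of one pass over the whole list per tag.
my @hits = grep { exists $listed{$_} } qw(ABC_0001 ABC_0002 ABC_0007);
print join(",", @hits), "\n";
```

The same %listed hash can then be consulted once for each locus_tag 
found in the genbank file.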

Also, a couple of general points: you should always "use warnings" (or, 
even better, "use warnings FATAL=>qw(all)"), since that can help catch 
many problems, and your code may be easier to read if you don't include 
the word "object" in all your variable names (after all, you wouldn't 
say you write on a paper object using a pen object).
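
As a small illustration of what FATAL warnings change, the sketch 
below uses core Perl only. The %product_for hash and its contents are 
made up. Under plain "use warnings" the concatenation would print a 
warning and carry on; with FATAL the same warning becomes an exception, 
which is trapped here with eval so that it can be inspected.

```perl
use strict;
use warnings FATAL => qw(all);

# Hypothetical lookup table (made-up locus tag and product).
my %product_for = (ABC_0001 => 'putative kinase');

# ABC_9999 is not in the hash, so the lookup returns undef and the
# concatenation triggers an "uninitialized value" warning, which is fatal here.
my $ok = eval {
    my $label = 'product: ' . $product_for{ABC_9999};
    1;
};
my $error = $ok ? '' : $@;
print "caught: $error" if $error;
```

In a real script you would normally leave out the eval and let the 
script stop at the first such problem.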

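use strict;
use warnings FATAL=>qw(all);
use Bio::SeqIO;

# Read the first column of the tab-delimited file into a hash for fast lookup.
open (my $list, '<', 'list') or die "Cannot open list: $!";
my %V;
while (<$list>){
     chomp;
     next unless length;   # a blank line would otherwise give an undef key
     $V{(split(/\t/, $_))[0]}=1;
}
close $list;

my $seqio = Bio::SeqIO->new(-file=>"Contig100.gb", -format=>'genbank');
my $seq = $seqio->next_seq;
# Detach every feature, tag the listed CDS features, then re-attach them all.
for my $feat ($seq->remove_SeqFeatures){
     if ($feat->primary_tag eq "CDS"){
         for my $V3 ($feat->get_tagset_values('locus_tag')){
             if (exists $V{$V3}){
                 $feat->add_tag_value(listed_in_tab_file=>'yes');
                 last;   # one matching locus_tag is enough
             }
         }
     }
     $seq->add_SeqFeature($feat);
}
# With no -file or -fh argument, the record is written to STDOUT.
Bio::SeqIO->new(-format=>'genbank')->write_seq($seq);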

Hope this helps.
Cheers,
Roy.

On 13/12/2011 11:03, BForde wrote:
>
> Thank you for the replies.
>
> My script (below) reads in a list of locus_tags from a tab-delimited text
> file, compares these locus_tags to the locus_tags in a genbank file and,
> where they are equal, adds new features.
> the line
> $feat->add_tag_value()
> needs to be defined. In the bioperl wiki this variable appears to be defined
> by giving it coordinates etc (creating a new feature). I wish to add
> features to CDS key when the locus_tags are identical. Is this possible?
>
> use strict;
> use Bio::SeqIO;
>
> my @V;
> open (LIST1, 'list') ||die;
> while (<LIST1>){
>      push @V, (split(/\t/, $_))[0];
> }
> close(LIST1);
>
> my $seqio_object = Bio::SeqIO->new(-file=>"Contig100.gb");
> my $seq_object = $seqio_object->next_seq;
>
> for my $feat_object ($seq_object->get_SeqFeatures){
>      if ($feat_object->primary_tag eq "CDS"){
>          if ($feat_object->has_tag('locus_tag')){
>              for my $V3 ($feat_object->get_tag_values('locus_tag')){
>                  for my $V1 (@V) {
>                      if ($V1 eq $V3){
>                          ADD NEW FEATURES
>
>                      }
>                  }
>              }
>          }
>      }
> }
>
> The script works down as far as the comparison point where locus_tags in the
> genbankfile "Contig100.gb" are compared against a list of locus_tags from a
> delimited txt file.
>
>
> regards
>
> Brian
>
> Jason Stajich-5 wrote:
>>
>> $feature->add_tag_value('color','blue');
>>
>> On Dec 9, 2011, at 8:52 AM, BForde wrote:
>>
>>>
>>> Hello all,
>>>
>>> I am new to Bioperl so I apologise if this is a stupid question.
>>>
>>> For CDS features I wish to add additional qualifiers e.g. /colour and
>>> /note
>>> qualifiers. I have looked at the BioPerl wiki but am still unsure as how
>>> to
>>> do this?
>>>
>>> regards
>>>
>>> Brian
>>> --
>>> View this message in context:
>>> http://old.nabble.com/Genbank-files-tp32941955p32941955.html
>>> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>>>
>>> _______________________________________________
>>> Bioperl-l mailing list
>>> Bioperl-l at lists.open-bio.org
>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>
>> Jason Stajich
>> jason.stajich at gmail.com
>> jason at bioperl.org
>>
>>
>>
>



