[Bioperl-l] Genbank files
Brian Forde
b.m.forde at umail.ucc.ie
Tue Dec 13 14:22:01 UTC 2011
Hi Roy,
Thank you. That works perfectly. I have to confess that someone else told
me to use hashes but I could not get them to work. Thanks again.
regards
Brian
On Tue, Dec 13, 2011 at 11:52 AM, Roy Chaudhuri <roy.chaudhuri at gmail.com> wrote:
> Hi Brian,
>
> Just to check I have understood you, you want to read through a genbank
> file and add additional tags to features which are listed in a
> tab-delimited file of locus tags?
>
> Your code is on the right lines, but it would be much more efficient to
> read your tab-delimited locus_tags into a hash, and check using exists,
> rather than ploughing through the (potentially very long) list of locus
> tags every time. Also, be careful with new lines in your tab file (you can
> safely get rid of them using "chomp"). You can miss out the "has_tag" check
> by using "get_tagset_values" instead of "get_tag_values", since the former
> does not complain if the tag is not present. Once you have modified your
> sequence object, you need to write it out to a new file (or STDOUT) using
> Bio::SeqIO.
>
> Also, just a couple of general points: you should always "use warnings"
> (or even better "use warnings FATAL=>qw(all)") since that can help solve
> many problems, and your code may be easier to read if you don't include the
> word "object" in all your variable names (after all you wouldn't say you
> write on a paper object using a pen object).
>
> use strict;
> use warnings FATAL=>qw(all);
> use Bio::SeqIO;
> open (my $list, 'list') or die $!;
> my %V;
> while (<$list>){
>     chomp;
>     $V{(split(/\t/, $_))[0]}=1;
> }
> my $seqio_object = Bio::SeqIO->new(-file=>"Contig100.gb");
> my $seq_object = $seqio_object->next_seq;
> for my $feat_object ($seq_object->remove_SeqFeatures){
>     if ($feat_object->primary_tag eq "CDS"){
>         for my $V3 ($feat_object->get_tagset_values('locus_tag')){
>             if (exists $V{$V3}){
>                 $feat_object->add_tag_value(listed_in_tab_file=>'yes');
>                 next;
>             }
>         }
>     }
>     $seq_object->add_SeqFeature($feat_object);
> }
> Bio::SeqIO->new(-format=>'genbank')->write_seq($seq_object);
>
> Hope this helps.
> Cheers,
> Roy.
>
>
> On 13/12/2011 11:03, BForde wrote:
>
>>
>> Thank you for the replies.
>>
>> My script (below) reads in a list of locus_tags from a tab-delimited text
>> file, compares these locus_tags to the locus_tags in a GenBank file and,
>> where they are equal, adds new features.
>> The line
>> $feat->add_tag_value()
>> needs to be defined. In the BioPerl wiki this variable appears to be defined
>> by giving it coordinates etc. (creating a new feature). I wish to add
>> features to the CDS key when the locus_tags are identical. Is this possible?
>>
>> use strict;
>> use Bio::SeqIO;
>>
>> my @V;
>> open (LIST1, 'list') ||die;
>> while (<LIST1>){
>>     push @V, (split(/\t/, $_))[0];
>> }
>> close(LIST1);
>>
>> my $seqio_object = Bio::SeqIO->new(-file=>"Contig100.gb");
>> my $seq_object = $seqio_object->next_seq;
>>
>> for my $feat_object ($seq_object->get_SeqFeatures){
>>     if ($feat_object->primary_tag eq "CDS"){
>>         if ($feat_object->has_tag('locus_tag')){
>>             for my $V3 ($feat_object->get_tag_values('locus_tag')){
>>                 for my $V1 (@V) {
>>                     if ($V1 eq $V3){
>>                         ADD NEW FEATURES
>>                     }
>>                 }
>>             }
>>         }
>>     }
>> }
>>
>> The script works down as far as the comparison point, where locus_tags in the
>> GenBank file "Contig100.gb" are compared against a list of locus_tags from a
>> tab-delimited text file.
>>
>>
>> regards
>>
>> Brian
>>
>> Jason Stajich-5 wrote:
>>
>>>
>>> $feature->add_tag_value('color','blue');
>>>
>>> On Dec 9, 2011, at 8:52 AM, BForde wrote:
>>>
>>>
>>>> Hello all,
>>>>
>>>> I am new to BioPerl so I apologise if this is a stupid question.
>>>>
>>>> For CDS features I wish to add additional qualifiers, e.g. /colour and /note
>>>> qualifiers. I have looked at the BioPerl wiki but am still unsure how to
>>>> do this.
>>>>
>>>> regards
>>>>
>>>> Brian
>>>> --
>>>> View this message in context:
>>>> http://old.nabble.com/Genbank-files-tp32941955p32941955.html
>>>> Sent from the Perl - Bioperl-L mailing list archive at Nabble.com.
>>>>
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>
>>> Jason Stajich
>>> jason.stajich at gmail.com
>>> jason at bioperl.org
>>>
>>>
>>>
>>>
>>>
>>
>
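The hash lookup suggested in the quoted reply can be sketched in plain Perl,
without BioPerl. This is only a minimal sketch: the helper name
load_locus_tags and the sample locus tags are made up for illustration, and an
in-memory string stands in for the 'list' file. It shows why "chomp" matters:
a line holding only a locus tag would otherwise keep its trailing newline and
never match.

```perl
use strict;
use warnings;

# Read the first tab-separated column of each line into a hash,
# so that each later membership check is a single "exists" lookup.
# (load_locus_tags is a hypothetical helper, not part of BioPerl.)
sub load_locus_tags {
    my ($fh) = @_;
    my %seen;
    while (my $line = <$fh>) {
        chomp $line;              # drop the trailing newline
        next if $line eq '';      # skip blank lines
        my ($tag) = split /\t/, $line;
        $seen{$tag} = 1;
    }
    return \%seen;
}

# Hypothetical sample data standing in for the tab-delimited 'list' file.
# The second line has a single column, so its newline must be chomped.
my $sample = "ABC_0001\tfirst\nABC_0007\nABC_0042\tthird\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $listed = load_locus_tags($fh);
close $fh;

for my $tag (qw(ABC_0007 ABC_0099)) {
    printf "%s %s\n", $tag, exists $listed->{$tag} ? 'listed' : 'not listed';
}
```

With a real file, the in-memory handle would be replaced by an ordinary
open on the list file; the lookup itself stays the same.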
--
Brian Forde
Microbiology Dept.
Bioscience Institute. Room 4.11
University College Cork
Cork
Ireland
tel:+353 21 4901306
email: b.m.forde at umail.ucc.ie