[Bioperl-l] Primary seq primary_id?
Hilmar Lapp
hlapp@gnf.org
Thu, 7 Nov 2002 16:10:27 -0800
I think I found and fixed the bug in SeqFastaSpeedFactory. It was
unaware that Bio::Seq does not delegate primary_id.
I believe you can CVS checkout/update from the repository, right?
Can you check whether the problem is solved?
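
For anyone following the thread, here is a minimal sketch of the behavior under discussion. It does not use BioPerl at all: Sketch::PrimarySeq, Sketch::Seq, and delegated_primary_id are hypothetical stand-ins that I made up for illustration, and the fallback to the stringified object is my assumption based on the output Mat reported. The real Bio::Seq code may differ in detail.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::PrimarySeq: it stores primary_id itself.
package Sketch::PrimarySeq;
sub new {
    my ($class, %args) = @_;
    return bless { primary_id => $args{-primary_id} }, $class;
}
sub primary_id {
    my $self = shift;
    $self->{primary_id} = shift if @_;
    return $self->{primary_id};
}

# Hypothetical stand-in for Bio::Seq: it wraps a PrimarySeq-like object.
package Sketch::Seq;
sub new {
    my ($class, %args) = @_;
    return bless { primary_seq => $args{-primary_seq} }, $class;
}
sub primary_seq { return $_[0]->{primary_seq} }

# Non-delegating accessor: it only looks at the wrapper's own slot and
# (by assumption) falls back to the stringified object when that is empty.
sub primary_id {
    my $self = shift;
    $self->{primary_id} = shift if @_;
    return defined $self->{primary_id} ? $self->{primary_id} : "$self";
}

# Delegating accessor, in the style of the fix proposed in the thread.
sub delegated_primary_id {
    return shift->primary_seq->primary_id(@_);
}

package main;

# The id is set on the inner object only, as a factory might do.
my $pseq = Sketch::PrimarySeq->new(-primary_id => 'CYS1_DICDI');
my $seq  = Sketch::Seq->new(-primary_seq => $pseq);

print $seq->primary_id, "\n";            # stringified object, e.g. Sketch::Seq=HASH(0x...)
print $seq->delegated_primary_id, "\n";  # the id stored on the inner object
```

The point of the sketch is only that an id set on the inner object is invisible to a wrapper accessor that does not delegate, so whatever builds the objects has to set the id on the wrapper as well.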
-hilmar
On Thursday, November 7, 2002, at 12:32 PM, Wiepert, Mathieu wrote:
> Hi,
>
> Just so I can get this straight: fasta.pm parses my seq, and
> eventually the primary_id is set in SeqFastaSpeedFactory for the
> PrimarySeq object it creates. Since I can look at the $seq object
> after this and see that primary_id is set, can I expect
> Bio::Seq::primary_id to send it back to me?
>
> I had the same question as you about the POD, why is this method
> *not* delegated to the internal PrimarySeq object?
>
>
> If I ignore the POD, and change the code for the sub primary_id to be
>
> sub primary_id {
>     return shift->primary_seq->primary_id(@_);
> }
>
> Then my program works, and the Seq and SeqIO tests still pass. Is this
> not a good fix though?
>
>
>
> FYI
> This is the very simple test program I have:
>
> #!/usr/bin/perl -w
> use Bio::SeqIO;
> use strict;
>
> my $seq = Bio::SeqIO->new(-file=>'amino.fa' , '-format' => 'fasta' );
> my $input = $seq->next_seq();
> my $primary_id = $input->primary_id;
> print $primary_id;
>
> And the object $input always looks like this (except the hash
> reference of course ;-)
>
> 0  Bio::Seq=HASH(0x8626904)
>    'primary_seq' => Bio::PrimarySeq=HASH(0x86268e0)
>       'alphabet' => 'protein'
>       'desc' => 'fragment'
>       'display_id' => 'CYS1_DICDI'
>       'primary_id' => 'CYS1_DICDI'
>       'seq' => 'SCWSFSTTGNVEGQHFISQNKLVSLSEQNLVDCDHECMEYEGE'
>
> I'll go submit a bug I guess.
>
> Thanks for the help,
>
> -Mat
>
>> -----Original Message-----
>> From: Hilmar Lapp [mailto:hlapp@gnf.org]
>> Sent: Thursday, November 07, 2002 2:11 PM
>> To: Hilmar Lapp; Wiepert, Mathieu; bioperl-l@bioperl.org
>> Subject: RE: [Bioperl-l] Primary seq primary_id?
>>
>>
>> Sorry I was too fast. Please file it as a bug report.
>>
>> First, the POD of Bio::Seq::primary_id explicitly states that
>> it is not delegated to the primary_seq. Can anyone remember
>> why this is or why this should stay?
>>
>> Second, Bio::Seq::new does recognize and honor -primary_id, I
>> overlooked it. Can't be the problem.
>>
>> Needs to be investigated. Feel welcome to do so ...
>>
>> -hilmar
>>
>>> -----Original Message-----
>>> From: Hilmar Lapp
>>> Sent: Thursday, November 07, 2002 12:04 PM
>>> To: 'Wiepert, Mathieu'; bioperl-l@bioperl.org
>>> Subject: RE: [Bioperl-l] Primary seq primary_id?
>>>
>>>
>>> By calling $input->primary_id() :) Interestingly I just
>>> realized the fasta parser is among the few that set this
>>> property. It also appears to be recognized by PrimarySeq::new
>>> ... weird. File it as a bug report, I or others need to see
>>> whether we can reproduce this.
>>>
>>> You rarely want primary_id() BTW. A primary_id would be the
>>> GenBank GI number as an example. Usually what you're after
>>> for fasta-returned seqs is display_id.
>>>
>>> Ahem. I just see this _IS_ a bug. The problem is Bio::Seq
>>> implements primary_id itself, which it shouldn't do (it
>>> should delegate to the primary_seq object). Bio::Seq::new
>>> doesn't honor -primary_id (which is OK if it delegated).
>>>
>>> I'll fix this in a second.
>>>
>>> -hilmar
>>>
>>>> -----Original Message-----
>>>> From: Wiepert, Mathieu [mailto:Wiepert.Mathieu@mayo.edu]
>>>> Sent: Thursday, November 07, 2002 11:49 AM
>>>> To: Hilmar Lapp; bioperl-l@bioperl.org
>>>> Subject: RE: [Bioperl-l] Primary seq primary_id?
>>>>
>>>>
>>>> Hi,
>>>>
>>>> So I am confused then. The primary_id is set, that is what I
>>>> wanted, the object looks like this. Should the primary_id
>>>> slot not be filled in this case? The primary id was set in
>>>> the fasta.pm module, in the next_seq sub. I don't have an
>>>> accession number.
>>>>
>>>> This is what the object is looking like to me...
>>>> 0  Bio::Seq=HASH(0x853cfe0)
>>>>    'primary_seq' => Bio::PrimarySeq=HASH(0x853cfbc)
>>>>       'alphabet' => 'protein'
>>>>       'desc' => 'fragment'
>>>>       'display_id' => 'CYS1_DICDI'
>>>>       'primary_id' => 'CYS1_DICDI'
>>>>       'seq' => 'SCWSFSTTGNVEGQHFISQNKLVSLSEQNLVDCDHECMEYEGE'
>>>>
>>>> How am I supposed to get CYS1_DICDI from the primary_id field?
>>>>
>>>> -Mat
>>>>
>>>>> -----Original Message-----
>>>>> From: Hilmar Lapp [mailto:hlapp@gnf.org]
>>>>> Sent: Thursday, November 07, 2002 1:42 PM
>>>>> To: Wiepert, Mathieu; bioperl-l@bioperl.org
>>>>> Subject: RE: [Bioperl-l] Primary seq primary_id?
>>>>>
>>>>>
>>>>> You do get a string. It's just the memory location of the
>>>>> object to fulfill the requirement to return something which
>>>>> is unique in the application. If you don't like that string,
>>>>> set e.g. $input->primary_id($input->accession_number).
>>>>>
>>>>> -hilmar
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Wiepert, Mathieu [mailto:Wiepert.Mathieu@mayo.edu]
>>>>>> Sent: Thursday, November 07, 2002 10:42 AM
>>>>>> To: 'bioperl-l@bioperl.org'
>>>>>> Subject: [Bioperl-l] Primary seq primary_id?
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am pretty sure that something is messed up for me. When I
>>>>>> call Bio::Seq to get the primary_id of a sequence, I no
>>>>>> longer get a string...
>>>>>>
>>>>>> my $seq = Bio::SeqIO->new(-file=>'amino.fa' , '-format' => 'fasta' );
>>>>>> my $input = $seq->next_seq();
>>>>>> my $primary_id = $input->primary_id;
>>>>>> print $primary_id;
>>>>>>
>>>>>> gives me
>>>>>>
>>>>>> Bio::Seq=HASH(0x82d88d4)
>>>>>>
>>>>>> Is there something really silly that I missed somewhere? I
>>>>>> used to get strings...
>>>>>>
>>>>>> -Mat
--
-------------------------------------------------------------
Hilmar Lapp email: lapp at gnf.org
GNF, San Diego, Ca. 92121 phone: +1-858-812-1757
-------------------------------------------------------------