[Bioperl-l] Primary seq primary_id?

Wiepert, Mathieu Wiepert.Mathieu@mayo.edu
Fri, 8 Nov 2002 22:03:42 -0600


Hi,

Yep, whatever you did fixed it. I still changed my code to use display_id, as you suggested; thanks for that bit of advice. RemoteBlast was changed accordingly.

-Mat

-----Original Message-----
From: Hilmar Lapp [mailto:hlapp@gnf.org]
Sent: Thursday, November 07, 2002 6:10 PM
To: Wiepert, Mathieu
Cc: Bioperl
Subject: Re: [Bioperl-l] Primary seq primary_id?


I think I found and fixed the bug in SeqFastaSpeedFactory. It was 
unaware that Bio::Seq does not delegate primary_id.
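
To make the delegation point concrete, here is a minimal sketch using toy classes. ToyPrimarySeq, ToySeq, and delegated_primary_id are hypothetical stand-ins, not the actual BioPerl source; the fallback to the stringified object is assumed from the behaviour described further down in this thread. It shows why a wrapper with its own primary_id slot never sees the value stored on the inner object, while a delegating accessor does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy stand-in for Bio::PrimarySeq (hypothetical): stores the id it is given.
package ToyPrimarySeq;
sub new {
    my ($class, %args) = @_;
    return bless { primary_id => $args{-primary_id} }, $class;
}
sub primary_id {
    my $self = shift;
    $self->{primary_id} = shift if @_;
    return $self->{primary_id};
}

# Toy stand-in for Bio::Seq (hypothetical): wraps a primary-seq object.
package ToySeq;
sub new {
    my ($class, %args) = @_;
    return bless { primary_seq => $args{-primary_seq} }, $class;
}
sub primary_seq { return $_[0]->{primary_seq} }

# Non-delegating accessor: uses its own slot and falls back to the
# stringified object when that slot was never set.
sub primary_id {
    my $self = shift;
    $self->{primary_id} = shift if @_;
    return exists $self->{primary_id} ? $self->{primary_id} : "$self";
}

# Delegating accessor: forwards reads and writes to the wrapped object.
sub delegated_primary_id {
    my $self = shift;
    return $self->primary_seq->primary_id(@_);
}

package main;

# The id is set only on the inner object, as a factory unaware of the
# non-delegation would do.
my $inner = ToyPrimarySeq->new(-primary_id => 'CYS1_DICDI');
my $seq   = ToySeq->new(-primary_seq => $inner);

print $seq->primary_id, "\n";            # stringified reference, ToySeq=HASH(0x...)
print $seq->delegated_primary_id, "\n";  # CYS1_DICDI
```

In this sketch, a factory has two ways to get the expected string back: set the id on the wrapper as well, or have the wrapper delegate.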

I believe you can CVS checkout/update from the repository, right? 
Can you check whether the problem is solved?

	-hilmar

On Thursday, November 7, 2002, at 12:32 PM, Wiepert, Mathieu wrote:

> Hi,
>
> Just so I can get this straight, fasta.pm is parsing my seq, 
> eventually the primary_id is set in SeqFastaSpeedFactory, for the 
> PrimarySeq object it creates.  Since I can look at the $seq object 
> after this, and see that primary_id is set, I can expect 
> Bio::Seq::primary_id to send it back to me?
>
> I had the same question as you about the POD: why is this method 
> *not* delegated to the internal PrimarySeq object?
>
>
> If I ignore the POD, and change the code for the sub primary_id to be
>
> sub primary_id {
>  return shift->primary_seq->primary_id(@_);
> }
>
> Then my program works, and the Seq, SeqIO tests still pass.  Is this 
> not a good fix though?
>
>
>
> FYI
> This is the very simple test program I have:
>
> #!/usr/bin/perl -w
> use Bio::SeqIO;
> use strict;
>
> my $seq = Bio::SeqIO->new(-file=>'amino.fa' , '-format' => 'fasta' );
> my $input = $seq->next_seq();
> my $primary_id = $input->primary_id;
> print $primary_id;
>
> And the object $input always looks like this (except the hash 
> reference of course ;-)
>
> 0  Bio::Seq=HASH(0x8626904)
>    'primary_seq' => Bio::PrimarySeq=HASH(0x86268e0)
>       'alphabet' => 'protein'
>       'desc' => 'fragment'
>       'display_id' => 'CYS1_DICDI'
>       'primary_id' => 'CYS1_DICDI'
>       'seq' => 'SCWSFSTTGNVEGQHFISQNKLVSLSEQNLVDCDHECMEYEGE'
>
> I'll go submit a bug I guess.
>
> Thanks for the help,
>
> -Mat
>
>> -----Original Message-----
>> From: Hilmar Lapp [mailto:hlapp@gnf.org]
>> Sent: Thursday, November 07, 2002 2:11 PM
>> To: Hilmar Lapp; Wiepert, Mathieu; bioperl-l@bioperl.org
>> Subject: RE: [Bioperl-l] Primary seq primary_id?
>>
>>
>> Sorry I was too fast. Please file it as a bug report.
>>
>> First, the POD of Bio::Seq::primary_id explicitly states that
>> it is not delegated to the primary_seq. Can anyone remember
>> why this is or why this should stay?
>>
>> Second, Bio::Seq::new does recognize and honor -primary_id, I
>> overlooked it. Can't be the problem.
>>
>> Needs to be investigated. Feel welcome to do so ...
>>
>> 	-hilmar
>>
>>> -----Original Message-----
>>> From: Hilmar Lapp
>>> Sent: Thursday, November 07, 2002 12:04 PM
>>> To: 'Wiepert, Mathieu'; bioperl-l@bioperl.org
>>> Subject: RE: [Bioperl-l] Primary seq primary_id?
>>>
>>>
>>> By calling $input->primary_id() :) Interestingly I just
>>> realized the fasta parser is among the few that set this
>>> property. It also appears to be recognized by PrimarySeq::new
>>> ... weird. File it as a bug report, I or others need to see
>>> whether we can reproduce this.
>>>
>>> You rarely want primary_id() BTW. A primary_id would be the
>>> GenBank GI number as an example. Usually what you're after
>>> for fasta-returned seqs is display_id.
>>>
>>> Ahem. I just see this _IS_ a bug. The problem is Bio::Seq
>>> implements primary_id itself, which it shouldn't do (it
>>> should delegate to the primary_seq object). Bio::Seq::new
>>> doesn't honor -primary_id (which is OK if it delegated).
>>>
>>> I'll fix this in a second.
>>>
>>> 	-hilmar
>>>
>>>> -----Original Message-----
>>>> From: Wiepert, Mathieu [mailto:Wiepert.Mathieu@mayo.edu]
>>>> Sent: Thursday, November 07, 2002 11:49 AM
>>>> To: Hilmar Lapp; bioperl-l@bioperl.org
>>>> Subject: RE: [Bioperl-l] Primary seq primary_id?
>>>>
>>>>
>>>> Hi,
>>>>
>>>> So I am confused then.  The primary_id is set, that is what I
>>>> wanted, the object looks like this.  Should the primary_id
>>>> slot not be filled in this case?  The primary id was set in
>>>> the fasta.pm module, in the next_seq sub.  I don't have an
>>>> accession number.
>>>>
>>>> This is what the object is looking like to me...
>>>> 0  Bio::Seq=HASH(0x853cfe0)
>>>>    'primary_seq' => Bio::PrimarySeq=HASH(0x853cfbc)
>>>>       'alphabet' => 'protein'
>>>>       'desc' => 'fragment'
>>>>       'display_id' => 'CYS1_DICDI'
>>>>       'primary_id' => 'CYS1_DICDI'
>>>>       'seq' => 'SCWSFSTTGNVEGQHFISQNKLVSLSEQNLVDCDHECMEYEGE'
>>>>
>>>> How am I supposed to get CYS1_DICDI from the primary_id field?
>>>>
>>>> -Mat
>>>>
>>>>> -----Original Message-----
>>>>> From: Hilmar Lapp [mailto:hlapp@gnf.org]
>>>>> Sent: Thursday, November 07, 2002 1:42 PM
>>>>> To: Wiepert, Mathieu; bioperl-l@bioperl.org
>>>>> Subject: RE: [Bioperl-l] Primary seq primary_id?
>>>>>
>>>>>
>>>>> You do get a string. It's just the memory location of the
>>>>> object to fulfill the requirement to return something which
>>>>> is unique in the application. If you don't like that string,
>>>>> set e.g. $input->primary_id($input->accession_number).
>>>>>
>>>>> 	-hilmar
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Wiepert, Mathieu [mailto:Wiepert.Mathieu@mayo.edu]
>>>>>> Sent: Thursday, November 07, 2002 10:42 AM
>>>>>> To: 'bioperl-l@bioperl.org'
>>>>>> Subject: [Bioperl-l] Primary seq primary_id?
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am pretty sure that something is messed up for me.  When I
>>>>>> call Bio::Seq to get the primary_id of a sequence, I no
>>>>>> longer get a string...
>>>>>>
>>>>>> my $seq = Bio::SeqIO->new(-file=>'amino.fa' , '-format' => 'fasta' );
>>>>>> my $input = $seq->next_seq();
>>>>>> my $primary_id = $input->primary_id;
>>>>>> print $primary_id;
>>>>>>
>>>>>> gives me
>>>>>>
>>>>>> Bio::Seq=HASH(0x82d88d4)
>>>>>>
>>>>>> Is there something really silly that I missed somewhere?  I
>>>>>> used to get strings...
>>>>>>
>>>>>> -Mat
>>>>>> _______________________________________________
>>>>>> Bioperl-l mailing list
>>>>>> Bioperl-l@bioperl.org
>>>>>> http://bioperl.org/mailman/listinfo/bioperl-l
--
-------------------------------------------------------------
Hilmar Lapp                            email: lapp at gnf.org
GNF, San Diego, Ca. 92121              phone: +1-858-812-1757
-------------------------------------------------------------