[Bioperl-l] Re: Ready!

Jason Stajich jason@chg.mc.duke.edu
Wed, 6 Jun 2001 09:25:12 -0400 (EDT)


Including the list so I won't be a single point of failure on information
or suggestions.

On Tue, 5 Jun 2001, Roger Hall wrote:

> In sub to_FTstring, when location->start == location->end, only the
> location->start value is returned by design. I would guess that this has
> more than one reason, and would not be the place to apply changes for 955.
> 

Clearly not my best design, but I was trying to get EMBL/GenBank
FeatureTable printing working first, as that was the thorn in my side.
Truthfully, to_FTstring should only be used directly by embl/genbank.

There are a couple of options: extend to_FTstring to know how to print
itself in swiss, genbank, and embl formats based on an input parameter (bad
design methodology, since it requires a class to know all of its users), or
follow your solution below.

I vote for adding the code in swiss::_print_swissprot_FTHelper as you
suggest. We may need to do some testing with split/fuzzy locations and the
swissprot feature table to make sure we handle those correctly too, since
they also come from to_FTstring.
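
For the sake of discussion, here is a rough sketch of the kind of check the
swiss writer could make on the string it gets back from to_FTstring. It is
only an illustration: ft_start_end is a made-up helper name, not an existing
bioperl method. I have anchored the patterns so that anything other than a
plain range or a plain single position (split or fuzzy locations, say)
returns nothing and is left for the caller to deal with. That is an
assumption on my part, not how the current code behaves.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of bioperl): turn a feature-table
# location string into a (start, end) pair.
sub ft_start_end {
    my ($loc) = @_;

    # Plain range, e.g. "1..245"
    if ( $loc =~ /^(\d+)\.\.(\d+)$/ ) {
        return ( $1, $2 );
    }

    # Single position, e.g. "185": start and end are the same residue
    if ( $loc =~ /^(\d+)$/ ) {
        return ( $1, $1 );
    }

    # Split or fuzzy locations are not handled in this sketch
    return;
}

# Example use with a single-position location
my ( $start, $end ) = ft_start_end("185");
printf "FT   %-8s %6d %6d\n", 'MOD_RES', $start, $end;
```

Testing the result of the match before reading $1 and $2 also avoids picking
up values left over from an earlier successful match, which Perl does not
reset when a later match fails.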

>     my($self) = @_;
>     if( $self->start == $self->end ) {
>        return $self->start;
>     }
>     my $str = $self->start . ".." . $self->end;
>     return $str;
> 
> Instead, a possibly better place to handle this is in
> swiss::_print_swissprot_FTHelper. It currently assumes that there is a
> string that represents a range:
> 
>    $fth->loc =~ /(\d+)\.\.(\d+)/;
>    $start = $1;
>    $end = $2;
> 
> But since to_FTstring returns only one value, and not a string that
> represents a range, no matches are found. I have solved the problem with the
> addition of this line immediately following '$end = $2;':
> 
>    if (!$start) { $start = $end = $fth->loc; }
> 
> This assumes that if there is not a range, then there is one value, AND that
> value is the start and end of the feature range.
> 
> I recommend this change if this problem is localized to Swiss, and not a
> general protein problem. Otherwise, to_FTstring might need to be rethought.
> 
> Which way to go?
> 
> And please verify: I should commit the change in the branch-07 branch AND in
> the bioperl-live branch...
> 
Yes, you should, since this is a bug on both branches.

> Thanks!
> 
> Roger
> 
> The new output:
> 
> ID   143B_BOVIN     STANDARD;      PRT;   245 AA.
> AC   P29358;
> DT   01-DEC-1992 (Rel. 24, Created)
> DT   01-FEB-1996 (Rel. 33, Last sequence update)
> DT   30-MAY-2000 (Rel. 39, Last annotation update)
> DE   14-3-3 PROTEIN BETA/ALPHA (PROTEIN KINASE C INHIBITOR PROTEIN-1) (KCIP-1).
> GN   YWHAB
> OS   Bos taurus (Bovine), and Ovis aries (Sheep).
> OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
> OC   Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea; Bovidae; Bovinae;
> OC   Bos.
> RN   [1]
> RP   SEQUENCE.
> RC   SPECIES=Bovine;
> RX   MEDLINE=91108808; PubMed=1671102;
> RA   Isobe T., Ichimura T., Sunaya T., Okuyama T., Takahashi N., Kuwano R.,
> RA   Takahashi Y.
> RT   "Distinct forms of the protein kinase-dependent activator of tyrosine and
> RT   tryptophan hydroxylases."
> RL   J. Mol. Biol. 217:125-132(1991).
> RN   [2]
> RP   SEQUENCE OF 2-83; 121-186 AND 199-241.
> RC   SPECIES=Sheep; TISSUE=BRAIN;
> RX   MEDLINE=92283271; PubMed=1317796;
> RA   Toker A., Sellers L.A., Amess B., Patel Y., Harris A., Aitken A.
> RT   "Multiple isoforms of a protein kinase C inhibitor (KCIP-1/14-3-3) from
> RT   sheep brain. Amino acid sequence of phosphorylated forms."
> RL   Eur. J. Biochem. 206:453-461(1992).
> RN   [3]
> RP   SEQUENCE OF 2-23.
> RC   SPECIES=Sheep; TISSUE=BRAIN;
> RX   MEDLINE=90345949; PubMed=2143472;
> RA   Toker A., Ellis C.A., Sellers L.A., Aitken A.
> RT   "Protein kinase C inhibitor proteins. Purification from sheep brain and
> RT   sequence similarity to lipocortins and 14-3-3 protein."
> RL   Eur. J. Biochem. 191:421-429(1990).
> RN   [4]
> RP   PHOSPHORYLATION.
> RC   SPECIES=Sheep;
> RX   MEDLINE=95197587; PubMed=7890696;
> RA   Aitken A., Howell S., Jones D., Madrazo J., Patel Y.
> RT   "14-3-3 alpha and delta are the phosphorylated forms of raf-activating
> RT   14-3-3 beta and zeta. In vivo stoichiometric phosphorylation in brain at a
> RT   Ser-Pro-Glu-Lys motif."
> RL   J. Biol. Chem. 270:5706-5709(1995).
> RN   [5]
> RP   POST-TRANSLATIONAL MODIFICATIONS.
> RC   SPECIES=Sheep;
> RA   Aitken A., Patel Y., Martin H., Jones D., Robinson K., Madrazo J., Howell
> RA   S.
> RT   "Electrospray mass spectroscopy analysis with online trapping of
> RT   posttranslationally modified mammalian and avian brain 14-3-3 isoforms."
> RL   J. Protein Chem. 13:463-465(1994).
> CC   -!- FUNCTION: ACTIVATES TYROSINE AND TRYPTOPHAN HYDROXYLASES IN THE
> CC       PRESENCE OF CA(2+)/CALMODULIN-DEPENDENT PROTEIN KINASE II, AND
> CC       STRONGLY ACTIVATES PROTEIN KINASE C. IS PROBABLY A MULTIFUNCTIONAL
> CC       REGULATOR OF THE CELL SIGNALING PROCESSES MEDIATED BY BOTH
> CC       KINASES.
> CC   -!- SUBUNIT: HOMODIMER.
> CC   -!- SUBCELLULAR LOCATION: CYTOPLASMIC.
> CC   -!- ALTERNATIVE PRODUCTS: TWO FORMS ARE PRODUCED BY ALTERNATIVE
> CC       INITIATION.
> CC   -!- TISSUE SPECIFICITY: 14-3-3 PROTEINS ARE LOCALIZED IN NEURONS, AND
> CC       ARE AXONALLY TRANSPORTED TO THE NERVE TERMINALS. THEY MAY BE ALSO
> CC       PRESENT, AT LOWER LEVELS, IN VARIOUS OTHER EUKARYOTIC TISSUES.
> CC   -!- PTM: ISOFORM ALPHA DIFFERS FROM ISOFORM BETA IN BEING
> CC       PHOSPHORYLATED.
> CC   -!- SIMILARITY: BELONGS TO THE 14-3-3 FAMILY.
> DR   PIR; S13467; S13467.
> DR   PIR; S10804; S10804.
> DR   PIR; S23179; S23179.
> DR   InterPro; IPR000308; 14-3-3.
> DR   Pfam; PF00244; 14-3-3.
> DR   PRINTS; PR00305; 1433ZETA.
> DR   SMART; SM00101; 14_3_3.
> DR   PROSITE; PS00796; 1433_1.
> DR   PROSITE; PS00797; 1433_2.
> KW   Brain; Neurone; Phosphorylation; Acetylation; Multigene family;
> KW   Alternative initiation.
> FT   INIT_MET      0      0
> FT   CHAIN         1    245       14-3-3 PROTEIN BETA/ALPHA, LONG ISOFORM.
> FT   CHAIN         2    245       14-3-3 PROTEIN BETA/ALPHA, SHORT ISOFORM.
> FT   INIT_MET      2      2       FOR SHORT ISOFORM.
> FT   MOD_RES       1      1       ACETYLATION.
> FT   MOD_RES       2      2       ACETYLATION (IN SHORT ISOFORM).
> FT   MOD_RES     185    185       PHOSPHORYLATION.
> SQ   SEQUENCE   245 AA;  27950 MW;  AA91C2314D99549F CRC64;
>      TMDKSELVQK AKLAEQAERY DDMAAAMKAV TEQGHELSNE ERNLLSVAYK NVVGARRSSW
>      RVISSIEQKT ERNEKKQQMG KEYREKIEAE LQDICNDVLQ LLDKYLIPNA TQPESKVFYL
>      KMKGDYFRYL SEVASGDNKQ TTVSNSQQAY QEAFEISKKE MQPTHPIRLG LALNFSVFYY
>      EILNSPEKAC SLAKTAFDEA IAELDTLNEE SYKDSTLIMQ LLRDNLTLWT SENQGDEGDA
>      GEGEN
> //
> 
> 


Jason Stajich
jason@chg.mc.duke.edu
Center for Human Genetics
Duke University Medical Center 
http://www.chg.duke.edu/