[Biojava-l] RefSeq bioJava parser problem

Cox, Greg gcox@netgenics.com
Tue, 14 May 2002 15:34:15 -0400


Unfortunately, not.  This is probably the weakest point in BioJava's parsing
right now.  
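
In the meantime it isn't hard to roll your own.  Here is a minimal sketch in
plain, modern Java.  It assumes the annotation value comes back as a
java.util.Collection whose members print as strings, which is what the
[a, b] output suggests.  AnnotationValues, asList and asPairs are made-up
names for illustration, not BioJava API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of BioJava.
class AnnotationValues {

    // Normalise an annotation value to a list of strings: a Collection is
    // copied member by member, null becomes an empty list, and any other
    // object becomes a one-element list.
    static List<String> asList(Object value) {
        List<String> out = new ArrayList<>();
        if (value == null) {
            return out;
        }
        if (value instanceof Collection) {
            for (Object member : (Collection<?>) value) {
                out.add(String.valueOf(member));
            }
        } else {
            out.add(String.valueOf(value));
        }
        return out;
    }

    // Split "name:value" members at the first colon.  Members without a
    // colon are skipped; a repeated name overwrites the earlier value.
    static Map<String, String> asPairs(Object value) {
        Map<String, String> pairs = new LinkedHashMap<>();
        for (String member : asList(value)) {
            int colon = member.indexOf(':');
            if (colon > 0) {
                pairs.put(member.substring(0, colon).trim(),
                          member.substring(colon + 1).trim());
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(asList(Arrays.asList("98127055", "99357812")));
        System.out.println(asPairs(Arrays.asList("LocusID:946", "MIM:604405")));
    }
}
```

Note that a map keeps only one value per name, so if a feature can carry two
db_xref entries for the same database you would want a list of pairs instead.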

As you may have noticed, there's a more serious problem with the reference
information.  If a reference doesn't contain a field that others do, nothing
is added under that key for it, so the value lists kept under different keys
get out of sync.  For example:

REFERENCE
	TITLE foo
	TITLE bar
	AUTHOR wanner

When this gets turned into a BioJava sequence, TITLE has [foo, bar] and
AUTHOR has [wanner], but there's no way to tell which title wanner goes with.
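
One way around it is to keep one map per REFERENCE block rather than one list
per key, so a missing field is simply absent from that reference's map.  A
rough sketch in plain Java, not BioJava code: ReferenceGrouper and group are
made-up names, the input is a toy "KEY value" format with each reference
starting on its own REFERENCE line, and continuation lines of a real record
are not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of BioJava.
class ReferenceGrouper {

    // Start a new map at every REFERENCE line and file each following
    // "KEY value" line under the reference it belongs to.
    static List<Map<String, String>> group(List<String> lines) {
        List<Map<String, String>> refs = new ArrayList<>();
        Map<String, String> current = null;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] parts = line.split("\\s+", 2);
            String key = parts[0];
            String value = parts.length > 1 ? parts[1] : "";
            if (key.equals("REFERENCE")) {
                current = new LinkedHashMap<>();
                refs.add(current);
            } else if (current != null) {
                current.put(key, value);
            }
        }
        return refs;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "REFERENCE", "\tTITLE foo",
            "REFERENCE", "\tTITLE bar", "\tAUTHOR wanner");
        System.out.println(group(lines));
    }
}
```

With the two references above, the first map holds only TITLE=foo and the
second holds TITLE=bar and AUTHOR=wanner, so the author stays attached to its
own reference.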
Good luck

Greg
	

> -----Original Message-----
> From: wanner.de@pg.com [mailto:wanner.de@pg.com]
> Sent: Tuesday, May 14, 2002 11:41 AM
> To: biojava-l@biojava.org
> Subject: [Biojava-l] RefSeq bioJava parser problem
> 
> 
> Hi,
> 
> Appreciate the responses to the refSeq question. We've been able to put
> together a reliable parser using the example in TestRefSeqPrt.
> 
> Have an additional question now.   Are there any utility methods within
> bioJava that can be used to handle parsed values that are returned by
> bioJava in list form?
> 
> For example, the following value was returned from bioJava for a sequence
> annotation with key MEDLINE:
> 
>      [98127055, 99357812]
> 
> 
> Another example is the value that was returned from bioJava 
> for a feature annotation with key  db_xref:
> 
>      [LocusID:946, MIM:604405]
> 
> bioJava does good work in accumulating the information together and
> placing it under a specific annotation. Does anyone know if there is a
> method to extract list members or parameter/value pairs already available
> in bioJava?
> 
> thx,
> Dave
> 
> > > LOCUS       NP_000221                167 aa            linear   PRI 29-JAN-2002
> > > DEFINITION  leptin precursor; leptin (murine obesity homolog); obesity; obesity
> > >             (murine homolog, leptin) [Homo sapiens].
> > > ACCESSION   NP_000221
> > > PID         g4557715
> > > VERSION     NP_000221.1  GI:4557715
> > > DBSOURCE    REFSEQ: accession NM_000230.1
> > > KEYWORDS    .
> > > SOURCE      human.
> > >   ORGANISM  Homo sapiens
> > >             Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
> > >             Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
> > > REFERENCE   1  (residues 1 to 167)
> > >   AUTHORS   Friedman JM, Leibel RL, Siegel DS, Walsh J and Bahary N.
> > >   TITLE     Molecular mapping of the mouse ob mutation
> > >   JOURNAL   Genomics 11 (4), 1054-1062 (1991)
> > >   MEDLINE   92147101
> > >    PUBMED   1686014
> > > REFERENCE   2  (residues 1 to 167)
> > >   AUTHORS   Zhang Y, Proenca R, Maffei M, Barone M, Leopold L and Friedman JM.
> > >   TITLE     Positional cloning of the mouse obese gene and its human homologue
> > >   JOURNAL   Nature 372 (6505), 425-432 (1994)
> > >   MEDLINE   95075453
> > >    PUBMED   7984236
> > >   REMARK    Erratum:[[published erratum appears in Nature 1995 Mar
> > >             30;374(6521):479]]
> > > REFERENCE   3  (residues 1 to 167)
> > >   AUTHORS   Masuzaki H, Ogawa Y, Isse N, Satoh N, Okazaki T, Shigemoto M, Mori
> > >             K, Tamura N, Hosoda K, Yoshimasa Y et al.
> > >   TITLE     Human obese gene expression. Adipocyte-specific expression and
> > >             regional differences in the adipose tissue
> > >   JOURNAL   Diabetes 44 (7), 855-858 (1995)
> > >   MEDLINE   95309556
> > >    PUBMED   7789654
> > > REFERENCE   4  (residues 1 to 167)
> > >   AUTHORS   Green ED, Maffei M, Braden VV, Proenca R, DeSilva U, Zhang Y, Chua
> > >             SC Jr, Leibel RL, Weissenbach J and Friedman JM.
> > >   TITLE     The human obese (OB) gene: RNA expression pattern and mapping on
> > >             the physical, cytogenetic, and genetic maps of chromosome 7
> > >   JOURNAL   Genome Res. 5 (1), 5-12 (1995)
> > >   MEDLINE   96352898
> > >    PUBMED   8717050
> > > REFERENCE   5  (residues 1 to 167)
> > >   AUTHORS   Isse N, Ogawa Y, Tamura N, Masuzaki H, Mori K, Okazaki T, Satoh N,
> > >             Shigemoto M, Yoshimasa Y, Nishi S et al.
> > >   TITLE     Structural organization and chromosomal assignment of the human
> > >             obese gene
> > >   JOURNAL   J. Biol. Chem. 270 (46), 27728-27733 (1995)
> > >   MEDLINE   96070903
> > >    PUBMED   7499240
> > > REFERENCE   6  (residues 1 to 167)
> > >   AUTHORS   Gong,D.W., Bi,S., Pratley,R.E. and Weintraub,B.D.
> > >   TITLE     Genomic structure and promoter analysis of the human obese gene
> > >   JOURNAL   J. Biol. Chem. 271 (8), 3971-3974 (1996)
> > >   MEDLINE   96223958
> > > REFERENCE   7  (residues 1 to 167)
> > >   AUTHORS   Niki T, Mori H, Tamori Y, Kishimoto-Hashirmoto M, Ueno H, Araki S,
> > >             Masugi J, Sawant N, Majithia HR, Rais N et al.
> > >   TITLE     Human obese gene: molecular screening in Japanese and Asian Indian
> > >             NIDDM patients associated with obesity
> > >   JOURNAL   Diabetes 45 (5), 675-678 (1996)
> > >   MEDLINE   96198511
> > >    PUBMED   8621021
> > > REFERENCE   8  (residues 1 to 167)
> > >   AUTHORS   Comuzzie,A.G., Hixson,J.E., Almasy,L., Mitchell,B.D., Mahaney,M.C.,
> > >             Dyer,T.D., Stern,M.P., MacCluer,J.W. and Blangero,J.
> > >   TITLE     A major quantitative trait locus determining serum leptin levels
> > >             and fat mass is located on human chromosome 2
> > >   JOURNAL   Nat. Genet. 15 (3), 273-276 (1997)
> > >   MEDLINE   97207647
> > >    PUBMED   9054940
> > > REFERENCE   9  (residues 1 to 167)
> > >   AUTHORS   Clement,K., Vaisse,C., Lahlou,N., Cabrol,S., Pelloux,V.,
> > >             Cassuto,D., Gourmelen,M., Dina,C., Chambaz,J., Lacorte,J.M.,
> > >             Basdevant,A., Bougneres,P., Lebouc,Y., Froguel,P. and Guy-Grand,B.
> > >   TITLE     A mutation in the human leptin receptor gene causes obesity and
> > >             pituitary dysfunction
> > >   JOURNAL   Nature 392 (6674), 398-401 (1998)
> > >   MEDLINE   98196670
> > >    PUBMED   9537324
> > > REFERENCE   10 (residues 1 to 167)
> > >   AUTHORS   Friedman,J.M. and Halaas,J.L.
> > >   TITLE     Leptin and the regulation of body weight in mammals
> > >   JOURNAL   Nature 395 (6704), 763-770 (1998)
> > >   MEDLINE   99010835
> > > COMMENT     REVIEWED REFSEQ: This record has been curated by NCBI staff. The
> > >             reference sequence was derived from U43653.1.
> > >             Summary: This gene is similar to the mouse obesity gene (ob). The
> > >             protein encoded by this gene is secreted by white adipocytes. In
> > >             the mouse study, mutations in this gene are linked to severe and
> > >             morbid obesity.
> > > FEATURES             Location/Qualifiers
> > >      source          1..167
> > >                      /organism="Homo sapiens"
> > >                      /db_xref="taxon:9606"
> > >                      /chromosome="7"
> > >                      /map="7q31.3"
> > >      Protein         1..167
> > >                      /product="leptin precursor"
> > >                      /note="leptin (murine obesity homolog); obesity (murine
> > >                      homolog, leptin)"
> > >      sig_peptide     1..21
> > >      Region          22..167
> > >                      /region_name="Leptin"
> > >                      /note="Leptin"
> > >                      /db_xref="CDD:pfam02024"
> > >      mat_peptide     22..167
> > >                      /product="leptin"
> > >      CDS             1..167
> > >                      /gene="LEP"
> > >                      /coded_by="NM_000230.1:57..560"
> > >                      /db_xref="LocusID:3952"
> > >                      /db_xref="MIM:164160"
> > > ORIGIN
> > >         1 mhwgtlcgfl wlwpylfyvq avpiqkvqdd tktliktivt rindishtqs vsskqkvtgl
> > >        61 dfipglhpil tlskmdqtla vyqqiltsmp srnviqisnd lenlrdllhv lafskschlp
> > >       121 wasgletlds lggvleasgy stevvalsrl qgslqdmlwq ldlspgc
> > > //
> > >
> > > _______________________________________________
> > > Biojava-l mailing list  -  Biojava-l@biojava.org
> > > http://biojava.org/mailman/listinfo/biojava-l