[Bioperl-l] GenBank gene field

Jason Stajich jason.stajich at duke.edu
Fri Jan 21 22:23:51 EST 2005


You should probably ask the data providers....

On Jan 21, 2005, at 7:21 PM, Alexander Kozik wrote:

> Please take a look at two sample records from GenBank files
> (Arabidopsis and C. elegans).
> The C. elegans file has "/gene" entries for both the "gene" and "CDS" fields.
> The Arabidopsis file has no "/gene" entries at all.
> The previous version of the Arabidopsis GenBank file did have "/gene" entries.
> Could you help me understand why this happens, and which entry you would
> suggest extracting if a user wants the corresponding gene names?
> Am I using the terms "entry" and "field" properly?
>
> Thanks a lot in advance,
>
> Alexander Kozik
> Bioinformatics Specialist
> Genome and Biomedical Sciences Facility
> 451 East Health Sciences Drive
> University of California
> Davis, CA 95616-8816
> Phone: (530) 754-9127
> email: akozik at atgc.org
> web: http://www.atgc.org/
>
> ----
>
> Arabidopsis GenBank file NC_003070.gbk:
>
>     gene            complement(38753..40944)
>                     /locus_tag="At1g01070"
>                     /note="synonym: T25K16.7; nodulin MtN21 family protein"
>                     /db_xref="GeneID:839550"
> ...
>     CDS             complement(join(38898..39054,39136..39287,39409..39814,
>                     40213..40329,40473..40535,40675..40877))
>                     /locus_tag="At1g01070"
>                     /note="similar to MtN21 GI:2598575 (root nodule
>                     development) from [Medicago truncatula]"
>                     /codon_start=1
>                     /protein_id="NP_563617.1"
>                     /db_xref="GI:18378792"
>                     /db_xref="GeneID:839550"
>                     /translation="MAG...
> ----
>
> C.elegans GenBank file NC_003279.gbk:
>
>     gene            43733..44677
>                     /gene="1A519"
>                     /locus_tag="1A519"
>                     /synonym="Y74C9A.1"
>                     /note="Title: Caenorhabditis elegans expressed gene
>                     1A519."
> ...
>     CDS             join(43733..43961,44030..44234,44281..44328,44521..44677)
>                     /gene="1A519"
>                     /locus_tag="1A519"
>                     /codon_start=1
>                     /product="putative protein (1A519)"
>                     /protein_id="17510627"
>                     /db_xref="GI:17510627"
> ...
>
>
>
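
On the extraction question itself, one possible approach is to prefer /gene and fall back to /locus_tag when /gene is absent, since both quoted records carry /locus_tag. The sketch below is only an illustration of that fallback in plain Python (standard library only, not BioPerl). The function names, the SAMPLE text (trimmed from the two records quoted above and placed in one table purely for demonstration), and the simplified parsing rules are all assumptions; it is not a complete GenBank parser.

```python
KEY_INDENT = 5  # assumption: feature keys are indented 5 spaces, qualifiers deeper


def parse_features(text):
    """Parse a simplified GenBank feature table into a list of dicts."""
    features = []
    last = None  # name of the qualifier currently being continued, if any
    for line in text.splitlines():
        body = line.strip()
        if not body:
            continue
        indent = len(line) - len(line.lstrip())
        if indent <= KEY_INDENT:
            # New feature: key followed by the start of its location.
            key, _, location = body.partition(" ")
            features.append(
                {"key": key, "location": location.strip(), "qualifiers": {}}
            )
            last = None
        elif body.startswith("/"):
            # Qualifier line such as /gene="1A519".
            name, _, value = body[1:].partition("=")
            features[-1]["qualifiers"].setdefault(name, []).append(value.strip('"'))
            last = name
        elif last is None:
            # Continuation of a location that wrapped onto the next line.
            features[-1]["location"] += body
        else:
            # Continuation of a multi-line qualifier value.
            features[-1]["qualifiers"][last][-1] += " " + body.strip('"')
    return features


def gene_name(qualifiers):
    """Return the first /gene value, else the first /locus_tag, else None."""
    for tag in ("gene", "locus_tag"):
        if tag in qualifiers:
            return qualifiers[tag][0]
    return None


# Trimmed copies of the quoted records, combined only for this demonstration.
SAMPLE = """\
     gene            complement(38753..40944)
                     /locus_tag="At1g01070"
                     /db_xref="GeneID:839550"
     CDS             complement(join(38898..39054,39136..39287,39409..39814,
                     40213..40329,40473..40535,40675..40877))
                     /locus_tag="At1g01070"
                     /note="similar to MtN21 GI:2598575 (root nodule
                     development) from [Medicago truncatula]"
     gene            43733..44677
                     /gene="1A519"
                     /locus_tag="1A519"
                     /synonym="Y74C9A.1"
"""

for feature in parse_features(SAMPLE):
    print(feature["key"], gene_name(feature["qualifiers"]))
```

With this sample the Arabidopsis features resolve to At1g01070 through /locus_tag, and the C. elegans gene resolves to 1A519 through /gene. Whether /locus_tag is an acceptable stand-in for a gene name in a given file is still a question for the data providers.
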
--
Jason Stajich
jason.stajich at duke.edu
http://www.duke.edu/~jes12/