[Bioperl-l] Problems reading Genbank file
gert thijs
gert.thijs@esat.kuleuven.ac.be
Tue, 14 Nov 2000 20:40:08 +0100
--------------1DA9235EBFDF878C6348EFEB
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Hello,
I am trying to read some sequences from a GenBank flat file and store them in a
hash keyed by accession number. But when I print the accession number, all I
get is 'unknown'.
Here is the code I use to read the GenBank sequences and store them in the hash.
I have attached the test file I use to this mail.
# read all data from temporary gb flat file
use strict;
use warnings;
use Bio::SeqIO;

my $inStream = Bio::SeqIO->new(-file => "<test.gb", -format => 'Genbank');
my %seqList = ();
while ( my $seq = $inStream->next_seq() ) {
    # this is the call that comes back as 'unknown'
    my $key = $seq->accession_number;
    print "$key\n";
    $seqList{$key} = $seq;
}
$inStream->close;
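As a rough cross-check that does not go through BioPerl at all, the ACCESSION lines can be pulled straight out of the flat file. This is only a sketch: the helper name `accessions_in` is made up for illustration, and it assumes each record has an ACCESSION line starting in the first column with the primary accession as its first token. With the attached test.gb it should list AF016236 and AF057044, which are the keys the hash is expected to end up with.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): return the first token that
# follows each ACCESSION keyword in a GenBank flat file.
sub accessions_in {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my @found;
    while (my $line = <$fh>) {
        # ACCESSION starts in column 1; keep only the primary accession
        push @found, $1 if $line =~ /^ACCESSION\s+(\S+)/;
    }
    close $fh;
    return @found;
}

# Default to the test file from this mail; another path can be given as argument.
my $file = @ARGV ? $ARGV[0] : 'test.gb';
if (-e $file) {
    print "$_\n" for accessions_in($file);
} else {
    warn "No such file: $file\n";
}
```

If this prints the expected accession numbers while the Bio::SeqIO loop still prints 'unknown', the file itself is probably fine and the problem is more likely on the parsing side.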
Thanks,
Gert Thijs
==========================================================
+ Gert Thijs gert.thijs@esat.kuleuven.ac.be +
+ +
+ Dept. Elektrotechniek ESAT-SISTA +
+ Kardinaal Mercierlaan, 94 +
+ B-3001 HEVERLEE Belgium +
+ Tel : +32-16-32 18 84 ---- Fax : +32-16-32 19 70 +
==========================================================
--------------1DA9235EBFDF878C6348EFEB
Content-Type: text/plain; charset=us-ascii;
name="test.gb"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="test.gb"
LOCUS AF016236 7990 bp DNA BCT 06-JAN-1998
DEFINITION Rhodobacter sphaeroides DMSO/TMAO-sensor kinase (dorS),
DMSO/TMAO-response regulator (dorR), DMSO/TMAO-cytochrome
c-containing subunit (dorC), DMSO-membrane protein (dorB), and
DMSO/TMAO-reductase (dorA) genes, complete cds.
ACCESSION AF016236
VERSION AF016236.1 GI:2353766
KEYWORDS .
SOURCE Rhodobacter sphaeroides.
ORGANISM Rhodobacter sphaeroides
Bacteria; Proteobacteria; alpha subdivision; Rhodobacter group;
Rhodobacter.
REFERENCE 1 (bases 1 to 7990)
AUTHORS Mouncey,N.J., Choudhary,M. and Kaplan,S.
TITLE Characterization of genes encoding dimethyl sulfoxide reductase of
Rhodobacter sphaeroides 2.4.1T: an essential metabolic gene
function encoded on chromosome II
JOURNAL J. Bacteriol. 179 (24), 7617-7624 (1997)
MEDLINE 98062189
REFERENCE 2 (bases 1 to 7990)
AUTHORS Mouncey,N.J., Choudhary,M. and Kaplan,S.
TITLE Direct Submission
JOURNAL Submitted (25-JUL-1997) Microbiology and Molecular Genetics,
University of Texas Medical School, 6431 Fannin, Houston, TX 77030,
USA
FEATURES Location/Qualifiers
source 1..7990
/organism="Rhodobacter sphaeroides"
/strain="2.4.1T"
/db_xref="taxon:1063"
/chromosome="2"
RBS 159..165
gene 175..2622
/gene="dorS"
CDS 175..2622
/gene="dorS"
/codon_start=1
/transl_table=11
/product="DMSO/TMAO-sensor kinase"
/protein_id="AAB94870.1"
/db_xref="GI:2353767"
/translation="MIAEKSERFFPFAVSAELAPVGVSAAERSALADYLESSETLLTE
RVVAYASTRSYSHLVSTLPEAWRSSVQGLTDSVILMLDHQSAEAAIDYDADIGTDPST
AYGIEAGLRHRLRGISLETFIGAFKGYRDVYLNVTAEARVPAAMREGWLRLLRGFFDR
AEIGICAHWSGELGTLDHDQLLSVNRALVNEKNKYLTIFESMNNPVLLVDDGGRIENM
NFAAARLFLQDALPGSVYYGPEANLRFADLAGFDLEAVKLRGEAGDVLTCIGERWYSI
TAQEMLDVSRKFVGIVVTFHDVTEARRAREQAEALARAKTDFLATMSHEIRTPIHSIG
GVTELLKQSELASRDRGYVDAIERSTEVLASIVSDVLDYARIESGLVELEQVDFSIDQ
ILDDVARMMQPLVRRKPQLRIVIERTDLPAGPGKMQASLRQILINLTSNAVKFTPEGT
VVIGAERLAGGHRFRFTVSDTGPGIAAEKLEEIFKPYIQSDSSISRRHGGTGLGLAIC
RRLAGHLGGRLDVRSTPGFGSRFTLEVALAPGGSEPPGDATGDAPPPARALDLLVVED
DEVNALVAQSLLSAAGHGVRVAGTGEAALVRPRRSTAFDLVLTDLNLPDMDGLELART
IRRHADRHTAELPLVALSAHGPGVDPAALTEAGIDAFLGKPFHFARLEEILSRLVGSS
SPTPLGKAAPARRALQSVDLCVLRGHAEALGRTSAARIVQTFRQSVLETARALELAMD
EADMRSVTSLAHRLKGAARHLGFRGLSDKAEQVETAAAADGCEAALVLELVSDCRAAP
ALADLAWAEASAGVAES"
gene complement(2640..3340)
/gene="dorR"
CDS complement(2640..3340)
/gene="dorR"
/codon_start=3
/transl_table=11
/product="DMSO/TMAO-response regulator"
/protein_id="AAB94871.1"
/db_xref="GI:2353768"
/translation="MKKNYHMLVVEDDPVSRQTLAMYLRKENHEVSEARDGEQMRRVF
PKGDVDVVMLDINMPGKDGLSILRELPRQSEVGIIMVTSRKEDVDRIVALEFGADDYV
TKPYNMREILPRAKNFARRVAALRLVRPDQPATTFDGWTLDAAHWALTDPAGNHVKLT
RAEFELLATFVAHPGQVLTRDQLMNHVGRRGHETFDRTIDVLVRRIRRKIEADPSDPR
LIVTVHGIGYVFQA"
RBS complement(3350..3355)
misc_feature 3422..3432
/note="putative DorR binding-site"
misc_feature 3433..3443
/note="putative DorR binding-site"
misc_feature 3454..3464
/note="putative DorR binding-site"
misc_feature 3476..3486
/note="putative DorR binding-site"
RBS 3535..3543
gene 3571..4785
/gene="dorC"
CDS 3571..4785
/gene="dorC"
/codon_start=1
/transl_table=11
/product="DMSO/TMAO-cytochrome c-containing subunit"
/protein_id="AAB94872.1"
/db_xref="GI:2353769"
/translation="MGRSRGRASEAKVISRIWKAFWRPSTKWGLGVLLVTGGIAGAVG
WNGFHYVVEKTTTTEFCISCHSMRDNNYEEYKTTIHYQNTSGVRAECADCHVPKSGWK
LYRAKLLAAKDLWGEIQGTIDTREKFEAHRLEMAETVWADMKANDSATCRTCHSFNAM
DFAHQKPEASKQMQQAMNEGGTCIDCHKGIAHKLPDMASGYRALFSKLEKASQSLKPS
KGETLYPLQTIEAYLERPSGDKAKGDGRLLAATPMQVVDVKGEWVQVAVKGWQQEGAE
RVIYEKQGKRIFNAALAPTATGSIVAGASMVDPDTEQTWTDVSLTAWVRNRDLTDDQE
ALWQYGKQMFNGACGMCHVLPHTEHFLANQWIGTLNAMKSRAPLDDEQFRLVQRYVQM
HAKDVEPEGAAE"
RBS 4767..4774
/gene="dorC"
gene 4782..5462
/gene="dorB"
CDS 4782..5462
/gene="dorB"
/codon_start=1
/transl_table=11
/product="DMSO-membrane protein"
/protein_id="AAB94873.1"
/db_xref="GI:2353770"
/translation="MTFAHSFPSAHMPVPAPAAGAGEIAPLCAWLAEVFIAPPSAPEI
GAYRRGEAAAWLASLAADPDFAPGAAAMRQALAGEGSDEALAARLGTAFNRLFLGFGG
RRTVVPCESAWRGNGRLYQAPAAEMQHLFARADLSLGAGCVEPPDHISVELALLSFLL
VSGDPGTSAMKERLQGWIPAFCARCLEEDTTGFWGGAARLLTAAVAACPARDEARQDR
HTEERKAR"
RBS 5443..5454
/gene="dorB"
gene 5459..7927
/gene="dorA"
CDS 5459..7927
/gene="dorA"
/codon_start=1
/transl_table=11
/product="DMSO/TMAO-reductase"
/protein_id="AAB94874.1"
/db_xref="GI:2353771"
/translation="MTKLSGQELHAELSRRAFLSYTAAVGALGLCGTSLLAQGARAEG
LANGEVMSGCHWGVFKARVENGRAVAFEPWDKDPAPSHQLPGVLDSIYSPTRIKYPMV
RREFLEKGVNADRSTRGNGDFVRVTWDEALDLVAKELKRVQESYGPTGTFGGSYGWKN
PGRLHNCQVLMRRALNLAGGFVNSSGDYSTGAAQIIMPHVMGTLEVYEQQTAWPVVVD
NTELMVFWAADPVKTNQIGWVVPDHGAFAGMQAMKEKGTKVICINPVRTETADYFGAE
LVSPRPQTDVALMLGMAHTLYSEDLHDKDFIENCTSGFDIFAAYLTGESDGTPKTAEW
AAEICGLPAEQIKELARRFVGGRTMLAAGWSIQRMHHGEQAHWMLVTLASMIGQIGLP
GGGFGLSYHYSNGGSPTSDGPALGGISDGGKPVEGAAWLSASGAASIPCARVVDMLLN
PGGEFQFNGATATYPDVKLAYWVGGNPFAHHQDRNRMLKAWEKLETFIVQDFQWTATA
RHADIVLPATTSYERNDIESVGDYSNRAILAMKKVVDPLYEARSDYDIFAALTERLGK
GKEFTEGRDEMGWISSFYEAAVKQAEFKQMEMPSFEDFWSEGIVEFPITEGANFVRYA
DFREDPLFNPLGTPSGLIEIYSKNIEKMGYDDCPAHPTWMEPAERLGGPGAKYPLHVV
ASHPNSRLHSQLNGTSLRDLYAVAGHEPCLINPDDAAARGIADGDVLRVFNDRGQILV
GAKVSDAVMPGAIQVYEGGWYDPLDPSEEGTLDKYGDVNVLSLDVGTSKLAQGNCGQT
ILADVEKYAGAPVTVTVFDTPKGP"
stem_loop 7932..7963
BASE COUNT 1345 a 2668 c 2681 g 1296 t
ORIGIN
1 tccgcatttg acgtcaatca aggattgtcc cgcattaacc tatcagatcg gccgagacgg
61 tctgccgcag tcgaaggcgg cggatcatgg agatggacgt gccgggcgcg cggaacgggc
121 aaggggctcg cgccccggag ccccacttca tgcgccttgg aaggagtttg gtcgatgata
181 gccgagaagt cggagcggtt cttccccttt gcggtcagtg cggaacttgc gcccgtgggc
241 gtctcggcgg ccgaacggag cgcgcttgcc gactatctgg agtcgagcga gacccttctg
301 accgaacgcg ttgtcgccta cgccagcacc cgcagctaca gccacctcgt ctcgacgctg
361 cccgaggcct ggcgctcgtc cgttcagggg ctgacggact ccgtcatcct catgctcgac
421 caccagtcgg ccgaagccgc catcgactat gacgcagata tcggcaccga tcccagcacc
481 gcctatggca tcgaggccgg tctccgccac cgcctgcggg gcatctcgct cgagaccttc
541 atcggcgcct tcaagggcta tcgcgatgtc tatctgaatg tcaccgccga ggcgcgggtt
601 cccgccgcga tgcgcgaggg gtggttgcgc ctcctgcggg gtttcttcga ccgggccgag
661 attgggatct gcgcccattg gagcggcgag ctgggcactc tcgatcacga ccagctgctc
721 tcggtgaacc gggcgctcgt caacgagaag aacaagtatc tgaccatctt cgaaagcatg
781 aacaatccgg tgctgctggt ggatgacggc gggcgcatcg agaacatgaa cttcgccgcc
841 gcgcgtctct tcctgcagga tgccctgccg ggctcggtct attacgggcc ggaggcgaac
901 ctcaggttcg ccgatctcgc gggtttcgac ctcgaggcgg tgaagctgcg cggcgaggcg
961 ggcgatgtgc tgacctgcat cggcgagcgc tggtactcga tcacggcgca ggagatgctg
1021 gacgtcagcc gcaagttcgt gggcatcgtg gtcaccttcc acgacgtgac cgaagcgcgg
1081 cgggcgcgcg aacaggccga ggcgctggcc cgcgccaaga ccgacttcct cgccacgatg
1141 agccacgaga tccgcacccc gatccacagc atcggggggg tcaccgaact tctcaagcag
1201 tccgagcttg cctcccgcga ccgcggctat gttgatgcga tcgagcggtc gaccgaggtg
1261 ctcgcctcga tcgtgagcga cgtgctcgat tacgcgcgga tcgagtccgg gctggtcgag
1321 ctcgagcagg tggatttctc gatcgaccag atcctcgacg atgtggcgcg gatgatgcag
1381 ccgctggtgc gccgcaagcc gcagcttcgc atcgtgatcg agcggacgga cctgcccgcc
1441 ggtcctggga agatgcaggc aagcttgcgg cagatcctca tcaatctcac gagcaacgcg
1501 gtgaagttca ccccggaggg aaccgttgtg atcggggccg agcgcctcgc cggcggccat
1561 cgcttccgct tcaccgtgag cgataccggg ccgggcatcg cggccgagaa gctcgaggag
1621 atcttcaaac cctatatcca gtccgacagc tcgatctcgc gccgccacgg cggcaccggc
1681 ctcggtctcg cgatctgccg gaggctcgcc ggacatctgg gggggcgcct cgacgtgcgc
1741 agcacgcccg gcttcggcag ccgcttcacg ctggaagtgg cgctcgctcc gggcgggagc
1801 gagcccccgg gcgatgcgac gggcgacgcg cctccgccgg cacgggcgct ggatctgctg
1861 gtggtcgagg atgacgaggt gaatgcgctg gtggcgcaga gcctgctctc ggctgccggc
1921 cacggcgtcc gggtcgccgg caccggcgag gcggcgctcg ttcgccctcg gcggagcacc
1981 gctttcgacc tcgtgctgac ggacctcaac ctgccggaca tggacgggct ggagctggcc
2041 cgcacgatcc gccgccacgc cgacaggcac acggcggagc tgccgttggt ggcgctctcc
2101 gcgcatggcc cgggtgtgga tccggcggcg ctgaccgagg cggggatcga cgccttcctg
2161 ggcaaaccct tccatttcgc gcgtcttgaa gagatcctct cccgtctggt cggaagttcc
2221 tcgcccacgc cgctgggcaa ggccgcgccg gcccggcgcg ccttgcagtc ggtggatctc
2281 tgcgtgctgc ggggtcatgc cgaggcgctg ggacgtacct ctgccgcacg gatcgtccag
2341 accttccggc agagcgtgct cgagacggcc cgtgcgctgg aactggcgat ggatgaggcc
2401 gacatgcggt ccgtgacgtc tctggcccac cggctgaagg gggcggcgcg gcatctgggc
2461 ttcaggggtc tctccgacaa ggcagagcag gtcgagaccg cggccgcggc ggacggctgc
2521 gaggccgctc tggtgctcga gctggtctcc gactgccgcg ccgctccggc gctggccgat
2581 ctcgcctggg ccgaagccag tgccggagtg gccgagagct gaggcggtct cttccggcgt
2641 caggcctgga agacgtagcc gatgccgtga acggtcacga tcagccgcgg gtcggaaggg
2701 tccgcctcga tcttgcggcg gatgcggcgc actagcacgt cgatggtccg gtcgaaggtc
2761 tcgtgcccgc ggcggccgac atggttcatc agctggtcgc gggtcaggac ctgcccggga
2821 tgggccacga aggtggccag cagctcgaac tcggcccgtg tcagtttgac atgattgccc
2881 gcgggatcgg tcagcgccca atgggccgcg tcgagggtcc agccgtcgaa ggtcgtcgcc
2941 ggctggtccg gccgcaccag ccgcagggcc gccacccgcc gggcgaaatt ctttgcccgc
3001 ggcaggatct cgcgcatgtt gtatggcttg gtcacataat cgtccgcgcc gaactccagc
3061 gccacgatcc ggtccacatc ctccttccgg ctcgtcacca tgatgatgcc gacctcggac
3121 tgccggggca gttcccgcag aattgacagc ccgtccttgc ccggcatgtt gatgtcgagc
3181 atcaccacat ccacgtcgcc cttggggaag acgcggcgca tttgttcgcc gtcgcgcgct
3241 tcgctgactt cgtgattttc cttgcgcaga tacatcgcga gcgtctggcg gctgaccgga
3301 tcgtcttcga cgacaagcat gtggtagttt ttcttcatga cgcgcgaggt ctcctgcggc
3361 cggttggacc taatgcaccc tttcgcgccc cgatttcaac ggcaactcat tcacttggcc
3421 gctgttaaca tcctgttcac atcattttac gccaggttaa caatctgacg caacgcggtt
3481 cacaccgctc ctccaccttg gctttcaaca gaggcagcaa gccggtggac cttcggggaa
3541 ggaccggcgc gcccgccgca ttcctgcggc atggggcgtt ctcgcggtcg ggcttcggag
3601 gcaaaagtga tcagcaggat ttggaaggct ttctggcgac cgagcacgaa atgggggctc
3661 ggcgtcctgc tcgtgaccgg cggcatcgcc ggtgcggtcg gatggaacgg gttccactat
3721 gtggtggaaa agaccaccac gacggaattc tgcatcagct gccactcgat gcgggacaac
3781 aactacgagg aatacaagac caccatccac taccagaaca cctcgggcgt gcgggcggaa
3841 tgcgccgact gtcacgtccc gaaatccggc tggaagctct accgcgcgaa gctcctcgcc
3901 gcgaaggacc tctggggcga aattcagggc accatcgaca cgcgtgagaa gttcgaggcg
3961 caccggctcg agatggccga gaccgtctgg gccgacatga aggccaacga ctcggccacc
4021 tgccggacct gccactcgtt caacgcgatg gacttcgccc accagaagcc cgaggcctcg
4081 aagcagatgc agcaggcgat gaacgagggc ggaacctgca tcgactgcca caagggcatc
4141 gcccacaagc tgcccgacat ggccagcggc taccgcgcgc tgttctcgaa gctcgagaag
4201 gcctcgcagt cgctcaagcc cagcaagggc gagacgctct atccgctcca gaccatcgag
4261 gcctatctcg agcggccctc gggcgacaag gcgaagggcg acgggcggct tctggccgcg
4321 acgccgatgc aggtggtcga cgtgaagggt gagtgggtgc aggtcgcggt gaagggctgg
4381 cagcaggaag gcgccgagcg ggtcatctac gagaagcagg gcaagcggat tttcaacgcc
4441 gcactggcgc cgacggccac gggctcgatc gtggcgggcg cgtccatggt cgatccggac
4501 accgaacaga cctggacgga tgtctcgctg acggcgtggg tgcgcaaccg cgacctgaca
4561 gacgaccagg aagcgctctg gcagtatggc aagcagatgt tcaacggtgc ctgcggcatg
4621 tgtcacgtcc tgccccacac cgagcatttc ctcgccaacc agtggatcgg cacgctcaac
4681 gccatgaaga gccgggcgcc gctcgatgac gaacagttcc gcctcgtgca gcgctacgtc
4741 cagatgcatg cgaaggacgt ggaaccggaa ggagctgcgg aatgaccttc gcgcattcct
4801 tccccagcgc ccacatgccc gtcccggcgc ctgccgccgg ggccggcgag atcgccccgc
4861 tctgtgcctg gctggccgaa gtgttcatcg ccccgccgtc ggcccccgag atcggcgcct
4921 atcgccgcgg ggaagccgcg gcctggttgg ccagccttgc ggccgacccc gacttcgccc
4981 ccggcgccgc cgccatgcgg caggcgctgg ccggggaggg cagcgacgaa gccctcgcag
5041 cccggctcgg gacggccttc aaccggctgt tcctcggctt cggcggccgc cgcacggtgg
5101 tgccgtgcga atccgcctgg cggggaaacg ggcggcttta tcaggccccg gcggccgaga
5161 tgcagcatct cttcgcccgg gccgaccttt cgctcggcgc aggctgcgtc gagccgcccg
5221 accacatctc ggtcgagctc gcgctcctgt ccttcctgct cgtgagcggg gatcccggca
5281 ctagcgccat gaaagaacgc ctgcagggct ggatcccggc cttctgcgca cgttgcctcg
5341 aagaggatac gacgggcttc tggggaggcg ccgcgcgtct cctgaccgcc gcggtggccg
5401 catgccccgc ccgggacgaa gcccggcaag accgtcatac ggaagaaagg aaagccagat
5461 gactaagttg tcaggtcagg agctgcatgc cgaactctcg cggcgcgcct tcctgagcta
5521 tacggcggct gtgggggctc tcggtctctg cggcacctcg ctcctcgcgc agggagcccg
5581 cgcggaaggt ctcgccaacg gcgaggtcat gtcgggctgc cactggggcg tgttcaaggc
5641 ccgggtcgag aacggccgcg ccgtggcctt cgagccctgg gacaaggacc ccgcgccgtc
5701 gcaccagctg ccgggcgtgc tcgattcgat ctattcgccc acgcggatca aatatccgat
5761 ggtgcgccgc gaattcctcg agaagggcgt gaacgccgac cgctccaccc gcggcaacgg
5821 cgacttcgtc cgcgtcacct gggatgaagc gctcgacctc gtggccaagg aactgaagcg
5881 cgttcaggaa agctacgggc ccaccggcac cttcggcggc tcctacggct ggaaaaaccc
5941 gggccggctg cacaactgtc aggtcctcat gcgccgcgcg ctgaatctcg cgggcgggtt
6001 cgtgaactcg tcgggcgact attcgaccgg cgccgcgcag atcatcatgc cgcatgtcat
6061 gggcacgctc gaggtctacg agcagcagac cgcctggccc gtggtggtgg acaacaccga
6121 actgatggtc ttctgggccg ccgatccggt gaagaccaac cagatcggct gggtggtccc
6181 cgaccatggc gccttcgcgg gcatgcaggc aatgaaggaa aagggcacca aggtcatctg
6241 catcaacccc gtgcgcaccg agacggccga ctatttcggc gccgaactcg tgtcgccgcg
6301 gccgcagacc gacgtggcgc tgatgctcgg catggcgcac acgctctaca gcgaagatct
6361 gcacgacaag gacttcatcg aaaactgcac ctcgggcttc gacatcttcg cggcctacct
6421 gaccggcgag agcgacggca cgcccaagac ggccgaatgg gccgccgaga tctgcggcct
6481 gccggccgag cagatcaagg aactcgcccg ccgcttcgtg ggcggccgga cgatgctcgc
6541 cgcgggctgg tcgatccagc ggatgcacca tggcgaacag gcgcactgga tgctcgtcac
6601 gctggcctcg atgatcggcc agatcggtct tccgggcggc ggcttcggcc ttagctacca
6661 ttactccaac ggtggctcgc ccacgagcga cggcccggcg ctgggcggta tttcggacgg
6721 cggcaagccg gtcgaaggtg cggcctggct gtcggcgagc ggcgcggctt cgatcccctg
6781 cgcccgggtg gtggacatgc tgctcaatcc gggcggcgag ttccagttca acggtgccac
6841 ggcgacctat cccgacgtga agctggccta ctgggtgggc ggcaacccct tcgcgcacca
6901 ccaggaccgc aaccggatgc tcaaggcctg ggaaaagctc gagaccttca tcgtgcagga
6961 cttccagtgg accgccaccg cgcgccacgc cgacatcgtc ctgccggcga cgacctccta
7021 cgaacgcaac gacatcgagt cggtgggcga ctattcgaac cgcgccatcc tcgcgatgaa
7081 gaaggtggtc gatccgctct acgaggcccg gtcggactac gacatcttcg cagccctgac
7141 ggagcgtctg ggcaagggca aggaattcac cgaaggccgc gacgagatgg gctggatcag
7201 ctcgttctac gaggcggcgg tgaagcaggc cgagttcaag cagatggaga tgccgtcgtt
7261 cgaggacttc tggtcggaag ggatcgtcga gttcccgatc accgagggcg cgaacttcgt
7321 tcgctatgcc gacttccgcg aggatccgct gttcaacccc ctcggcacgc cctcgggcct
7381 gatcgagatc tactcgaaga acatcgagaa gatgggctat gacgattgcc cggcccatcc
7441 gacctggatg gaaccggccg agcgtctcgg cgggccgggg gcgaaatatc cgctccatgt
7501 ggtggcgagc cacccgaact cgcggctgca ctcgcagctg aacggcacct cgctgcgcga
7561 cctctatgcg gtggcggggc acgagccctg tctcatcaac cccgacgatg cggccgcgcg
7621 cggcatcgcg gacggcgatg tgctgcgggt gttcaacgac cgcgggcaga tcctcgtggg
7681 cgcgaaggtg agcgacgcgg tgatgccggg cgcgatccag gtctacgagg gcggctggta
7741 cgacccgctc gacccctcgg aggaaggcac gctcgacaaa tacggcgacg tgaacgtgct
7801 gtcgctcgac gtcggcacct cgaagctggc gcagggcaac tgcggccaga ccatcctcgc
7861 ggatgtcgaa aaatatgcgg gcgcgccggt gacggtgacc gtgttcgaca cgccgaaggg
7921 accctgaggc gccccggccg gggcggcggt tcccccgccc gccttcacct tccccggccc
7981 gcaccgcttg
//
LOCUS AF057044 2300 bp mRNA PLN 15-APR-1998
DEFINITION Arabidopsis thaliana acyl-CoA oxidase (ACX1) mRNA, complete cds.
ACCESSION AF057044
VERSION AF057044.1 GI:3044213
KEYWORDS .
SOURCE thale cress.
ORGANISM Arabidopsis thaliana
Eukaryota; Viridiplantae; Embryophyta; Tracheophyta; Spermatophyta;
Magnoliophyta; eudicotyledons; core eudicots; Rosidae; eurosids II;
Brassicales; Brassicaceae; Arabidopsis.
REFERENCE 1 (bases 1 to 2300)
AUTHORS Hooks,M.A., Kellas,F. and Graham,I.A.
TITLE An acyl-CoA oxidase gene of Arabidopsis thaliana
JOURNAL Unpublished
REFERENCE 2 (bases 1 to 2300)
AUTHORS Hooks,M.A., Kellas,F. and Graham,I.A.
TITLE Direct Submission
JOURNAL Submitted (02-APR-1998) Division of Biochemistry and Molecular
Biology, University of Glasgow, University Ave., Glasgow G12 8QQ,
United Kingdom
FEATURES Location/Qualifiers
source 1..2300
/organism="Arabidopsis thaliana"
/cultivar="Columbia"
/db_xref="taxon:3702"
/tissue_type="seedling hypocotyl"
/clone_lib="CD4-15: Keiber"
/dev_stage="3 days old"
gene 1..2300
/gene="ACX1"
CDS 77..2071
/gene="ACX1"
/EC_number="1.3.3.6"
/codon_start=1
/product="acyl-CoA oxidase"
/protein_id="AAC13498.1"
/db_xref="GI:3044214"
/translation="MEGIDHLADERNKAEFDVEDMKIVWAGSRHAFEVSDRIARLVAS
DPVFEKSNRARLSRKELFKSTLRKCAHAFKRIIELRLNEEEAGRLRHFIDQPAYVDLH
WGMFVPAIKGQGTEEQQKKWLSLANKMQIIGCYAQTELGHGSNVQGLETTATFDPKTD
EFVIHTPTQTASKWWPGGLGKVSTHAVVYARLITNGKDYGIHGFIVQLRSLEDHSPLP
NITVGDIGTKMGNGAYNSMDNGFLMFDHVRIPRDQMLMRLSKVTREGEYVPSDVPKQL
VYGTMVYVRQTIVADASNALSRAVCIATRYSAVRRQFGAHNGGIETQVIDYKTQQNRL
FPLLASAYAFRFVGEWLKWLYTDVTERLAASDFATLPEAHACTAGLKSLTTTATADGI
EECRKLCGGHGYLWCSGLPELFAVYVPACTYEGDNVVLQLQVARFLMKTVAQLGSGKV
PVGTTAYMGRAAHLLQCRSGVQKAEDWLNPDVVLEAFEARALRMAVTCAKNLSKFENQ
EQGFQELLADLVEAAIAHCQLIVVSKFIAKLEQDIGGKGVKKQLNNLCYIYALYLLHK
HLGDFLSTNCITPKQASLANDQLRSLYTQVRPNAVALVDAFNYTDHYLNSVLGRYDGN
VYPKLFEEALKDPLNDSVVPDGYQEYLRPVLQQQLRTARL"
misc_feature 2060..2068
/gene="ACX1"
/note="putative targeting signal to peroxisomes"
BASE COUNT 592 a 476 c 561 g 671 t
ORIGIN
1 tttttttcct atcatctctg agagttttct cgagaaactt ttgagtgttt agctactaga
61 ttctgaatta cgaatcatgg aaggaattga tcacctcgcc gatgagagaa acaaagcaga
121 gttcgacgtt gaggatatga agatcgtctg ggctggttcc cgccacgctt ttgaggtttc
181 cgatcgaatt gcccgccttg tcgccagcga tccggtgttt gagaaaagca atcgagctcg
241 gttgagtagg aaggagctgt ttaagagtac gttgagaaaa tgtgcccatg cgtttaaaag
301 gattatcgag cttcgtctca atgaggaaga agcaggaaga ttgaggcact ttatcgacca
361 gcctgcctat gtggatctgc actggggaat gtttgtgcct gctattaagg ggcagggtac
421 agaggagcag cagaagaagt ggttgtcgct ggccaataag atgcagatta ttgggtgtta
481 tgcacagact gagcttggtc atggctcaaa tgttcaagga cttgagacaa ctgccacatt
541 tgatcccaag actgatgagt ttgtaattca cactccaact cagactgcat ccaaatggtg
601 gcctggtggt ttgggaaaag tttctactca tgctgttgtt tacgctcgtc tcataactaa
661 cggaaaagac tacggtatcc atggattcat cgtgcaactg cgaagcttag aagatcattc
721 tcctcttccg aatataactg ttggtgatat cgggacaaag atgggaaatg gagcatataa
781 ttcaatggac aacgggtttc ttatgtttga tcatgttcgc attcctagag atcaaatgct
841 catgaggctg tcaaaagtta caagagaagg agaatatgtt ccatcggatg ttccaaagca
901 gctggtatat ggtactatgg tgtatgtgag acaaacaatt gtggctgatg cttccaatgc
961 actatctcga gcagtttgca tagctacaag atacagtgca gtgcggaggc aatttggcgc
1021 acataatggt ggcattgaga cacaggtgat tgattataaa actcagcaga acaggctatt
1081 tcctctgcta gcatctgcat atgcatttcg atttgttgga gagtggctaa aatggctgta
1141 cacggatgta actgaaagac tggcggctag tgatttcgca actttgcctg aggctcatgc
1201 atgcactgca ggattgaagt ctctcaccac cacagccact gcggatggca ttgaagaatg
1261 tcgtaagtta tgtggtggac atggatactt gtggtgcagt gggctccccg agctgtttgc
1321 tgtatatgtt cctgcctgca catacgaagg agacaatgtt gtgctgcaat tacaggttgc
1381 tcgattcctc atgaagacag tcgcccagct gggatctgga aaggttcctg ttggcacaac
1441 tgcttatatg ggccgggcag cacatctttt gcaatgtcgt tctggtgttc aaaaggctga
1501 ggattggtta aaccctgatg ttgtactgga agctttcgaa gctagggctc tcagaatggc
1561 tgttacgtgt gccaaaaatc tcagcaagtt tgagaatcag gaacaaggat tccaagagct
1621 cttggctgat ttggttgagg ccgctattgc tcattgccaa ttgattgttg tttccaagtt
1681 catagcgaaa ctggagcaag acataggtgg caaaggagtg aagaaacagc tgaataatct
1741 gtgttacatt tatgctcttt atctcctcca caaacatctc ggcgatttcc tctccactaa
1801 ctgcatcact cccaaacaag cctctcttgc taacgaccag ctccgttcct tatacactca
1861 ggtccggcct aatgcggttg cacttgtgga cgccttcaat tacaccgacc attacttgaa
1921 ctcggttctt ggccgttacg acggtaatgt gtacccaaag ctctttgagg aagcgttgaa
1981 ggatccattg aacgactcgg tggttcctga tgggtaccaa gaataccttc gacctgtgct
2041 tcagcagcaa cttcgtaccg ctaggctctg aagagttttc tttgcttgat actcgatatg
2101 gttaatcaca ttagacttgc ttcgtccttc ttcttcgtct tcttcttctt ctcgctttga
2161 ataatttcgc agtttaaaaa ctggcgatgc ccttatttat atgtagcaat gtaatagtta
2221 atgtacgatc gtcatatggc ggaattttag tactattttt cgttttcaat gcaacattaa
2281 tacaattgat cgtttctact
//
--------------1DA9235EBFDF878C6348EFEB--