AW: [Bioperl-l] xml sequence download from ncbi
Sigmund, Ralf
Ralf.Sigmund@MPIHAN.MPG.de
Tue, 5 Sep 2000 13:10:25 +0200
Hi!
I have been toying around with this.
I compared the GenBank flat file and the XML output that I get for GI 3095101.
The GenBank result starts with:
LOCUS AF043257 2981 bp mRNA ROD 05-MAY-1998
DEFINITION Mus musculus beta5B integrin mRNA, complete cds.
ACCESSION AF043257
VERSION AF043257.1 GI:3095101
but the downloaded sequence in XML format does not include the DEFINITION
data.
The Object ID is tmpseq_1, and there is no way to find out that this entry
represents an integrin mRNA.
Now I wonder whether this is due to the ASN.1 format the XML output is based on,
or whether it is due to a one-off inconsistency in the database data.
I append the XML result I got from:
http://www.ncbi.nlm.nih.gov/entrez/viewer.cgi?cmd&save=on&view=xml&val=3095101
Thanks for your help!
Ralf
<?xml version="1.0"?>
<!--DOCTYPE Seq-entry PUBLIC "-//NCBI//NCBI Seqset/EN" "NCBI_Seqset.dtd"-->
<Seq-entry>
<Seq-entry_seq>
<Bioseq>
<Bioseq_id>
<Seq-id>
<Seq-id_local>
<Object-id>
<Object-id_str>tmpseq_1</Object-id_str>
</Object-id>
</Seq-id_local>
</Seq-id>
<Seq-id>
<Seq-id_genbank>
<Textseq-id>
<Textseq-id_name>AF043257</Textseq-id_name>
<Textseq-id_accession>AF043257</Textseq-id_accession>
<Textseq-id_version>1</Textseq-id_version>
</Textseq-id>
</Seq-id_genbank>
</Seq-id>
<Seq-id>
<Seq-id_gi>3095101</Seq-id_gi>
</Seq-id>
</Bioseq_id>
<Bioseq_descr>
<Seq-descr>
<Seqdesc>
<Seqdesc_molinfo>
<MolInfo>
<MolInfo_biomol value="mRNA">3</MolInfo_biomol>
<MolInfo_completeness value="complete">1</MolInfo_completeness>
</MolInfo>
</Seqdesc_molinfo>
</Seqdesc>
<Seqdesc>
<Seqdesc_update-date>
<Date>
<Date_std>
<Date-std>
<Date-std_year>1998</Date-std_year>
<Date-std_month>5</Date-std_month>
<Date-std_day>5</Date-std_day>
</Date-std>
</Date_std>
</Date>
</Seqdesc_update-date>
</Seqdesc>
</Seq-descr>
</Bioseq_descr>
<Bioseq_inst>
<Seq-inst>
<Seq-inst_repr value="raw"/>
<Seq-inst_mol value="rna"/>
<Seq-inst_length>2981</Seq-inst_length>
<Seq-inst_strand value="ss"/>
<Seq-inst_seq-data>
<Seq-data>
<Seq-data_iupacna>
<IUPACna>GGGGGCTCGGCGAGGTGCGTCCGGAGCAGCGACAACTCCGAGCGTCCCAGCGGGCCAGCGAGGAGGA
TGGTGGCGGCCGGGCGCGGACCAGCCCGGCCGCGGGCGCCGTGAGCCGGAGCGCAGCGCCCGGCATGCGGCTGCGG
TCCCCGGCCTCGGCCCCGCTCCGCCCCCGCCGAGCGCCCCAGCCGAGCGGCGCGCATCATGCCGCGGGTGCCCGCG
ACCCTCTACGCCTGTCTGCTCGGGCTCTGCGCGCTCGTTCCGCGCCTCGCAGGGCTCAACATATGCACTAGTGGAA
GTGCCACCTCGTGTGAAGAATGCCTGTTGATCCACCCAAAATGTGCCTGGTGCTCCAAAGAGTACTTTGGCAATCC
ACGGTCCATCACCTCTCGGTGTGACCTGAAGGCAAACCTCATCCGGAATGGCTGTGAAGGTGAGATTGAGAGTCCA
GCCAGCAGCACCCACGTCCTCCGGAACCTACCTCTCAGCAGCAAGGGTTCCAGTGCCACGGGCTCTGACGTCATCC
AGATGACGCCGCAGGAGATTGCAGTGAGCCTCCGGCCAGGCGAGCAGACTACGTTCCAGCTGCAGGTGCGCCAGGT
GGAGGACTACCCTGTAGACCTGTACTACCTGATGGACCTCTCCCTCTCCATGAAGGATGACTTGGAGAACATCCGG
AGCCTGGGCACCAAGCTTGCGGAGGAAATGAGGAAGCTCACTAGTAACTTCCGCTTAGGTTTCGGGTCTTTTGTTG
ACAAGGACATCTCTCCTTTCTCCTACACGGCACCGAGATACCAGACCAATCCGTGTATTGGTTACAAGTTATTCCC
CAACTGCGTCCCCTCCTTCGGGTTCCGGCATCTGCTGCCTCTCACAGACAGAGTCGACAGCTTCAACGAGGAAGTG
AGGAAGCAGAGGGTGTCCCGGAACCGAGATGCCCCCGAGGGGGGGTTTGATGCGGTCCTCCAGGCTGCTGTCTGCA
AGGAGAAGATCGGATGGCGAAAAGATGCTCTGCACTTGCTGGTGTTCACAACAGACGATGTGCCCCACATCGCACT
GGATGGAAAACTGGGTGGCCTGGTCCAGCCCCACGATGGCCAGTGTCACCTGAATGAAGCCAATGAGTACACAGCC
TCTAACCAGATGGACTATCCATCGCTTGCCTTGCTTGGGGAGAAGCTGGCAGAGAACAATATCAACCTCATTTTTG
CTGTGACGAAGAACCACTATATGCTCTACAAGAATTTTACAGCCCTGATACCTGGAACCACTGTGGAGATTTTGCA
TGGAGATTCCAAAAATATTATTCAACTGATTATCAATGCGTACAGTAGCATCCGGGCTAAAGTGGAGCTGTCAGTG
TGGGATCAGCCAGAAGACCTTAATCTCTTCTTCACTGCCACCTGCCAAGATGGCATATCTTACCCTGGTCAGAGGA
AGTGTGAGGGTCTGAAGATTGGGGACACGGCATCCTTTGAAGTGTCCGTGGAGGCTCGGAGCTGCCCCGGCAGACA
AGCAGCACAGTCTTTCACCTTGAGGCCCGTGGGCTTCCGGGACAGTCTGCAGGTGGAAGTCGCCTACAATTGCACA
TGCGGCTGTAGCACGGGGCTGGAGCCCAACAGTGCCAGATGCAGTGGGAATGGAACATACACCTGTGGGCTGTGCG
AGTGTGACCCCGGCTACCTGGGCACTAGGTGCGAGTGCCAGGAGGGGGAGAACCAGAGCGGGTACCAGAACCTGTG
CCGGGAGGCAGAGGGCAAGCCTCTGTGCAGCGGGCGTGGAGAGTGTAGCTGCAACCAGTGCTCCTGCTTCGAGAGT
GAGTTCGGGAGGATCTACGGACCTTTCTGCGAGTGTGACAGCTTTTCCTGTGCCAGAAACAAGGGCGTCCTATGCT
CAGGCCATGGAGAGTGTCACTGTGGAGAATGCAAATGCCACGCAGGTTACATTGGGGACAATTGTAACTGCTCAAC
AGACGTCAGCACATGCAAGGCCAAGGATGGGCAGATCTGCAGTGACCGAGGCCGTTGTGTCTGTGGGCAGTGCCAG
TGCACAGAGCCTGGAGCCTTTGGGGAGACGTGTGAGAAGTGCCCAACCTGCCCGGATGCTTGCAGCTCTAAGAGAG
ACTGTGTCGAATGCTTGCTACTTCACCAGGGGAAACCTGACAACCAGACCTGCCACCACCAGTGCAAAGATGAGGT
GATCACGTGGGTAGACACCATCGTCAAAGATGACCAGGAGGCTGTGCTTTGCTTCTACAAAACTGCTAAGGACTGC
GTTATGATGTTCAGCTACACAGAACTGCCCAATGGGAGGTCCAACTTGACGGTCCTCCGGGAGCCAGAATGTGGAA
GTGCCCCCAATGCCATGACCATCCTGCTGGCTGTGGTTGGCAGCATCCTCCTGATTGGGATGGCACTCCTGGCCAT
CTGGAAGCTGCTCGTCACCATCCACGACCGCCGAGAGTTTGCCAAGTTCCAAAGCCTCAAACCCCCTGTACAGAAA
GCCCATCTCCACACACACTGTCGATTTCGCCTTCAACAAGTTCAACAAATCCTACAATGGCTCAGTGGACTGAGGC
TCCTGGATGGCTGGAGGGGGACTAAGGATGAAGACTCTGGCGTGCCTTGGACTTCCTGGACCATTTGCTCACGCTA
GCTAGGCACGCACGGATAATGGAGATGCCCTCCATTGAGCCCTAAGGGACCTGGTAGCCACACAGCGGGCCACAGG
CACTTGGGGCCACTTCCCTCCAAGCCAGGGAAAGCAAGGAGACTCTGGTGTTCTCAGCTTCCCCTCTGCCGCCTCC
AGCTTGCTGTCTCCATGAACCTCTGAAGGCCTGGCTGCCCTCTTCCCTGCTGGGCCAGACAAGAAGGTATCCGGAA
GAGTCTGTGTGTACAAAGCTAGCGCGCAGCCTGGCTTTTTCCAGTTGATCGTTTTTTTTTCTATGAAATAAAAAGG
TCACGCATTTAAAAAAAAAAAAAAAA</IUPACna>
</Seq-data_iupacna>
</Seq-data>
</Seq-inst_seq-data>
</Seq-inst>
</Bioseq_inst>
</Bioseq>
</Seq-entry_seq>
</Seq-entry>
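
To check mechanically which descriptors a record like the one above carries, here is a minimal Python sketch. It uses a trimmed copy of the record (the sequence data is left out) and lists the child elements found under each Seqdesc. The element name `Seqdesc_title` is an assumption: it is what a definition line would presumably be called if it followed the naming pattern of the other descriptors, and it does not occur in the record itself. The function name `summarize` and the constant `RECORD` are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record above; the <Bioseq_inst> block with the
# sequence data is omitted because it is not needed for this check.
RECORD = """<?xml version="1.0"?>
<Seq-entry>
  <Seq-entry_seq>
    <Bioseq>
      <Bioseq_id>
        <Seq-id><Seq-id_local><Object-id>
          <Object-id_str>tmpseq_1</Object-id_str>
        </Object-id></Seq-id_local></Seq-id>
        <Seq-id><Seq-id_genbank><Textseq-id>
          <Textseq-id_name>AF043257</Textseq-id_name>
          <Textseq-id_accession>AF043257</Textseq-id_accession>
          <Textseq-id_version>1</Textseq-id_version>
        </Textseq-id></Seq-id_genbank></Seq-id>
        <Seq-id><Seq-id_gi>3095101</Seq-id_gi></Seq-id>
      </Bioseq_id>
      <Bioseq_descr>
        <Seq-descr>
          <Seqdesc><Seqdesc_molinfo><MolInfo>
            <MolInfo_biomol value="mRNA">3</MolInfo_biomol>
            <MolInfo_completeness value="complete">1</MolInfo_completeness>
          </MolInfo></Seqdesc_molinfo></Seqdesc>
          <Seqdesc><Seqdesc_update-date><Date><Date_std><Date-std>
            <Date-std_year>1998</Date-std_year>
            <Date-std_month>5</Date-std_month>
            <Date-std_day>5</Date-std_day>
          </Date-std></Date_std></Date></Seqdesc_update-date></Seqdesc>
        </Seq-descr>
      </Bioseq_descr>
    </Bioseq>
  </Seq-entry_seq>
</Seq-entry>
"""


def summarize(xml_text):
    """Return the identifiers and descriptor element names of a Seq-entry."""
    root = ET.fromstring(xml_text)

    def first_text(tag):
        # Text of the first element with this exact tag, or None if absent.
        node = next(root.iter(tag), None)
        return node.text if node is not None else None

    # Each <Seqdesc> wraps exactly one descriptor; collect those tag names.
    descriptors = [child.tag for seqdesc in root.iter("Seqdesc") for child in seqdesc]

    return {
        "gi": first_text("Seq-id_gi"),
        "accession": first_text("Textseq-id_accession"),
        # Assumed element name for a definition line; not present above.
        "title": first_text("Seqdesc_title"),
        "descriptors": descriptors,
    }


info = summarize(RECORD)
print(info["accession"], info["gi"], info["title"])
print(info["descriptors"])
```

For this record the descriptor list contains only the molinfo and update-date entries and the title comes back as None, which matches the observation that no definition text is present in the download.
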
_______________________________________________
Bioperl-l mailing list
Bioperl-l@bioperl.org
http://bioperl.org/mailman/listinfo/bioperl-l