Re: [Bioperl-l] xml sequence download from ncbi

Sigmund, Ralf Ralf.Sigmund@MPIHAN.MPG.de
Tue, 5 Sep 2000 13:10:25 +0200


Hi!
I have been toying around with this.
I compared the GenBank flat file with the XML output that I get when I
query GenBank with the GI 3095101.

The GenBank result starts with:
 LOCUS       AF043257     2981 bp    mRNA            ROD       05-MAY-1998
 DEFINITION  Mus musculus beta5B integrin mRNA, complete cds.
 ACCESSION   AF043257
 VERSION     AF043257.1  GI:3095101

but the downloaded sequence in XML format does not include the DEFINITION
data.
The Object ID is tmpseq_1, and there is no way to find out that this entry
represents an integrin mRNA.
Now I wonder whether this is due to the ASN.1 format that the XML output is
based on, or whether it is a one-off inconsistency in the database data.
I append the XML result I got from:
http://www.ncbi.nlm.nih.gov/entrez/viewer.cgi?cmd&save=on&view=xml&val=3095101
Thanks for your help!
Ralf

<?xml version="1.0"?>
<!--DOCTYPE Seq-entry PUBLIC "-//NCBI//NCBI Seqset/EN" "NCBI_Seqset.dtd"-->
<Seq-entry>
  <Seq-entry_seq>
    <Bioseq>
      <Bioseq_id>
        <Seq-id>
          <Seq-id_local>
            <Object-id>
              <Object-id_str>tmpseq_1</Object-id_str>
            </Object-id>
          </Seq-id_local>
        </Seq-id>
        <Seq-id>
          <Seq-id_genbank>
            <Textseq-id>
              <Textseq-id_name>AF043257</Textseq-id_name>
              <Textseq-id_accession>AF043257</Textseq-id_accession>
              <Textseq-id_version>1</Textseq-id_version>
            </Textseq-id>
          </Seq-id_genbank>
        </Seq-id>
        <Seq-id>
          <Seq-id_gi>3095101</Seq-id_gi>
        </Seq-id>
      </Bioseq_id>
      <Bioseq_descr>
        <Seq-descr>
          <Seqdesc>
            <Seqdesc_molinfo>
              <MolInfo>
                <MolInfo_biomol value="mRNA">3</MolInfo_biomol>
                <MolInfo_completeness value="complete">1</MolInfo_completeness>
              </MolInfo>
            </Seqdesc_molinfo>
          </Seqdesc>
          <Seqdesc>
            <Seqdesc_update-date>
              <Date>
                <Date_std>
                  <Date-std>
                    <Date-std_year>1998</Date-std_year>
                    <Date-std_month>5</Date-std_month>
                    <Date-std_day>5</Date-std_day>
                  </Date-std>
                </Date_std>
              </Date>
            </Seqdesc_update-date>
          </Seqdesc>
        </Seq-descr>
      </Bioseq_descr>
      <Bioseq_inst>
        <Seq-inst>
          <Seq-inst_repr value="raw"/>
          <Seq-inst_mol value="rna"/>
          <Seq-inst_length>2981</Seq-inst_length>
          <Seq-inst_strand value="ss"/>
          <Seq-inst_seq-data>
            <Seq-data>
              <Seq-data_iupacna>
 
<IUPACna>GGGGGCTCGGCGAGGTGCGTCCGGAGCAGCGACAACTCCGAGCGTCCCAGCGGGCCAGCGAGGAGGA
TGGTGGCGGCCGGGCGCGGACCAGCCCGGCCGCGGGCGCCGTGAGCCGGAGCGCAGCGCCCGGCATGCGGCTGCGG
TCCCCGGCCTCGGCCCCGCTCCGCCCCCGCCGAGCGCCCCAGCCGAGCGGCGCGCATCATGCCGCGGGTGCCCGCG
ACCCTCTACGCCTGTCTGCTCGGGCTCTGCGCGCTCGTTCCGCGCCTCGCAGGGCTCAACATATGCACTAGTGGAA
GTGCCACCTCGTGTGAAGAATGCCTGTTGATCCACCCAAAATGTGCCTGGTGCTCCAAAGAGTACTTTGGCAATCC
ACGGTCCATCACCTCTCGGTGTGACCTGAAGGCAAACCTCATCCGGAATGGCTGTGAAGGTGAGATTGAGAGTCCA
GCCAGCAGCACCCACGTCCTCCGGAACCTACCTCTCAGCAGCAAGGGTTCCAGTGCCACGGGCTCTGACGTCATCC
AGATGACGCCGCAGGAGATTGCAGTGAGCCTCCGGCCAGGCGAGCAGACTACGTTCCAGCTGCAGGTGCGCCAGGT
GGAGGACTACCCTGTAGACCTGTACTACCTGATGGACCTCTCCCTCTCCATGAAGGATGACTTGGAGAACATCCGG
AGCCTGGGCACCAAGCTTGCGGAGGAAATGAGGAAGCTCACTAGTAACTTCCGCTTAGGTTTCGGGTCTTTTGTTG
ACAAGGACATCTCTCCTTTCTCCTACACGGCACCGAGATACCAGACCAATCCGTGTATTGGTTACAAGTTATTCCC
CAACTGCGTCCCCTCCTTCGGGTTCCGGCATCTGCTGCCTCTCACAGACAGAGTCGACAGCTTCAACGAGGAAGTG
AGGAAGCAGAGGGTGTCCCGGAACCGAGATGCCCCCGAGGGGGGGTTTGATGCGGTCCTCCAGGCTGCTGTCTGCA
AGGAGAAGATCGGATGGCGAAAAGATGCTCTGCACTTGCTGGTGTTCACAACAGACGATGTGCCCCACATCGCACT
GGATGGAAAACTGGGTGGCCTGGTCCAGCCCCACGATGGCCAGTGTCACCTGAATGAAGCCAATGAGTACACAGCC
TCTAACCAGATGGACTATCCATCGCTTGCCTTGCTTGGGGAGAAGCTGGCAGAGAACAATATCAACCTCATTTTTG
CTGTGACGAAGAACCACTATATGCTCTACAAGAATTTTACAGCCCTGATACCTGGAACCACTGTGGAGATTTTGCA
TGGAGATTCCAAAAATATTATTCAACTGATTATCAATGCGTACAGTAGCATCCGGGCTAAAGTGGAGCTGTCAGTG
TGGGATCAGCCAGAAGACCTTAATCTCTTCTTCACTGCCACCTGCCAAGATGGCATATCTTACCCTGGTCAGAGGA
AGTGTGAGGGTCTGAAGATTGGGGACACGGCATCCTTTGAAGTGTCCGTGGAGGCTCGGAGCTGCCCCGGCAGACA
AGCAGCACAGTCTTTCACCTTGAGGCCCGTGGGCTTCCGGGACAGTCTGCAGGTGGAAGTCGCCTACAATTGCACA
TGCGGCTGTAGCACGGGGCTGGAGCCCAACAGTGCCAGATGCAGTGGGAATGGAACATACACCTGTGGGCTGTGCG
AGTGTGACCCCGGCTACCTGGGCACTAGGTGCGAGTGCCAGGAGGGGGAGAACCAGAGCGGGTACCAGAACCTGTG
CCGGGAGGCAGAGGGCAAGCCTCTGTGCAGCGGGCGTGGAGAGTGTAGCTGCAACCAGTGCTCCTGCTTCGAGAGT
GAGTTCGGGAGGATCTACGGACCTTTCTGCGAGTGTGACAGCTTTTCCTGTGCCAGAAACAAGGGCGTCCTATGCT
CAGGCCATGGAGAGTGTCACTGTGGAGAATGCAAATGCCACGCAGGTTACATTGGGGACAATTGTAACTGCTCAAC
AGACGTCAGCACATGCAAGGCCAAGGATGGGCAGATCTGCAGTGACCGAGGCCGTTGTGTCTGTGGGCAGTGCCAG
TGCACAGAGCCTGGAGCCTTTGGGGAGACGTGTGAGAAGTGCCCAACCTGCCCGGATGCTTGCAGCTCTAAGAGAG
ACTGTGTCGAATGCTTGCTACTTCACCAGGGGAAACCTGACAACCAGACCTGCCACCACCAGTGCAAAGATGAGGT
GATCACGTGGGTAGACACCATCGTCAAAGATGACCAGGAGGCTGTGCTTTGCTTCTACAAAACTGCTAAGGACTGC
GTTATGATGTTCAGCTACACAGAACTGCCCAATGGGAGGTCCAACTTGACGGTCCTCCGGGAGCCAGAATGTGGAA
GTGCCCCCAATGCCATGACCATCCTGCTGGCTGTGGTTGGCAGCATCCTCCTGATTGGGATGGCACTCCTGGCCAT
CTGGAAGCTGCTCGTCACCATCCACGACCGCCGAGAGTTTGCCAAGTTCCAAAGCCTCAAACCCCCTGTACAGAAA
GCCCATCTCCACACACACTGTCGATTTCGCCTTCAACAAGTTCAACAAATCCTACAATGGCTCAGTGGACTGAGGC
TCCTGGATGGCTGGAGGGGGACTAAGGATGAAGACTCTGGCGTGCCTTGGACTTCCTGGACCATTTGCTCACGCTA
GCTAGGCACGCACGGATAATGGAGATGCCCTCCATTGAGCCCTAAGGGACCTGGTAGCCACACAGCGGGCCACAGG
CACTTGGGGCCACTTCCCTCCAAGCCAGGGAAAGCAAGGAGACTCTGGTGTTCTCAGCTTCCCCTCTGCCGCCTCC
AGCTTGCTGTCTCCATGAACCTCTGAAGGCCTGGCTGCCCTCTTCCCTGCTGGGCCAGACAAGAAGGTATCCGGAA
GAGTCTGTGTGTACAAAGCTAGCGCGCAGCCTGGCTTTTTCCAGTTGATCGTTTTTTTTTCTATGAAATAAAAAGG
TCACGCATTTAAAAAAAAAAAAAAAA</IUPACna>
              </Seq-data_iupacna>
            </Seq-data>
          </Seq-inst_seq-data>
        </Seq-inst>
      </Bioseq_inst>
    </Bioseq>
  </Seq-entry_seq>
</Seq-entry>
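
To check programmatically which descriptors a record like the one above carries, something along these lines might help. It is only a minimal sketch in Python using the standard library's xml.etree.ElementTree. The function name summarize and the trimmed SAMPLE string are made up for illustration, and the element name Seqdesc_title is an assumption: it is where I would expect the definition line to appear if the ASN.1 title descriptor were present, not something seen in this output.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the record above: the GenBank Seq-id, the date fields and
# the sequence data are left out to keep the example short.
SAMPLE = """<?xml version="1.0"?>
<Seq-entry>
  <Seq-entry_seq>
    <Bioseq>
      <Bioseq_id>
        <Seq-id>
          <Seq-id_local>
            <Object-id>
              <Object-id_str>tmpseq_1</Object-id_str>
            </Object-id>
          </Seq-id_local>
        </Seq-id>
        <Seq-id>
          <Seq-id_gi>3095101</Seq-id_gi>
        </Seq-id>
      </Bioseq_id>
      <Bioseq_descr>
        <Seq-descr>
          <Seqdesc>
            <Seqdesc_molinfo>
              <MolInfo>
                <MolInfo_biomol value="mRNA">3</MolInfo_biomol>
              </MolInfo>
            </Seqdesc_molinfo>
          </Seqdesc>
          <Seqdesc>
            <Seqdesc_update-date>
              <Date/>
            </Seqdesc_update-date>
          </Seqdesc>
        </Seq-descr>
      </Bioseq_descr>
    </Bioseq>
  </Seq-entry_seq>
</Seq-entry>"""


def summarize(xml_text):
    """Return (gi, descriptor element names, title or None) for one record."""
    root = ET.fromstring(xml_text)
    gi = root.findtext(".//Seq-id_gi")
    # The first child of each <Seqdesc> names the kind of descriptor.
    kinds = [desc[0].tag for desc in root.iter("Seqdesc") if len(desc)]
    # Assumption: a definition line would show up as <Seqdesc_title>.
    title = root.findtext(".//Seqdesc_title")
    return gi, kinds, title


if __name__ == "__main__":
    gi, kinds, title = summarize(SAMPLE)
    print("GI:", gi)
    print("descriptors:", ", ".join(kinds))
    print("title:", title if title is not None else "(none found)")
```

Running the same function over the full downloaded file instead of SAMPLE would list every descriptor present, which should make it easier to tell whether the title is missing from this one entry or from the XML output in general.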
