[Bioperl-l] xml sequence download from ncbi

Geer, Lewis (NLM) lewisg@mail.nih.gov
Thu, 24 Aug 2000 10:40:01 -0400


Hi, David,

I'll ask that the first four bugs you list be fixed.  The last one is a bit
problematic, as some XML parsers don't seem to understand URLs and will only
load DTDs via the native filesystem.   It's unclear whether this is a
permanent problem/feature, but we'd rather not force anyone to rewrite
records.
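
Until the fix is in, a client-side workaround for the two header glitches
David reported might look like the minimal Python sketch below.  The function
name repair_prolog is hypothetical, and the sketch assumes the only damage is
the "<--?xml" opener and the extra hyphens in the DOCTYPE name; it is not a
general XML repair tool.

```python
import re


def repair_prolog(text: str) -> str:
    """Repair the two header glitches reported in this thread (hypothetical helper)."""
    # A leading "<--?xml" should be the XML declaration opener "<?xml".
    text = re.sub(r"\A<--\?xml", "<?xml", text, count=1)

    # Collapse runs of hyphens in the DOCTYPE name only,
    # e.g. "Seq---entry" -> "Seq-entry"; the public identifier is untouched.
    def fix_name(match: re.Match) -> str:
        return match.group(1) + re.sub(r"-{2,}", "-", match.group(2))

    return re.sub(r"(<!DOCTYPE\s+)(\S+)", fix_name, text, count=1)


sample = (
    '<--?xml version="1.0"?>\n'
    '<!DOCTYPE Seq---entry PUBLIC "-//NCBI//NCBI Seqset/EN" "NCBI_Seqset.dtd">\n'
    "<Seq-entry>\n"
)
fixed = repair_prolog(sample)
```

Running it on an already correct header leaves the text unchanged, so it
should be safe to keep in a download script after the server output is fixed.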

I'll see about putting the DTDs on the web site somewhere for download,
though.
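
With a downloaded copy, a tool could map whatever system identifier a record
carries onto the local filesystem instead of rewriting the record.  A minimal
Python sketch, assuming the DTDs sit in one local directory; the function
name local_dtd_path and the directory name are hypothetical:

```python
import os
from urllib.parse import urlparse


def local_dtd_path(system_id: str, dtd_dir: str) -> str:
    """Map a DOCTYPE system identifier to a file in a local DTD directory."""
    # Keep only the file name, whether the identifier is a bare
    # name ("NCBI_Seqset.dtd") or a full URL ending in that name.
    name = os.path.basename(urlparse(system_id).path)
    if not name:
        raise ValueError(f"no file name in system identifier: {system_id!r}")
    return os.path.join(dtd_dir, name)


# "dtds" is an assumed local directory holding the downloaded files.
path = local_dtd_path("NCBI_Seqset.dtd", "dtds")
```

The same call works unchanged if the identifier later becomes a URL, which is
the point of resolving by file name rather than by full identifier.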

Lewis

> -----Original Message-----
> From: Lapointe, David [mailto:David.Lapointe@umassmed.edu]
> Sent: Thursday, August 24, 2000 10:22 AM
> To: Geer, Lewis (NLM)
> Cc: 'bioperl-l@bioperl.org'
> Subject: RE: [Bioperl-l] xml sequence download from ncbi
> 
> 
> Lewis,
> 
> Great stuff!!  Two things I had a problem with. First, IE5 wanted to
> download to viewer.cgi, so I wonder whether the MIME type is not set (is
> there an XML MIME type? hmmm?). I downloaded it anyway, and the file
> (entrez.xml) had some errors.
> 
> Here are the first few lines as returned
> <--?xml version="1.0"?>
> <!DOCTYPE Seq---entry PUBLIC "-//NCBI//NCBI Seqset/EN" 
> "NCBI_Seqset.dtd">
> <Seq-entry>
> 
> The first line should be
> <?xml version="1.0"?>
> 
> In the second line there are too many '-' in Seq---entry, which should be
> <!DOCTYPE Seq-entry PUBLIC "-//NCBI//NCBI Seqset/EN" 
> "NCBI_Seqset.dtd">
> 
> to match the root element
> <Seq-entry>
> 
> Also, I had a problem resolving "NCBI_Seqset.dtd". Shouldn't there be a
> DTD URL, something like
> 
> <!DOCTYPE Seq-entry PUBLIC "-//NCBI//NCBI Seqset/EN"  
>                  "http://www.ncbi.nlm.nih.gov/../NCBI_Seqset.dtd">
> 
> /../ being some appropriate path.
> 
> 
> > -----Original Message-----
> > From: Geer, Lewis (NLM) [mailto:lewisg@mail.nih.gov]
> > Sent: Thursday, August 24, 2000 9:08 AM
> > To: Bioperl
> > Subject: [Bioperl-l] xml sequence download from ncbi
> > 
> > 
> > Hi, 
> > 
> > Sequence download using an xml format derived from our asn.1 standard
> > format is now available from Entrez.  For an example, try
> >
> > http://www.ncbi.nlm.nih.gov/entrez/viewer.cgi?cmd&save=on&view=xml&val=1827915
> >
> > where val is the sequence gi number.  Note that this xml output is based
> > on our asn.1 records, which are both complete and complex -- we may end up
> > making a genbank flatfile-like version, especially since there are small
> > mismatches between the asn.1 and xml languages that make the xml a bit more
> > complex than if xml were our native format.
> >
> > We'd be interested in seeing comments!
> >
> > Lewis
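
For scripted downloads, the example address above can be assembled from a gi
number.  A minimal Python sketch, assuming the query string stays exactly as
in that example; the function name entrez_xml_url is hypothetical:

```python
def entrez_xml_url(gi: int) -> str:
    """Build the Entrez viewer address for one sequence gi number."""
    if gi <= 0:
        raise ValueError("gi must be a positive integer")
    base = "http://www.ncbi.nlm.nih.gov/entrez/viewer.cgi"
    # Query string copied from the example in this thread; "cmd" has no value
    # there, so the string is formatted by hand rather than URL-encoded.
    return f"{base}?cmd&save=on&view=xml&val={gi}"


url = entrez_xml_url(1827915)
```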
_______________________________________________
Bioperl-l mailing list
Bioperl-l@bioperl.org
http://bioperl.org/mailman/listinfo/bioperl-l