[Bioperl-l] xml sequence download from ncbi
Geer, Lewis (NLM)
lewisg@mail.nih.gov
Thu, 24 Aug 2000 10:40:01 -0400
Hi, David,
I'll ask that the first four bugs you list be fixed. The last one is a bit
problematic, as some XML parsers don't seem to understand URLs and
will only load DTDs from the native filesystem. It's unclear whether this
is a permanent problem/feature, but we'd rather not force anyone to rewrite
records.
I'll see about putting the DTDs on the web site somewhere for download,
though.
Lewis
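
Until the fixes go in, a downloaded record can be patched locally before parsing. The following is only a minimal Python sketch, not anything NCBI provides: it undoes the two header glitches reported in the quoted message below and, optionally, points the DOCTYPE's system identifier at a local copy of the DTD for parsers that only read from the filesystem. The function names and the path /local/path/NCBI_Seqset.dtd are hypothetical, and the sample record is a stub built from the header lines quoted in this thread.

```python
import re
import xml.etree.ElementTree as ET

# Stub record using the header lines quoted in this thread (body is empty).
SAMPLE = (
    '<--?xml version="1.0"?>\n'
    '<!DOCTYPE Seq---entry PUBLIC "-//NCBI//NCBI Seqset/EN"\n'
    '"NCBI_Seqset.dtd">\n'
    '<Seq-entry></Seq-entry>\n'
)

def repair_header(text: str) -> str:
    """Undo the two header glitches reported in the thread."""
    # '<--?xml' at the start of the record should be '<?xml'.
    text = re.sub(r'^\s*<--\?xml', '<?xml', text, count=1)
    # 'Seq---entry' in the DOCTYPE should match the root element 'Seq-entry'.
    text = re.sub(r'(<!DOCTYPE\s+)Seq-+entry', r'\1Seq-entry', text, count=1)
    return text

def use_local_dtd(text: str, dtd_path: str) -> str:
    """Replace the DOCTYPE system identifier with dtd_path (hypothetical path)."""
    pattern = r'(<!DOCTYPE\s+\S+\s+PUBLIC\s+"[^"]*"\s+)"[^"]*"'
    # A function replacement avoids backslash surprises in Windows-style paths.
    return re.sub(pattern, lambda m: m.group(1) + '"' + dtd_path + '"',
                  text, count=1)

if __name__ == "__main__":
    fixed = use_local_dtd(repair_header(SAMPLE), "/local/path/NCBI_Seqset.dtd")
    print(fixed)
    # ElementTree does not fetch the external DTD; this only checks well-formedness.
    print(ET.fromstring(fixed).tag)
```

This edits the text of the record rather than configuring a parser, so it works regardless of which XML parser is used afterwards; the cost is that each record is rewritten, which is the thing we'd rather not force on anyone.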
> -----Original Message-----
> From: Lapointe, David [mailto:David.Lapointe@umassmed.edu]
> Sent: Thursday, August 24, 2000 10:22 AM
> To: Geer, Lewis (NLM)
> Cc: 'bioperl-l@bioperl.org'
> Subject: RE: [Bioperl-l] xml sequence download from ncbi
>
>
> Lewis,
>
> Great stuff!! Two things I had a problem with. First, IE5 wanted to
> download to viewer.cgi, so I wonder if the MIME type is not set (is there
> an XML MIME type? hmmm?). I downloaded anyway, and the file (entrez.xml)
> had some errors.
>
> Here are the first few lines as returned
> <--?xml version="1.0"?>
> <!DOCTYPE Seq---entry PUBLIC "-//NCBI//NCBI Seqset/EN"
> "NCBI_Seqset.dtd">
> <Seq-entry>
>
> The first line should be
> <?xml version="1.0"?>
>
> In the second line there are too many '-' in Seq---entry,
> which should be
> <!DOCTYPE Seq-entry PUBLIC "-//NCBI//NCBI Seqset/EN"
> "NCBI_Seqset.dtd">
>
> to match the root element
> <Seq-entry>
>
> Also I had a problem resolving "NCBI_Seqset.dtd". Shouldn't there be a
> DTD URL, something like
>
> <!DOCTYPE Seq-entry PUBLIC "-//NCBI//NCBI Seqset/EN"
> "http://www.ncbi.nlm.nih.gov/../NCBI_Seqset.dtd">
>
> /../ being some appropriate path.
>
>
> > -----Original Message-----
> > From: Geer, Lewis (NLM) [mailto:lewisg@mail.nih.gov]
> > Sent: Thursday, August 24, 2000 9:08 AM
> > To: Bioperl
> > Subject: [Bioperl-l] xml sequence download from ncbi
> >
> >
> > Hi,
> >
> > Sequence download using an xml format derived from our asn.1
> > standard format
> > is now available from Entrez. For an example, try
> >
> > http://www.ncbi.nlm.nih.gov/entrez/viewer.cgi?cmd&save=on&view=xml&val=1827915
> >
> > where val is the sequence gi number. Note that this xml output is based
> > on our asn.1 records which are both complete and complex -- we may end up
> > making a genbank flatfile-like version, especially since there are small
> > mismatches between the asn.1 and xml languages that make the xml a bit more
> > complex than if xml was our native format.
> >
> > We'd be interested in seeing comments!
> >
> > Lewis
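
For scripted downloads, the example URL above can be assembled from a gi number. This is a minimal Python sketch that only builds the URL string in the form quoted in the announcement; the function name entrez_xml_url is hypothetical, and whether the viewer.cgi endpoint still answers in this form is not something the sketch checks.

```python
from urllib.parse import urlencode

# Base address taken from the example URL in the announcement.
BASE = "http://www.ncbi.nlm.nih.gov/entrez/viewer.cgi"

def entrez_xml_url(gi: int) -> str:
    """Build the XML download URL for a sequence gi number."""
    # 'cmd' appears without a value in the example, so it is kept as a bare key.
    query = "cmd&" + urlencode({"save": "on", "view": "xml", "val": gi})
    return f"{BASE}?{query}"

if __name__ == "__main__":
    print(entrez_xml_url(1827915))
```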
_______________________________________________
Bioperl-l mailing list
Bioperl-l@bioperl.org
http://bioperl.org/mailman/listinfo/bioperl-l