[Biopython] Pubmeddata XML parsing with Entrez .fetch and .read

Michiel de Hoon mjldehoon at yahoo.com
Thu Jul 15 13:36:19 UTC 2010



--- On Thu, 7/15/10, Peter <biopython at maubp.freeserve.co.uk> wrote:
> This is why I was suggesting to Michiel that we override
> the __repr__ for our subclassed objects, so that rather
> than seeing things like this:
> 
> ['btp163', '10.1093/bioinformatics/btp163', '19304878',
> 'PMC2682512']
> 
> we get something like:
> 
> ListElement(['btp163', '10.1093/bioinformatics/btp163',
> '19304878', 'PMC2682512'], attributes={...})
> 
> On deeper reflection, the trouble with this is that all the
> children within the list would get longer, so the full
> representation of a ListElement (or any container) would
> become very, very long - swamping the console output.

The attributes are almost always only a small fraction of the Entrez XML file. So while it's true that each element gets larger, it's a small relative increase. The elements that are very long after adding the attributes are also very long without the attributes. So I am in favor of your original suggestion. If there are no other suggestions, I'll make the change in Bio.Entrez over the weekend (or feel free to do so before that).
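To make the proposal concrete, here is a minimal sketch of what such an override might look like. It is not the actual Bio.Entrez code: the constructor signature and the attribute key/value are made up for illustration, and only the output format follows Peter's example above.

```python
class ListElement(list):
    """Hypothetical stand-in for a list subclass that carries XML attributes."""

    def __init__(self, items=(), attributes=None):
        super().__init__(items)
        # Assumption: attributes are kept in a plain dict on the instance.
        self.attributes = dict(attributes or {})

    def __repr__(self):
        # Reuse the plain list representation and append the attributes.
        return "ListElement(%s, attributes=%r)" % (
            list.__repr__(self),
            self.attributes,
        )


ids = ListElement(
    ["btp163", "10.1093/bioinformatics/btp163", "19304878", "PMC2682512"],
    attributes={"example": "value"},  # placeholder attribute, not real Entrez data
)
print(repr(ids))

# Nested containers show the attributes of every child as well, which is
# the growth in output length discussed above.
outer = ListElement([ids], attributes={})
print(len(list.__repr__(ids)), len(repr(ids)), len(repr(outer)))
```

Because list.__repr__ calls repr() on each child, nested elements pick up the longer form automatically; the extra length per element is just the class name plus its attributes.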

Best,
--Michiel


      


