[Biopython-dev] [Biopython - Bug #3315] Bio.SwissProt fails parsing .dat dumps

redmine at redmine.open-bio.org redmine at redmine.open-bio.org
Mon Dec 12 14:57:21 UTC 2011


Issue #3315 has been updated by Peter Cock.


The database problem is visible at http://www.uniprot.org/uniprot/C6KIH8.txt where the line is just: RX   DOI=DOI;

You said you'd reported this record (C6KIH8_AURAN) to SwissProt/UniProt, as well as other problems in the past, so this is a recurring problem.

Regarding the proposed fix: not quite. We need to use the warnings module rather than a print statement.
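
To illustrate why the warnings module is preferable to print here, a minimal sketch follows. The helper name report_bad_rx is hypothetical and is not part of Bio.SwissProt; the point is only that a warning lets the calling code choose between silence and a hard failure, which a print statement cannot do.

```python
import warnings


def report_bad_rx(line):
    """Hypothetical helper: flag a suspect RX line without aborting."""
    warnings.warn("Possibly corrupt RX line: %r" % line, UserWarning)


# Callers decide what the warning means for them.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # bulk run: stay quiet
    report_bad_rx("RX   DOI=DOI;")

with warnings.catch_warnings():
    warnings.simplefilter("error")  # strict run: escalate to an exception
    try:
        report_bad_rx("RX   DOI=DOI;")
    except UserWarning as exc:
        strict_message = str(exc)
```

With the default filters the same call would simply write one message to stderr and let parsing continue.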

I'm looking at it, but have to download the latest uniprot_trembl.dat first (last month's was fine, as was uniprot_sprot.dat both this month and last month).
----------------------------------------
Bug #3315: Bio.SwissProt fails parsing .dat dumps
https://redmine.open-bio.org/issues/3315

Author: Leszek Pryszcz
Status: New
Priority: Normal
Assignee: Biopython Dev Mailing List
Category: Main Distribution
Target version: 
URL: 


The SwissProt module fails when parsing the .dat dump of Uniprot_trembl version 201111.
The error is due to corrupted RX lines in the .dat file for Aureococcus anophagefferens (e.g. C6KIH8_AURAN):
> RX   DOI=DOI; 10.1111/j.1529-8817.2010.00841.x;

I have reported the problem. However, this has happened before: I previously reported a similar issue in releases 201010, 201011, 201012...
> RX   DOI=10.1098/rspb= .2010.1301;

Would it be possible to alter the error handling in Bio.SwissProt._read_rx so that the module warns about a corrupted entry instead of failing the whole parse?
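
As a rough sketch of the behaviour requested above (not the actual Bio.SwissProt._read_rx code), a tolerant RX parser might look like the following. The names parse_rx_value and MalformedRXWarning are invented for illustration, and the checks for what counts as "corrupt" are assumptions based only on the two broken lines quoted in this report.

```python
import warnings


class MalformedRXWarning(UserWarning):
    """Hypothetical warning category for RX fields that cannot be parsed."""


def parse_rx_value(value):
    """Return (database, identifier) pairs from the text after 'RX   '.

    Fields that do not look like KEY=IDENTIFIER are skipped with a
    warning instead of raising, so one bad record does not stop the run.
    """
    references = []
    for field in value.rstrip(";").split(";"):
        field = field.strip()
        if not field:
            continue
        key, sep, ident = field.partition("=")
        # Assumed signs of corruption: no '=', an empty identifier,
        # a second '=' inside the identifier, or a repeated key (DOI=DOI).
        if not sep or not ident or "=" in ident or ident == key:
            warnings.warn(
                "Possibly corrupt RX field: %r" % field, MalformedRXWarning
            )
            continue
        references.append((key, ident))
    return references
```

Under these assumptions, both "DOI=DOI; 10.1111/j.1529-8817.2010.00841.x;" and "DOI=10.1098/rspb= .2010.1301;" would yield an empty list plus warnings, and parsing of the remaining records could continue.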

