[Bioperl-l] Bio/SeqIO/genbank.pm patch
Erik
er at xs4all.nl
Thu Nov 16 21:01:12 UTC 2006
Hi all,
Using bioperl-live, I noticed a problem with the parsing in
Bio/SeqIO/genbank.pm.
It occurs in the DBSOURCE section, where the 'dblink' annotation gets its
values. Several values came out with a double colon, e.g.
InterPro::IPR011000. Not all 'dblink' values were affected.
Here is a patch that seems to fix it (it works for me):
=======
--- Bio/SeqIO/genbank.pm.orig 2006-11-16 18:33:30.060417520 +0100
+++ Bio/SeqIO/genbank.pm 2006-11-16 20:29:59.014934936 +0100
@@ -504,7 +504,7 @@
my $db;
# this is because GenBank dropped the spaces!!!
# I'm sure we're not going to get this right
- if( $id =~ s/^(EchoBASE|IntAct|SWISS-2DPAGE|ECO2DBASE|ECOGENE|TIGRFAMs|TIGR|GO|InterPro|Pfam|PROSITE|SGD|GermOnline|HSSP|PhosSite)//i ) {
+ if( $id =~ s/^(EchoBASE|IntAct|SWISS-2DPAGE|ECO2DBASE|ECOGENE|TIGRFAMs|TIGR|GO|InterPro|Pfam|PROSITE|SGD|GermOnline|HSSP|PhosSite)://i ) {
$db = $1;
}
$annotation->add_Annotation
=======
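To illustrate what the one-character change does, here is a minimal standalone sketch. It is not the code from genbank.pm: the helper split_dblink and the sample input strings are made up for illustration, and it assumes that the database name and id are later shown joined by a colon. Only the alternation is copied from the patch.

```perl
use strict;
use warnings;

# Database names copied from the patch. The input strings used below are
# assumed examples of DBSOURCE-style values, not taken from a real record.
my $names = 'EchoBASE|IntAct|SWISS-2DPAGE|ECO2DBASE|ECOGENE|TIGRFAMs|TIGR|GO|'
          . 'InterPro|Pfam|PROSITE|SGD|GermOnline|HSSP|PhosSite';

# Hypothetical helper: split a raw value into (database, id).
# $strip_colon = 0 mimics the old regex, 1 mimics the patched regex.
sub split_dblink {
    my ($id, $strip_colon) = @_;
    my $db;
    my $re = $strip_colon ? qr/^($names):/i : qr/^($names)/i;
    if ( $id =~ s/$re// ) {
        $db = $1;
    }
    return ($db, $id);
}

for my $raw ('InterPro:IPR011000', 'Pfam:PF00038') {
    my ($old_db, $old_id) = split_dblink($raw, 0);
    my ($new_db, $new_id) = split_dblink($raw, 1);
    # Assumption: the value is displayed as "database:id".
    print 'old: ', join(':', $old_db, $old_id),
          '   new: ', join(':', $new_db, $new_id), "\n";
}
```

With the old regex the colon stays at the front of the id (":IPR011000"), so joining database and id gives "InterPro::IPR011000". With the patched regex the colon is consumed together with the database name and the id comes out clean.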
I also wrote a few tests for the problem, which needed an extra data file
in t/data.
I will attach the lot.
hth,
Erik
-------------- next part --------------
A non-text attachment was scrubbed...
Name: P35527.gb
Type: application/octet-stream
Size: 14346 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061116/de8ee1cd/attachment-0012.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: genbank.t.diff
Type: application/octet-stream
Size: 2562 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061116/de8ee1cd/attachment-0013.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: genbank.pm.diff
Type: application/octet-stream
Size: 608 bytes
Desc: not available
URL: <http://lists.open-bio.org/pipermail/bioperl-l/attachments/20061116/de8ee1cd/attachment-0014.obj>