[Bioperl-l] Scanning GenPept entries does not work

Henrik.Seidel@schering.de Henrik.Seidel@schering.de
Mon, 30 Oct 2000 18:28:12 +0100


How about this? Here is what the patch does:

1.) If it finds a LOCUS line with "aa" instead of "bp", it sets the molecule to
'protein'.

2.) GenPept files do not contain the "BASE" line, which until now was used to
detect the end of the FEATURES table. Hence both "BASE" _and_ "ORIGIN" are now
valid terminators of the feature table.

What do you think? It seems to work for me.
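
The two changes described above can be sketched outside of Bioperl as follows. This is only an illustration, not the patched Bio::SeqIO::genbank code: the helper names parse_locus and is_feature_table_end are hypothetical, and the two sample LOCUS lines are made up. The regular expressions follow the ones used in the attached patch.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::SeqIO::genbank): split a LOCUS line
# into name, molecule, division and date.
sub parse_locus {
    my ($line) = @_;
    if ($line =~ /^LOCUS\s+(\S+)\s+\S+\s+bp\s+(\S+)\s+(\S+)\s+(\S+)/) {
        # Nucleotide entry: the molecule type follows the "bp" length unit.
        return { name => $1, mol => $2, div => $3, date => $4 };
    }
    if ($line =~ /^LOCUS\s+(\S+)\s+\S+\s+aa\s+(\S+)\s+(\S+)/) {
        # GenPept entry: there is no molecule column, so use 'protein'.
        return { name => $1, mol => 'protein', div => $2, date => $3 };
    }
    die "LOCUS line of unknown format. Got $line";
}

# Hypothetical helper: true if the line terminates the feature table.
# GenBank entries reach "BASE" first; GenPept entries go straight to "ORIGIN".
sub is_feature_table_end {
    my ($line) = @_;
    return $line =~ /^(?:BASE|ORIGIN)/ ? 1 : 0;
}

# Made-up sample lines, for illustration only.
my $nuc = parse_locus('LOCUS       HSU12345     1234 bp    mRNA            PRI       30-OCT-2000');
my $pep = parse_locus('LOCUS       AAB12345      321 aa                    PRI       30-OCT-2000');

print "$nuc->{name}: $nuc->{mol}\n";
print "$pep->{name}: $pep->{mol}\n";
print "ORIGIN ends the feature table\n" if is_feature_table_end('ORIGIN');
```

Note that when the feature loop stops on "ORIGIN", that line has already been consumed, so the code that skips ahead to "ORIGIN" must not run again. The patch guards that loop with a check on the buffered line for this reason.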

Regards

Henrik

(See attached file: genbank.pm.diff)

--- Bio/SeqIO/genbank.pm.sav	Mon Oct 30 16:40:42 2000
+++ Bio/SeqIO/genbank.pm	Mon Oct 30 17:20:29 2000
@@ -182,23 +182,27 @@
     }
     
     $line =~ /^LOCUS\s+\S+/ || $self->throw("GenBank stream with no LOCUS. Not GenBank in my book. Got $line");
-    $line =~ /^LOCUS\s+(\S+)\s+\S+\s+bp\s+(\S+)\s+(\S+)\s+(\S+)/;
+    if ($line =~ /^LOCUS\s+(\S+)\s+\S+\s+bp\s+(\S+)\s+(\S+)\s+(\S+)/) {
+    	($name, $mol, $div, $date) = ($1, $2, $3, $4);
+    } elsif ($line =~ /^LOCUS\s+(\S+)\s+\S+\s+aa\s+(\S+)\s+(\S+)/) {
+        ($name, $div, $date) = ($1, $2, $3);
+	$mol = 'protein';
+    } else {
+	$self->throw("GenBank stream with LOCUS line of unknown format. Got $line");
+    }
 
-    $name = $1;
     # this is important to have the id for display in e.g. FTHelper, otherwise
     # you won't know which entry caused an error
     $seq->display_id($name);
-    $mol=$2; 
+
     if ($mol) {
 	$seq->molecule($mol);
     }
     
-    $div=$3;
     if ($div) {
 	$seq->division($div);
     }
     
-    $date=$4;
     if ($date) {
 	$seq->add_date($date);
     }
@@ -279,6 +283,9 @@
     # following block and loop fixed by
     # HL <Hilmar.Lapp@pharma.novartis.com>, 05/05/2000
     # see comments
+    
+    # added support for GenPept files in GenBank format by
+    # HS <Henrik.Seidel@schering.de>, 2000-10-30
 
     $buffer = $self->_readline;
     FEATURE_TABLE :
@@ -286,7 +293,7 @@
 	# effect in _read_FTHelper_GenBank!
 	while( defined($buffer) ) {
 	    # check immidiately -- not at the end of the loop
-	    last if($buffer =~ /^BASE/);
+	    last if($buffer =~ /^(?:BASE|ORIGIN)/);
 	    # slurp in one feature at a time -- at return, the start of
 	    # the next feature will have been read already, so we need
 	    # to pass a reference, and the called method must set this
@@ -296,8 +303,10 @@
 	    $ftunit->_generic_seqfeature($seq);
 	}
     $seqc = "";	
-    while (defined( $_ = $self->_readline)) {
-	last if /^ORIGIN/;
+    unless ($buffer =~ /^ORIGIN/) {
+	while (defined( $_ = $self->_readline)) {
+	    last if /^ORIGIN/;
+	}
     }
 
     while( defined($_ = $self->_readline) ) {