[Bioperl-l] Swiss-prot species cont.

Karger, Amir AKarger@CuraGen.com
Thu, 12 Jul 2001 16:56:04 -0400


OK. Feeling masochistic, I ran "grep '^OS' seq.dat | sort -u" on Swiss-prot.
In case my last email didn't give you enough to worry about, there's also:

(1) OS   and Notothenia angustata (Rockcod)
(2) OS   snake).

If I'm not mistaken:
(1) yields a genus of "and" (overwriting the correct (first) genus set on
the previous OS line)
(2) yields a genus of "snake" (overwriting...)

Seems like at least some problems would be solved by concatenating all OS
lines in an entry (stripping the tag from each with s/^OS\s+//) and only
then doing the analysis. You'll still have problems with >1 species, but I
think you'll at least just get the first species, rather than a nonsensical
answer.
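
To make the idea concrete, here is a rough sketch in Python (not the actual
Bioperl parser, just an illustration of joining first and parsing second).
The function names are made up for this example, and the first OS line in
the sample entry is a placeholder I invented; only the second line is one of
the real lines quoted above. The "first two words are genus and species"
rule is also a simplifying assumption.

```python
import re

def join_os_lines(record_lines):
    """Strip the 'OS' tag from every OS line and join them into one string."""
    parts = [re.sub(r"^OS\s+", "", line).strip()
             for line in record_lines
             if line.startswith("OS")]
    return " ".join(parts)

def first_genus_species(os_text):
    """Naively treat the first two words as genus and species.

    Assumption for illustration only: real OS lines need more care
    (viruses, strains, multiple organisms, and so on).
    """
    words = os_text.split()
    if len(words) < 2:
        return None
    return words[0], words[1].rstrip(".,")

# Hypothetical entry: the first OS line is an invented placeholder,
# the second is the continuation line quoted in this message.
entry = [
    "ID   EXAMPLE_ENTRY",
    "OS   Examplegenus examplespecies (Example fish),",
    "OS   and Notothenia angustata (Rockcod)",
]

# Parsing line by line: the continuation line produces a bogus genus.
per_line = first_genus_species(re.sub(r"^OS\s+", "", entry[2]))

# Joining first: the first organism in the entry wins.
joined = first_genus_species(join_os_lines(entry))
```

With this sample entry, per_line comes out as ("and", "Notothenia"), which
is the overwrite problem described above, while joined is the genus and
species from the first OS line. Entries with more than one organism still
only report the first one.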

Amir Karger
Curagen Corporation