[Bioperl-l] Species name validation problem
David Waner
dwaner at scitegic.com
Sat Mar 25 00:19:04 UTC 2006
I have found that Bio::Seq->new() throws exceptions on some "species"
names containing special characters, or consisting of a single letter,
e.g.:
SwissProt: POLN_ONNVG O'nyong-nyong virus
SwissProt: FIBP_ADE1H Human adenovirus 15/H9
SwissProt: POLG_FMDVZ Foot-and-mouth disease virus (strain A22/550 Azerbaijan 65)
SwissProt: RIR1_BHV1C Bovine herpesvirus 1.1
SwissProt: SODF_METJ Methylomonas J
GenBank: AJ416726 Stylosanthes aff. calcicola
It seems that the regex in validate_species_name() is too restrictive,
but I can't find a way to turn off validation without editing bioperl
modules. There has been some recent discussion of this issue on the
mailing list (see below). Does anyone know if or when a
-validate_species option to Bio::Seq->new() will be added? Or should I
just propose the code change?
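
To make the request concrete, here is a minimal sketch of the kind of switch I mean, written in Python rather than Perl so that it stands apart from the real modules. Everything in it is an assumption for illustration: the pattern is only a stand-in for whatever validate_species_name() actually uses, and check_species_name() and its validate flag are hypothetical names, not BioPerl API.

```python
import re

# Hypothetical strict pattern: a letter followed by one or more word
# characters, whitespace or hyphens. This is an assumed stand-in, NOT the
# regex copied from the BioPerl source.
STRICT_NAME = re.compile(r"[A-Za-z][\w\s-]+")


class InvalidSpeciesName(ValueError):
    """Raised when a name fails the strict pattern and validation is on."""


def check_species_name(name, validate=True):
    """Return the name unchanged; raise only if validation is enabled
    and the whole string does not match the strict pattern."""
    if validate and STRICT_NAME.fullmatch(name) is None:
        raise InvalidSpeciesName(f"Invalid species name '{name}'")
    return name


def rejected(names):
    """List the names the strict pattern would refuse."""
    return [n for n in names if STRICT_NAME.fullmatch(n) is None]


if __name__ == "__main__":
    samples = ["sapiens", "O'nyong-nyong virus", "Human adenovirus 15/H9", "J"]
    for bad in rejected(samples):
        print("rejected:", bad)
    # With the check switched off, the same string is accepted as-is.
    print(check_species_name("O'nyong-nyong virus", validate=False))
```

Under this assumed pattern the apostrophe, the slash and the single-letter name are all refused, while passing validate=False lets the caller keep the name exactly as it appears in the record.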
Thanks,
David Waner
> Stefan Kirov skirov at utk.edu
> Wed Sep 21 08:46:05 EDT 2005
>
> Thanks for the great answer Hilmar!
> I would prefer to have some kind of a check if the user wishes so. For
> example Entrezgene file contains some HTML tags in some entries species
> names which is good to know.
> I will put an option -validate_species in the constructor to turn the
> check on and off. Maybe a species filter can be of some use as well,
> though you can just select the correct file from the NCBI site....
> Thanks again!
> Stefan
>