[Bioperl-l] Species name validation problem
David Waner
dwaner at scitegic.com
Mon Mar 27 18:24:12 UTC 2006
Yes, I meant to type Bio::Species, not Bio::Seq. Sorry for the
confusion.
My problem is that I am not calling $species->classification() directly;
I am calling Bio::Species->new(), which in turn calls classification()
which calls validate_species_name(), which then throws an exception on
some species names. As far as I can see, there is no way to turn off
this (over-aggressive) validation in the Species constructor.
I guess that instead of this:
    $species = Bio::Species->new(-classification => \@classificationArray);
I could do this:
    $species = Bio::Species->new();
    $species->classification(\@classificationArray, 'no validation');
but it would make a nicer interface to have a validation option in the
Species constructor.
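
A minimal sketch of that workaround wrapped in a helper, so calling code does not have to repeat the two steps. The name species_without_validation and its optional $class parameter are hypothetical and not part of BioPerl; the only behavior relied on is the one Hilmar describes below (an array ref plus any true second argument to classification() skips the name check), which may differ in other BioPerl versions.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl.
# Builds a species object without -classification in the constructor,
# then sets the classification with validation skipped.
# $class defaults to Bio::Species; it is a parameter only so the helper
# can be exercised with a stand-in class when BioPerl is not installed.
sub species_without_validation {
    my ($classification, $class) = @_;
    $class ||= 'Bio::Species';

    # Load the module on demand if the class is not already defined.
    unless ($class->can('new')) {
        (my $file = "$class.pm") =~ s{::}{/}g;
        require $file;
    }

    # No -classification here, so the constructor has no name to validate.
    my $species = $class->new();

    # Assumption from this thread: any true second argument makes
    # classification() skip validate_species_name().
    $species->classification($classification, 1);

    return $species;
}
```

With the array from the example above, the call would be species_without_validation(\@classificationArray).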
- David
-----Original Message-----
From: Hilmar Lapp [mailto:hlapp at gmx.net]
Sent: Friday, March 24, 2006 9:42 PM
To: David Waner
Cc: Bioperl-l at lists.open-bio.org
Subject: Re: [Bioperl-l] Species name validation problem
The option would be in Bio::Species, not Bio::Seq. You can circumvent
the name validation by passing an array ref to
$species->classification() and anything that evaluates to true as the
second argument. This is for instance what the genbank parser does
(which doesn't mean that it is always correct); supposedly the swissprot
parser ought to do the same.
-hilmar
On 3/24/06, David Waner <dwaner at scitegic.com> wrote:
> I have found that Bio::Seq->new() throws exceptions on some "species"
> names containing special characters, or consisting of a single letter,
> e.g.:
>
> SwissProt: POLN_ONNVG O'nyong-nyong virus
> SwissProt: FIBP_ADE1H Human adenovirus 15/H9
> SwissProt: POLG_FMDVZ Foot-and-mouth disease virus (strain A22/550 Azerbaijan 65)
> SwissProt: RIR1_BHV1C Bovine herpesvirus 1.1
> SwissProt: SODF_METJ Methylomonas J
> GenBank: AJ416726 Stylosanthes aff. calcicola
>
> It seems that the regex in validate_species_name() is too restrictive,
> but I can't find a way to turn off validation without editing bioperl
> modules. There has been some recent discussion of this issue on the
> mailing list (see below). Does anyone know if or when a
> -validate_species option to Bio::Seq->new() will be added? Or should I
> just propose the code change?
>
> Thanks,
> David Waner
>
>
> > Stefan Kirov skirov at utk.edu
> > Wed Sep 21 08:46:05 EDT 2005
> >
> >
> > ------------------------------------------------------------------------
> >
> > Thanks for the great answer Hilmar!
> > I would prefer to have some kind of a check if the user wishes so. For
> > example Entrezgene file contains some HTML tags in some entries species
> > names which is good to know.
> > I will put an option -validate_species in the constructor to turn
> > the check on and off. Maybe a species filter can be of some use as
> > well. though you can just select the correct file from the NCBI
> > site.... Thanks again! Stefan
> >
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>
--
----------------------------------------------------------
: Hilmar Lapp -:- San Diego, CA -:- hlapp at gmx dot net :
----------------------------------------------------------