[Bioperl-l] multiple species in embl
Heikki Lehvaslaiho
heikki at ebi.ac.uk
Tue Jul 13 09:34:12 EDT 2004
Laurie,
By two species, do you mean hybrid animals? That is the only case where there
should be more than one species in EMBL entries:
http://www.ebi.ac.uk/embl/Documentation/User_manual/usrman.html#3.4.7
Even in that case the OC line is there only for the first species.
I am not quite sure what bioperl should return in that case. Returning two
species objects sounds a bit excessive when the second one is not fully
populated ...
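To make the concern concrete, here is a rough sketch in plain Perl (not
Bioperl code) of what a parser would end up with. The entry fragment, the
species names and the parse_species helper are all invented for
illustration, and the layout simply assumes what is described above: two OS
lines, with OC lines present only after the first one.

```perl
use strict;
use warnings;

# Hypothetical entry fragment, made up for this example.
# Assumed layout: OC lines follow only the first OS line.
my $entry = <<'END';
OS   Examplea prima
OC   Eukaryota; Examplidae; Examplea.
OS   Examplea secunda
END

# parse_species is a hypothetical helper, not part of Bioperl.
# Simplification: real OS/OC values can wrap over several lines.
sub parse_species {
    my ($text) = @_;
    my @species;
    for my $line (split /\n/, $text) {
        if ($line =~ /^OS\s+(.+?)\s*$/) {
            # every OS line starts a new species record
            push @species, { name => $1, classification => [] };
        }
        elsif ($line =~ /^OC\s+(.+?)\s*$/ && @species) {
            # OC terms are attached to the most recent OS line
            my $oc = $1;
            $oc =~ s/\.$//;    # drop the trailing full stop
            push @{ $species[-1]{classification} }, split /;\s*/, $oc;
        }
    }
    return @species;
}

my @found = parse_species($entry);
for my $sp (@found) {
    printf "%s: %d classification terms\n",
        $sp->{name}, scalar @{ $sp->{classification} };
}
```

The second record comes back with a name but an empty classification, which
is the "not fully populated" case.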
It is a long known problem that SWISS-PROT format allows multiple species per
entry. Bioperl has been taking in only one; the first, I think.
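Whether a parser ends up with the first species, the last one, or all of
them comes down to how the value is stored each time a species line is
seen. A minimal sketch, using made-up species names and plain hashes rather
than the real embl.pm or swiss.pm internals:

```perl
use strict;
use warnings;

# Made-up names standing in for what a parser sees, in file order.
my @seen = ('Examplea prima', 'Examplea secunda');

my (%last, %first, %all);
for my $species (@seen) {
    # plain assignment: each new value overwrites the previous one
    $last{'-species'} = $species;

    # assign only while unset: the first value wins
    $first{'-species'} = $species unless defined $first{'-species'};

    # push onto an array reference: every value is kept
    push @{ $all{'-species'} }, $species;
}

print "assignment keeps: $last{'-species'}\n";
print "first-only keeps: $first{'-species'}\n";
print "push keeps:       @{ $all{'-species'} }\n";
```

The push form is the one proposed in the quoted message below; note that it
changes the value from a single item to an array reference, so whatever
receives -species has to accept both.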
Could you send us some EMBL accession numbers with two species, please, so
that we could have a look.
-Heikki
P.S. These kinds of long bug reports and file attachments go best into bioperl
bugzilla: http://bugzilla.open-bio.org/. They are easier to manage there.
Thanks,
-H
On Monday 12 Jul 2004 18:00, Laure.Durufle at serono.com wrote:
> Hi,
>
> I noticed something: in the package embl.pm, the method species
> returns only the last organism, but in EMBL one entry can belong to 2
> organisms.
> I wrote a method get_species in RichSeq.pm to obtain all organisms, and in
> embl.pm we use push @{$params{'-species'}},$species ; instead of
> $params{'-species'} = $species ;
>
> # $Id: RichSeq.pm,v 1.9 2002/11/11 18:16:31 lapp Exp $
> #
> # BioPerl module for Bio::Seq::RichSeq
> #
> # Cared for by Ewan Birney <birney at ebi.ac.uk>
> #
> # Copyright Ewan Birney
> #
> # You may distribute this module under the same terms as perl itself
>
> # POD documentation - main docs before the code
>
> =head1 NAME
>
> Bio::Seq::RichSeq - Module implementing a sequence created from a rich
> sequence database entry
>
> =head1 SYNOPSIS
>
> See Bio::Seq::RichSeqI and documentation of methods.
>
> =head1 DESCRIPTION
>
> This module implements Bio::Seq::RichSeqI, an interface for sequences
> created from or created for entries from/of rich sequence databanks,
> like EMBL, GenBank, and SwissProt. Methods added to the Bio::SeqI
> interface therefore focus on databank-specific information. Note that
> not every rich databank format may use all of the properties provided.
>
> =head1 Implemented Interfaces
>
> This class implements the following interfaces.
>
> =over 4
>
> =item Bio::Seq::RichSeqI
>
> Note that this includes implementing Bio::PrimarySeqI and Bio::SeqI.
>
> =item Bio::IdentifiableI
>
> =item Bio::DescribableI
>
> =item Bio::AnnotatableI
>
> =back
>
> =head1 FEEDBACK
>
> =head2 Mailing Lists
>
> User feedback is an integral part of the evolution of this
> and other Bioperl modules. Send your comments and suggestions preferably
> to one of the Bioperl mailing lists.
> Your participation is much appreciated.
>
> bioperl-l at bioperl.org - General discussion
> http://bio.perl.org/MailList.html - About the mailing lists
>
> =head2 Reporting Bugs
>
> Report bugs to the Bioperl bug tracking system to help us keep track of
> the bugs and their resolution.
> Bug reports can be submitted via email or the web:
>
> bioperl-bugs at bio.perl.org
> http://bugzilla.bioperl.org/
>
> =head1 AUTHOR - Ewan Birney
>
> Email birney at ebi.ac.uk
>
> Describe contact details here
>
> =head1 APPENDIX
>
> The rest of the documentation details each of the object methods. Internal
> methods are usually preceded with a _
>
> =cut
>
>
> # Let the code begin...
>
>
> package Bio::Seq::RichSeq;
> use vars qw($AUTOLOAD @ISA);
> use strict;
>
> # Object preamble - inherits from Bio::Root::Object
>
> use Bio::Seq;
> use Bio::Seq::RichSeqI;
> use Data::Denter;
>
> @ISA = qw(Bio::Seq Bio::Seq::RichSeqI);
>
>
> =head2 new
>
> Title : new
> Usage : $seq = Bio::Seq::RichSeq->new( -seq => 'ATGGGGGTGGTGGTACCCT',
> -id => 'human_id',
> -accession_number => 'AL000012',
> );
>
> Function: Returns a new seq object from
> basic constructors, being a string for the sequence
> and strings for id and accession_number
> Returns : a new Bio::Seq::RichSeq object
>
> =cut
>
> sub new {
> # standard new call..
> my($caller,@args) = @_;
> my $self = $caller->SUPER::new(@args);
>
> $self->{'_dates'} = [];
> $self->{'_secondary_accession'} = [];
> $self->{'_species'} = [];
>
> my ($dates, $xtra, $sv,
> $keywords, $pid, $mol,
> $division,$species ) = $self->_rearrange([qw(DATES
> SECONDARY_ACCESSIONS
> SEQ_VERSION
> KEYWORDS
> PID
> MOLECULE
> DIVISION
> SPECIES
> )],
> @args);
> defined $division && $self->division($division);
> defined $mol && $self->molecule($mol);
> defined $keywords && $self->keywords($keywords);
> defined $sv && $self->seq_version($sv);
> defined $pid && $self->pid($pid);
> #defined $pid && $self->species($pid);
>
> if( defined $dates ) {
> if( ref($dates) =~ /array/i ) {
> foreach ( @$dates) {
> $self->add_date($_);
> }
> } else {
> $self->add_date($dates);
> }
> }
>
> if( defined $species ) {
> if( ref($species) =~ /array/i ) {
> foreach ( @$species) {
> $self->add_species($_);
> }
> } else {
> $self->add_species($species);
> }
> }
>
>
> if( defined $xtra ) {
> if( ref($xtra) =~ /array/i ) {
> foreach ( @$xtra) {
> $self->add_secondary_accession($_);
> }
> } else {
> $self->add_secondary_accession($xtra);
> }
> }
>
> return $self;
> }
>
>
> =head2 division
>
> Title : division
> Usage : $obj->division($newval)
> Function:
> Returns : value of division
> Args : newvalue (optional)
>
>
> =cut
>
> sub division {
> my $obj = shift;
> if( @_ ) {
> my $value = shift;
> $obj->{'_division'} = $value;
> }
> return $obj->{'_division'};
>
> }
>
> =head2 molecule
>
> Title : molecule
> Usage : $obj->molecule($newval)
> Function:
> Returns : type of molecule (DNA, mRNA)
> Args : newvalue (optional)
>
>
> =cut
>
> sub molecule {
> my $obj = shift;
> if( @_ ) {
> my $value = shift;
> $obj->{'_molecule'} = $value;
> }
> return $obj->{'_molecule'};
>
> }
>
>
> =head2 add_species
>
> Title : add_species
> Usage : $self->add_species($species)
> Function: adds a species
> Example :
> Returns : an array of such strings
> Args :
>
>
> =cut
>
> sub add_species {
> my ($self,@species) = @_;
> foreach my $dt ( @species ) {
> push(@{$self->{'_species'}},$dt);
> }
> }
>
> =head2 get_species
>
> Title : get_species
> Usage :
> Function:
> Example :
> Returns : an array of strings
> Args :
>
>
> =cut
>
> sub get_species{
> my ($self) = @_;
> return @{$self->{'_species'}};
> }
>
>
> =head2 add_date
>
> Title : add_date
> Usage : $self->add_date($datestr)
> Function: adds a date
> Example :
> Returns : a date string or an array of such strings
> Args :
>
>
> =cut
>
>
>
> sub add_date {
> my ($self,@dates) = @_;
> foreach my $dt ( @dates ) {
> push(@{$self->{'_dates'}},$dt);
> }
> }
>
> =head2 get_dates
>
> Title : get_dates
> Usage :
> Function:
> Example :
> Returns : an array of date strings
> Args :
>
>
> =cut
>
> sub get_dates{
> my ($self) = @_;
> return @{$self->{'_dates'}};
> }
>
>
> =head2 pid
>
> Title : pid
> Usage :
> Function: Get (and set, depending on the implementation) the PID property
> for the sequence.
> Example :
> Returns : a string
> Args :
>
>
> =cut
>
> sub pid {
> my ($self,$pid) = @_;
>
> if(defined($pid)) {
> $self->{'_pid'} = $pid;
> }
> return $self->{'_pid'};
> }
>
>
> =head2 accession
>
> Title : accession
> Usage : $obj->accession($newval)
> Function: The underlying sequence object does not
> have an accession, so we need one here.
>
> In this implementation this is merely a synonym for
> accession_number().
> Example :
> Returns : value of accession
> Args : newvalue (optional)
>
>
> =cut
>
> sub accession {
> my ($obj,@args) = @_;
> return $obj->accession_number(@args);
> }
>
> =head2 add_secondary_accession
>
> Title : add_secondary_accession
> Usage : $self->add_secondary_accession($ref)
> Function: adds a secondary_accession
> Example :
> Returns :
> Args : a string or an array of strings
>
>
> =cut
>
> sub add_secondary_accession {
> my ($self) = shift;
> foreach my $dt ( @_ ) {
> push(@{$self->{'_secondary_accession'}},$dt);
> }
> }
>
> =head2 get_secondary_accessions
>
> Title : get_secondary_accessions
> Usage :
> Function:
> Example :
> Returns : An array of strings
> Args :
>
>
> =cut
>
> sub get_secondary_accessions{
> my ($self,@args) = @_;
> return @{$self->{'_secondary_accession'}};
> }
>
> =head2 seq_version
>
> Title : seq_version
> Usage : $obj->seq_version($newval)
> Function:
> Example :
> Returns : value of seq_version
> Args : newvalue (optional)
>
>
> =cut
>
> sub seq_version{
> my ($obj,$value) = @_;
> if( defined $value) {
> $obj->{'_seq_version'} = $value;
> }
> return $obj->{'_seq_version'};
>
> }
>
>
> =head2 keywords
>
> Title : keywords
> Usage : $obj->keywords($newval)
> Function:
> Returns : value of keywords (a string)
> Args : newvalue (optional) (a string)
>
>
> =cut
>
> sub keywords {
> my $obj = shift;
> if( @_ ) {
> my $value = shift;
> $obj->{'_keywords'} = $value;
> }
> return $obj->{'_keywords'};
>
> }
>
> #
> ##
> ### Deprecated methods kept for ease of transition
> ##
> #
>
> sub each_date {
> my ($self) = @_;
> $self->warn("Deprecated method... please use get_dates");
> return $self->get_dates;
> }
>
>
> sub each_secondary_accession {
> my ($self) = @_;
> $self->warn("each_secondary_accession - deprecated method. use get_secondary_accessions");
> return $self->get_secondary_accessions;
>
> }
>
> sub sv {
> my ($obj,$value) = @_;
> $obj->warn("sv - deprecated method. use seq_version");
> $obj->seq_version($value);
> }
>
>
> 1;
>
>
>
>
> Best regards
>
> Laure Durufle
>
>
>
>
--
______ _/ _/_____________________________________________________
_/ _/ http://www.ebi.ac.uk/mutations/
_/ _/ _/ Heikki Lehvaslaiho heikki at_ebi _ac _uk
_/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute
_/ _/ _/ Wellcome Trust Genome Campus, Hinxton
_/ _/ _/ Cambridge, CB10 1SD, United Kingdom
_/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________