[Bioperl-l] How sequence fetching should fail?

Heikki Lehvaslaiho heikki at ebi.ac.uk
Mon Apr 5 07:55:00 EDT 2004


Last week Web Barris asked more questions about sequence retrieval.
I had a look at how the different modules behave when retrieval fails because 
of a nonexistent id. The responses can be summarised as follows:

Bio::DB::BioFetch    WARNING
Bio::DB::GenBank     WARNING
Bio::DB::GenPept     WARNING
Bio::DB::SwissProt   EXCEPTION
Bio::DB::RefSeq      WARNING
Bio::DB::EMBL        EXCEPTION

I suggest that we treat this inconsistency as a bug that needs to be fixed in 
both development cvs head and in the 1.4 branch. All modules should print a 
warning (rather than die on an error) and return undef when retrieval fails. 
It is then up to the user to test whether the sequence variable got assigned. 
This is the behaviour defined in the OBDA (Open Bioinformatics Database 
Access) specs and implemented in Bio::DB::BioFetch.


The user code will always look something like this:

my $db = Bio::DB::SeqRetrievalClass->new;
for my $id (@ids) {
	# returns undef (after a warning) when the id does not exist
	my $seq = $db->get_Seq_by_id($id);
	if ($seq) {
		# do what you wanted
	} else {
		# skip and keep log
	}
}
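
For anyone who wants to try the proposed behaviour without touching the 
network, here is a minimal sketch of the module side. My::MockSeqDB and its 
in-memory records are made up for illustration only; this is not how any real 
Bio::DB::* module is implemented, it just mimics "warn and return undef":

```perl
use strict;
use warnings;

# Hypothetical stand-in for a Bio::DB::* retrieval class (not part of BioPerl).
package My::MockSeqDB;

sub new {
    my ($class, %records) = @_;
    return bless { records => {%records} }, $class;
}

# Proposed behaviour: warn and return undef for an unknown id
# instead of throwing an exception.
sub get_Seq_by_id {
    my ($self, $id) = @_;
    unless (exists $self->{records}{$id}) {
        warn "id [$id] does not exist\n";
        return;    # bare return gives undef in scalar context
    }
    return $self->{records}{$id};
}

package main;

my $db = My::MockSeqDB->new(A1 => 'ACGT', B2 => 'TTGA');
my (@found, @missing);
for my $id (qw(A1 NOPE B2)) {
    my $seq = $db->get_Seq_by_id($id);
    if (defined $seq) {
        push @found, $id;      # do what you wanted
    } else {
        push @missing, $id;    # skip and keep log
    }
}
print "found: @found; missing: @missing\n";
```

The loop keeps going past the bad id, which is the point: one missing entry 
should not kill a batch retrieval.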

Unless I hear any strong differing opinions within a day or two, I'll commit 
the necessary changes. The critical question here is: will this break any 
existing code?


	-Heikki


-- 
______ _/      _/_____________________________________________________
      _/      _/                      http://www.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho    heikki_at_ebi ac uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambs. CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________

