[Bioperl-l] Bio::SeqIO::embl problems reading swissprot clarification

Murad Nayal murad@godel.bioc.columbia.edu
Mon, 11 Dec 2000 21:43:05 +0100


It seems that I was a bit too quick to post. Looking more carefully now,
I see my previous message was somewhat inaccurate (sorry about that, I
am just starting to get familiar with bioperl).

To clarify:

-I used Bio::Index::EMBL to index a swissprot file and subsequently
retrieve sequences from it by id. The documentation for this class
specifies that it can be used to index both embl and swissprot files.
There is no Bio::Index::Swiss class.

-The problem I encountered arose when I tried to retrieve sequences
using a Bio::Index::EMBL object from the previously indexed file. This
object uses a Bio::SeqIO::embl object to read from the sequence file,
which fails with multiple errors/warnings.

Apparently the thing to do is to write a Bio::Index::Swiss class, which
really can be highly similar to Bio::Index::EMBL. As far as I can see, a
subclass of Bio::Index::EMBL that overrides EMBL::_file_format() to get
it to return "swiss" might be sufficient? This should lead to the usage
of Bio::SeqIO::swiss for input instead (no need to modify
Bio::SeqIO::embl).
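
A minimal sketch of the override idea follows. It is only an
illustration, not the actual bioperl code: Demo::Index::EMBL is a
hypothetical stand-in for Bio::Index::EMBL that models nothing but the
_file_format() hook, and parser_class() is a made-up helper that shows
how the returned format name could select the reader. Only
_file_format() and the "swiss" return value come from the description
above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::Index::EMBL. It models only the
# _file_format() hook; the real class does far more.
package Demo::Index::EMBL;

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

# Format name handed to the sequence reader.
sub _file_format { return 'embl'; }

# Made-up helper (an assumption, not a bioperl method): builds the
# name of the parser class implied by the format.
sub parser_class {
    my ($self) = @_;
    return 'Bio::SeqIO::' . $self->_file_format();
}

# The proposed subclass: inherit everything, override only the format.
package Demo::Index::Swiss;
our @ISA = ('Demo::Index::EMBL');

sub _file_format { return 'swiss'; }

package main;

my $embl  = Demo::Index::EMBL->new();
my $swiss = Demo::Index::Swiss->new();

print $embl->parser_class(),  "\n";
print $swiss->parser_class(), "\n";
```

In a real installation the parent would presumably be Bio::Index::EMBL
itself, and the subclass would need nothing beyond the one overridden
method, provided the index class really does consult _file_format()
when it opens the sequence file.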

-Has something like that been done in 0.7?

Again, of course, I might be missing something! I appreciate all your
comments/corrections.

Regards



Murad Nayal wrote:
> 
> Hello All,
> 
> It seems that Bio::SeqIO::embl is having some problems reading the
> swissprot.dat file (bioperl 0.6.2 and swissprot 38). For example, it
> does not match alternative formats for the DR record, which leads it to
> not instantiate the corresponding DBLink object and occasionally crash.
> I have started fixing some of this stuff but I thought I'd check with
> the list first.
> 
> -Is Bio::SeqIO::embl 'supposed' to be able to read swissprot? Or is
> there another implementation of SeqIO to do that (which I couldn't find
> in the 0.6.2 distribution)?
> 
> -Have these bugs been reported/fixed already in the 0.7 distribution?
> 
> -When will 0.7 be available? (Is read access to CVS available now to
> everyone?)
> 
> PS: I should mention that I am encountering these problems when
> accessing the .dat file via an Index!
> 
> Regards
> 
> --
> Murad Nayal M.D. Ph.D.
> Department of Biochemistry and Molecular Biophysics
> College of Physicians and Surgeons of Columbia University
> 630 West 168th Street. New York, NY 10032
> Tel: 212-305-6884       Fax: 212-305-6926

-- 
Murad Nayal M.D. Ph.D.
Department of Biochemistry and Molecular Biophysics
College of Physicians and Surgeons of Columbia University
630 West 168th Street. New York, NY 10032
Tel: 212-305-6884	Fax: 212-305-6926