[Bioperl-l] swiss prot
Heikki Lehvaslaiho
heikki@ebi.ac.uk
Thu, 12 Apr 2001 16:31:55 +0100
OK. They moved the dbfetch script over to the main site here at EBI.
Unfortunately, the main page has some typos (I have a constant and
loving relationship with typos, as I am sure you've noticed) and
some of the examples are not working properly. The Easter holidays are
here and I cannot get the fixes over to the main web site, so please
ignore those problems for now.
Have a look at:
http://www.ebi.ac.uk/cgi-bin/dbfetch
Sean's entries:
http://www.ebi.ac.uk/cgi-bin/dbfetch?db=swall&format=fasta&id=P00916,O39869
are both retrieved.
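
For anyone who wants to script this, here is a minimal sketch of how such a request URL could be put together. The base URL and the db, format and id parameter names are taken from the example above; dbfetch_url() is a hypothetical helper, not part of BioPerl or of the dbfetch script itself, and only the swall/fasta combination shown above is known to work.

```perl
use strict;
use warnings;

# Base URL as given in this message.
my $DBFETCH = 'http://www.ebi.ac.uk/cgi-bin/dbfetch';

# Hypothetical helper: build a dbfetch URL for one or more ids.
sub dbfetch_url {
    my ($db, $format, @ids) = @_;
    die "need at least one id\n" unless @ids;
    # Accept only plain word characters, so no URL escaping is needed here.
    for my $id (@ids) {
        die "unexpected characters in id '$id'\n" unless $id =~ /^\w+$/;
    }
    # Several ids go into one request, separated by commas.
    return sprintf '%s?db=%s&format=%s&id=%s',
        $DBFETCH, $db, $format, join(',', @ids);
}

print dbfetch_url('swall', 'fasta', 'P00916', 'O39869'), "\n";
```

The printed URL is the same one as in the example above; fetching it (with LWP or any other HTTP client) is left out on purpose.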
The next step is to get suggestions on which databases should be added to
the script, and to find volunteers to design storage objects and the
corresponding Bio::DB modules.
At the moment we have SWALL, PDB, Medline and Ensembl available.
I've updated Bio::DB::EMBL. It is up to Jason what he does with
Bio::DB::Swissprot.
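
On the Swissprot side, the retrieval failure discussed further down this thread comes from ID-line parsing: SWISS-PROT entry names look like NAME_DIVISION, while TrEMBL entries carry a bare accession. A rough sketch of a pattern that accepts both forms, assuming the ID-line layout quoted in the thread, follows. parse_id_line() is a hypothetical stand-alone helper, not the code in Bio::SeqIO::swiss, and the two sample ID lines are made up for illustration. Note that a class such as [\S_] also matches the underscore, so the sketch keeps the negated classes and simply makes the division part optional.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the entry name and division out of an ID line.
sub parse_id_line {
    my ($line) = @_;
    # Name, optional _DIVISION, then two ';'-terminated fields.
    return unless $line =~
        /^ID\s+([^\s_]+)(?:_([^\s_]+))?\s+([^\s;]+);\s+([^\s;]+);/;
    my ($base, $division) = ($1, $2);
    # Entries without a division (TrEMBL style) get 'UNK', as proposed below.
    my $name = defined $division ? "${base}_$division" : $base;
    return {
        name     => $name,
        division => defined $division ? $division : 'UNK',
    };
}

# Illustrative ID lines only, not copied from real database entries.
my @lines = (
    'ID   TEST_HUMAN     STANDARD;      PRT;   100 AA.',
    'ID   Q12345      PRELIMINARY;   PRT;   100 AA.',
);

for my $line (@lines) {
    my $id = parse_id_line($line) or next;
    print "$id->{name}\t$id->{division}\n";
}
```

Deducing the real division for TrEMBL entries (for instance from the OC lines) is not attempted here.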
Any takers for designing a Medline literature reference object?
-Heikki
Heikki Lehvaslaiho wrote:
>
> I think I have a solution, but it is not ready yet.
>
> Rodrigo was teasing me to make the emblfetch CGI script into a general
> dbfetch, and I took him seriously. 8-) The script is in the testing phase
> here at EBI. It offers an easy way to access any local SRS database.
> The database-specific parameters are kept in an easy-to-modify hash
> (it has to be modified within the script, for speed). I debugged it using
> EMBL, Medline (serves XML!), and Ensembl. It took me exactly one minute
> to add SWALL to it. SWALL is a weekly updated SWISS-PROT +
> SP-TrEMBL + TrEMBLnew.
>
> In a short while (a week or so, depending on how many bugs and feature
> changes others want to have before the release) we should be able to
> point Bio::DB::Swissprot to this script. I am going to distribute the
> dbfetch script so that hopefully most SRS maintainers install it and
> people can use the SRS server closest to them.
>
> -Heikki
>
> Jason Stajich wrote:
> >
> > This is a TrEMBL entry, not a SWISS-PROT one. <sigh>. The swiss format expects
> > ID_DIVISION in the ID line. There is no really good way to determine this on
> > the fly in Bio::DB::EMBL since we pass the stream to a SeqIO object.
> >
> > [sprot] http://www.expasy.org/cgi-bin/get-sprot-raw.pl?P00916
> > [TrEMBL] http://www.expasy.org/cgi-bin/get-sprot-raw.pl?O39869
> >
> > Bioperl: here is my fix - please let me know if you think this is
> > acceptable and I'll submit the fix.
> >
> > I am assigning division to UNK for the TrEMBL entry even though we could
> > probably deduce it from OC lines - I don't want to deal with that right
> > now... (the character classes stay as [^\s_]: [\S_] would also match the
> > underscore, so the optional division group would never capture).
> >
> > RCS file: /home/repository/bioperl/bioperl-live/Bio/SeqIO/swiss.pm,v
> > retrieving revision 1.36
> > diff -r1.36 swiss.pm
> > 153c153
> > < $line =~ /^ID\s+([^\s_]+)_([^\s_]+)\s+([^\s;]+);\s+([^\s;]+);/
> > ---
> > > $line =~ /^ID\s+([^\s_]+)(?:_([^\s_]+))?\s+([^\s;]+);\s+([^\s;]+);/
> > 155c155,161
> > < $name = $1."_".$2;
> > ---
> > > if( $2 ) {
> > > $name = $1."_".$2;
> > > $seq->division($2);
> > > } else {
> > > $name = $1;
> > > $seq->division('UNK');
> > > }
> > 157d162
> > < $seq->division($2);
> >
> > On Tue, 10 Apr 2001, Xiangyun Wang wrote:
> >
> > > Hi,
> > >
> > > I am using the Bio::DB::SwissProt module to retrieve protein sequences
> > > by their IDs.
> > >
> > > But some proteins (such as Q9EPU5) can't be retrieved.
> > >
> > > What's the problem here?
> > >
> > > Thanks
> > > Sean
> > >
> > > _______________________________________________
> > > Bioperl-l mailing list
> > > Bioperl-l@bioperl.org
> > > http://bioperl.org/mailman/listinfo/bioperl-l
> > >
> >
> > Jason Stajich
> > jason@chg.mc.duke.edu
> > Center for Human Genetics
> > Duke University Medical Center
> > http://www.chg.duke.edu/
>
> --
> ______ _/ _/_____________________________________________________
> _/ _/ http://www.ebi.ac.uk/mutations/
> _/ _/ _/ Heikki Lehvaslaiho heikki@ebi.ac.uk
> _/_/_/_/_/ EMBL Outstation, European Bioinformatics Institute
> _/ _/ _/ Wellcome Trust Genome Campus, Hinxton
> _/ _/ _/ Cambs. CB10 1SD, United Kingdom
> _/ Phone: +44 (0)1223 494 644 FAX: +44 (0)1223 494 468
> ___ _/_/_/_/_/________________________________________________________