[Biopython] Any CATHDB users (protein domain database)?

Wed Jun 21 13:38:34 UTC 2017

On 21/06/2017 at 15:13, Peter Cock wrote:
> Thanks Stéphane,
> 
> That would be much appreciated - and thank you for your patience Saket,
> 
> Peter

OK, so far it seems to work. Here is my small test, which also serves as a short how-to:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#!/bin/bash

# http://mailman.open-bio.org/pipermail/biopython-dev/2017-May/021706.html

git clone https://github.com/biopython/biopython.git
cd biopython
git fetch origin pull/1258/head
git checkout -b pullrequest FETCH_HEAD
sudo python setup.py install --prefix=/opt/biopython-dev

export PYTHONPATH=/opt/biopython-dev

python
import sys
sys.path.append('/opt/biopython-dev/lib/python2.7/site-packages/')

# Simple example for Bcl-XL
# Uniprot: http://www.uniprot.org/uniprot/Q07817
# CATH: http://www.cathdb.info/version/v4_1_0/superfamily/1.10.437.10
# PFAM: http://pfam.xfam.org/protein/Q07817
# Inhibitors: https://en.wikipedia.org/wiki/Bcl-2#Targeted_therapies
# Family: https://bcl2db.ibcp.fr/BCL2DB/

from Bio.Seq import Seq
my_seq = Seq("MSQSNRELVVDFLSYKLSQKGYSWSQFSDVEENRTEAPEGTESEMETPSAINGNPSWHLADSPAVNGATGHSSSLDAREVIPMAAVKQALREAGDEFELRYRRAFSDLTSQLHITPGTAYQSFEQVVNELFRDGVNWGRIVAFFSFGGALCVESVDKEMQVLVSRIAAWMATYLNDHLEPWIQENGGWDTFVELYGNNAAAESRKGQERFNRWFLTGMTVAGVVLLGSLFSRK")

from Bio.cathdb import search_by_sequence, check_progress, retrieve_results
q=search_by_sequence(my_seq)
check_progress(q)
{u'message': u'done', u'data': {u'status': u'done',
 u'date_started': u'2017-06-21T13:09:00',
 u'date_completed': u'2017-06-21T13:09:03',
 u'worker_hostname': u'mothra.biochem.ucl.ac.uk',
 u'id': u'50a6e3fc11a0f917023e43f8c86c2c75'}, u'success': 1}
r=retrieve_results(q)

print r['cath_version']
4.1.0
print r['query_fasta']
>QUERY
MSQSNRELVVDFLSYKLSQKGYSWSQFSDVEENRTEAPEGTESEMETPSAINGNPSWHLADSPAVNGATGHSSSLDAREVIPMAAVKQALREAGDEFELRYRRAFSDLTSQLHITPGTAYQSFEQVVNELFRDGVNWGRIVAFFSFGGALCVESVDKEMQVLVSRIAAWMATYLNDHLEPWIQENGGWDTFVELYGNNAAAESRKGQERFNRWFLTGMTVAGVVLLGSLFSRK

print r['funfam_scan']

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
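
As a side note, the check_progress() output above suggests that a submitted search can be polled until its status becomes 'done'. Here is a minimal sketch of such a loop; it is not part of the pull request. wait_until_done and make_fake_check are hypothetical helpers, and the 'queued' and 'running' status values are assumptions (only 'done' appears in the output above). The fake checker stands in for check_progress() so that the snippet runs without network access.

```python
import time


def wait_until_done(check, task, interval=2.0, timeout=60.0, sleep=time.sleep):
    """Poll check(task) until the reported status is 'done'.

    `check` is any callable returning a dict shaped like the
    check_progress() output: {'data': {'status': ...}, ...}.
    """
    waited = 0.0
    while True:
        response = check(task)
        status = response.get('data', {}).get('status')
        if status == 'done':
            return response
        if waited >= timeout:
            raise RuntimeError('task not finished after %.0f s (last status: %r)'
                               % (waited, status))
        sleep(interval)
        waited += interval


def make_fake_check(statuses):
    """Hypothetical stand-in for check_progress(); replays the given statuses."""
    remaining = list(statuses)

    def fake_check(task):
        # Consume statuses one by one, then keep repeating the last one.
        status = remaining.pop(0) if len(remaining) > 1 else remaining[0]
        return {'message': status,
                'data': {'status': status, 'id': task},
                'success': 1}

    return fake_check


# 'queued' and 'running' are assumed intermediate states, for illustration only.
fake = make_fake_check(['queued', 'running', 'done'])
result = wait_until_done(fake, 'example-task-id', interval=0.0)
```

With the module from the pull request, the call would presumably be wait_until_done(check_progress, q), assuming check_progress() always returns a dict of the shape shown above.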

I can provide more feedback. Would it be better to post it on the pull request?

Best,

Stéphane

-- 
Assistant Professor in BioInformatics, UFIP, UMR 6286 CNRS, Team Protein 
Design In Silico
UFR Sciences et Techniques, 2, rue de la Houssinière, Bât. 25, 44322 
Nantes cedex 03, France
Tél : +33 251 125 636 / Fax : +33 251 125 632
http://www.ufip.univ-nantes.fr/ - http://www.steletch.org