[Biopython] Dealing with Non-RefSeq IDs / InParanoid

Peter biopython at maubp.freeserve.co.uk
Sun Jun 21 10:34:37 UTC 2009


On Sun, Jun 21, 2009 at 2:54 AM, Matthew Strand<stran104 at chapman.edu> wrote:
> I have 3 questions:
> 1. Has anyone had success using BioPython with InParanoid? Perhaps someone
> has a nice wrapper class to share? :-)

I haven't, sorry.

> 2. Can you convert from RefSeq --> Publishing database ID (FlyBase,
> WormBase, Ensembl)? Sometimes the original ID is available in the /db_xref
> section of an Entrez report, but not always.

I would have a read of the NCBI Entrez documentation, as I suspect
this might let you map from their ID to external IDs.

> 3. Is there a way to retrieve a sequence given an ID from the original
> database without writing wrappers for every database?
> (e.g. WormBase CE23997, FlyBase FBpp0149695, Ensembl
> ENSCINP00000014675)

Find an online meta-database to do this for you? Places like EMBL
and the NCBI are used to this kind of cross linking...

I have found NCBI Entrez EFetch understands several other identifiers
(e.g. SwissProt/UniProt IDs), but not all. As I recall it didn't seem to
like expired SwissProt/UniProt IDs, although going to the SwissProt/UniProt
website manually you can find out the new replacement ID.
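
As a rough illustration of the "try EFetch and see which identifiers it
accepts" approach, here is a minimal sketch using Bio.Entrez. The helper
names fetch_fasta and try_identifiers are hypothetical, and the choice of
db="protein" with FASTA output is an assumption. Nothing here guarantees
that any particular external ID will resolve; the point is only to separate
the IDs NCBI recognises from those that need a manual lookup.

```python
def fetch_fasta(identifier, email, db="protein"):
    """Fetch one record as FASTA text via NCBI Entrez EFetch (needs network)."""
    # Imported here so try_identifiers() below can be used without Biopython.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks for a contact address
    handle = Entrez.efetch(db=db, id=identifier, rettype="fasta", retmode="text")
    try:
        return handle.read()
    finally:
        handle.close()


def try_identifiers(identifiers, fetch):
    """Try each ID with fetch(); return (found, failed) dictionaries.

    found maps ID -> FASTA text, failed maps ID -> a short reason.
    """
    found, failed = {}, {}
    for identifier in identifiers:
        try:
            text = fetch(identifier)
        except (OSError, ValueError) as err:
            # HTTP errors from EFetch are OSError subclasses (urllib HTTPError).
            failed[identifier] = str(err)
            continue
        if text.lstrip().startswith(">"):
            found[identifier] = text
        else:
            failed[identifier] = "no FASTA record returned"
    return found, failed


# Example use (placeholder email address; IDs taken from the question above):
# ids = ["CE23997", "FBpp0149695", "ENSCINP00000014675"]
# found, failed = try_identifiers(
#     ids, lambda i: fetch_fasta(i, "you@example.org"))
```

Anything that ends up in failed would then need a lookup at the original
database or a cross-reference service, as suggested above.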

Peter


