[Bioperl-l] From PDB to get nucleotide sequences for all related genomes?
Jason Stajich
jason.stajich at duke.edu
Tue Oct 12 13:51:12 EDT 2004
Well, given the protein's accession you can fetch the protein record
with Bio::DB::GenPept, then parse the record looking for the CDS
feature and grab its 'coded_by' tag. That gives you another accession
number (plus a location) for the CDS sequence; use Bio::DB::GenBank to
fetch that nucleotide record.
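
A rough sketch of the string handling around these steps, in Python
rather than Perl and with no network access: pulling the protein IDs
out of DBREF lines, then splitting a 'coded_by' value into the
nucleotide accession and the range to cut out. The fetching itself
would still go through Bio::DB::GenPept and Bio::DB::GenBank as
described above. Both function names are hypothetical, the
whitespace-split DBREF layout is assumed from the lines quoted below
(real PDB files are fixed-column), and only simple start..end segments
are handled.

```python
import re

# One "ACCESSION[.VERSION]:start..end" segment; '<' and '>' mark partial ends.
SEGMENT = re.compile(r"([A-Za-z_0-9]+(?:\.\d+)?):<?(\d+)\.\.>?(\d+)")


def protein_ids_from_dbref(lines):
    """Collect unique (database, accession, id_code) triples from DBREF lines.

    Assumes whitespace-separated fields as in the quoted examples:
    DBREF idcode chain begin end database accession id_code db_begin db_end
    """
    found = []
    for line in lines:
        fields = line.split()
        if len(fields) < 8 or fields[0] != "DBREF":
            continue
        database, accession, id_code = fields[5], fields[6], fields[7]
        if database == "PDB":
            # Self-reference to the PDB entry: no external protein record.
            continue
        entry = (database, accession, id_code)
        if entry not in found:
            found.append(entry)
    return found


def parse_coded_by(value):
    """Split a coded_by value into (accession, [(start, end), ...], is_complement)."""
    segments = SEGMENT.findall(value)
    if not segments:
        raise ValueError(f"unrecognised coded_by value: {value!r}")
    accessions = {acc for acc, _, _ in segments}
    if len(accessions) != 1:
        raise ValueError("segments refer to more than one accession")
    ranges = [(int(start), int(end)) for _, start, end in segments]
    return accessions.pop(), ranges, value.startswith("complement(")
```

For instance, parse_coded_by("NM_000000.1:1..300") (a made-up value, not
taken from a real record) returns the accession to request from the
nucleotide database together with the coordinates of the CDS within it.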
Partial examples of stuff like this are in my tutorials:
http://jason.open-bio.org/Bioperl_Tutorials/
For example, this one does some work to get the CDS for a protein
based on a SwissProt ID:
http://jason.open-bio.org/Bioperl_Tutorials/Duke_2004/BioperlProjects.pdf
My recollection is that we've answered some or all of these questions
on the mailing list several times. A search like this gets you started,
though:
http://www.google.com/search?q=site:bioperl.org+%2Bpipermail+%2Bbioperl-l+%2BCDS&ie=UTF-8&oe=UTF-8
We really need people to volunteer to help bioperl by writing up these
questions and their solutions in the FAQ or in standalone HOWTOs.
(This is hopefully the part where those who don't feel qualified as
"gurus" but want to help should be raising their hands...)
-jason
On Oct 12, 2004, at 10:03 AM, 최상철 wrote:
> Dear Bioperl Guru:
>
> I'm Sang Chul Choi, a graduate student in the bioinformatics program
> at NCSU. I'm interested in protein evolution modeling, and I need to
> apply a model to all PDB entries. The problem is that I am stuck on
> getting the nucleotide sequences of all related genomes for each PDB
> entry.
>
> There is "DBREF" section in PDB like this:
> ./pdb1t7s.ent
> DBREF 1T7S A 74 210 GB 17507755 NP_491893 74 210
> DBREF 1T7S B 74 210 GB 17507755 NP_491893 74 210
> =====================================================
> ./pdb1t9f.ent
> DBREF 1T9F A 22 206 GB 17508635 NP_491320 22 206
> =====================================================
> ./pdb1tc3.ent
> DBREF 1TC3 A 1 21 PDB 1TC3 1TC3 1 21
> DBREF 1TC3 B 101 120 PDB 1TC3 1TC3 101 120
> DBREF 1TC3 C 202 252 GB 1086778 P34257 2 52
>
> And I know that there is a source organism section.
>
> Using these two kinds of information, I have tried to get nucleotide
> sequences from databases: NCBI, SWISSPROT, ...
>
> Do you have any good suggestions for doing this? Any comments would
> be helpful.
>
> Thanks,
>
> Sang Chul
--
Jason Stajich
jason.stajich at duke.edu
http://www.duke.edu/~jes12/