[Biopython] Generating a fasta file from atomic coordinate file

Peter Cock p.j.a.cock at googlemail.com
Tue Mar 20 23:23:09 UTC 2018


João's example below uses the 3D structure (the atomic coordinates)
to get the protein sequence, via the PDB parser, which needs NumPy
as a dependency. The code to call it can be shortened to just:

from Bio import SeqIO
SeqIO.convert("input.pdb", "pdb-atom", "output.fasta", "fasta")

Or, if you just want the sequence in the SEQRES header:

from Bio import SeqIO
SeqIO.convert("input.pdb", "pdb-seqres", "output.fasta", "fasta")
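
To make the difference between the two formats concrete, here is a rough
standard-library sketch of what "sequence from the SEQRES header" versus
"sequence from the ATOM coordinates" means. This is not how Biopython
implements it; the function names and the tiny SAMPLE_PDB are made up for
illustration, and it assumes fixed-column PDB records, only the 20 standard
amino acids, and only the first model. A real parser handles many more
cases (gaps, modified residues, alternate locations and so on).

```python
# Hypothetical helper names; a sketch only, not Biopython's implementation.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def seqres_sequences(pdb_lines):
    """Return {chain_id: sequence} taken from the SEQRES header records."""
    chains = {}
    for line in pdb_lines:
        if line.startswith("SEQRES"):
            chain_id = line[11]            # column 12: chain identifier
            names = line[19:].split()      # columns 20 onwards: residue names
            chains.setdefault(chain_id, []).extend(
                THREE_TO_ONE.get(name, "X") for name in names
            )
    return {chain: "".join(seq) for chain, seq in chains.items()}


def atom_sequences(pdb_lines):
    """Return {chain_id: sequence} for residues that have ATOM coordinates."""
    chains = {}
    seen = set()
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break                          # only look at the first model
        if not line.startswith("ATOM"):
            continue
        chain_id = line[21]                # column 22: chain identifier
        key = (chain_id, line[22:27])      # residue number + insertion code
        if key in seen:
            continue                       # count each residue once
        seen.add(key)
        chains.setdefault(chain_id, []).append(
            THREE_TO_ONE.get(line[17:20], "X")
        )
    return {chain: "".join(seq) for chain, seq in chains.items()}


def to_fasta(name, sequences):
    """Format {chain_id: sequence} as FASTA text, one record per chain."""
    return "".join(
        ">%s:%s\n%s\n" % (name, chain, seq)
        for chain, seq in sorted(sequences.items())
    )


# Made-up five-residue chain; residue 3 (GLY) has no coordinates.
SAMPLE_PDB = """\
SEQRES   1 A    5  MET ALA GLY SER LYS
ATOM      1  N   MET A   1      10.000  10.000  10.000  1.00  0.00           N
ATOM      2  CA  MET A   1      11.000  10.000  10.000  1.00  0.00           C
ATOM      3  CA  ALA A   2      14.000  10.000  10.000  1.00  0.00           C
ATOM      4  CA  SER A   4      20.000  10.000  10.000  1.00  0.00           C
ATOM      5  CA  LYS A   5      23.000  10.000  10.000  1.00  0.00           C
END
""".splitlines()

if __name__ == "__main__":
    print(to_fasta("sample-seqres", seqres_sequences(SAMPLE_PDB)), end="")
    print(to_fasta("sample-atom", atom_sequences(SAMPLE_PDB)), end="")
```

In this toy input the SEQRES record lists all five residues, while the
coordinate-based sequence only contains the four that were actually
modelled, which is the practical reason to pick one format or the other.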

See:

http://biopython.org/wiki/SeqIO

Peter

On Tue, Mar 20, 2018 at 10:05 PM, João Rodrigues
<j.p.g.l.m.rodrigues at gmail.com> wrote:
> Hi Ahmad,
>
> You can use Bio.SeqIO directly on the PDB file:
>
> from Bio import SeqIO
> records = SeqIO.parse('1xyz.pdb', 'pdb-atom')
> with open('1xyz.fasta', 'w') as handle:
>     SeqIO.write(records, handle, "fasta")
>
> Not sure if there is a way to couple SeqIO directly to the Bio.PDB code (a
> method that reads the sequence from the SMCRA object); that would
> be cool to add.
>
> Cheers,
>
> João
>
> 2018-03-20 12:47 GMT-07:00 Jared Adolf-Bryfogle <jadolfbr at gmail.com>:
>>
>> Hey Ahmad,
>>
>> I have a script called get_seq.py in the bio-jade module, which uses
>> Biopython. To install it:
>>
>> pip install bio-jade
>>
>> The script will be installed to your path and you can use get_seq.py
>> --help for more info. You may need to source your bashrc/profile afterward
>> or open a new terminal to see it.
>>
>> If you have any issues, please let me know. I may need to put out a new
>> version.
>>
>>
>> https://bio-jade.readthedocs.io/en/latest/apps_public_api/apps.public.general.html#get-seq-py
>>
>> https://github.com/SchiefLab/Jade
>>
>> -Jared
>>
>>
>> Jared Adolf-Bryfogle, Ph.D.
>> Research Associate
>> Lab of Dr. William Schief
>> The Scripps Research Institute
>>
>> On Tue, Mar 20, 2018 at 1:42 PM, Ahmad Abdelzaher <underoath006 at gmail.com>
>> wrote:
>>>
>>> Hello,
>>>
>>> Can I generate a properly formatted fasta sequence from the atomic
>>> coordinates of a pdb file? I sort of know how to code it, but hopefully
>>> there's some ready method in one of Biopython's modules that can do that.
>>>
>>> Any other suggestions?
>>>
>>> Regards.
>>>
>>> _______________________________________________
>>> Biopython mailing list  -  Biopython at mailman.open-bio.org
>>> http://mailman.open-bio.org/mailman/listinfo/biopython


