[Biopython-dev] [Biopython (old issues only) - Bug #2819] (Migrated) Bio.SeqIO support for NCBI protein tables (*.ptt files)

redmine at redmine.open-bio.org redmine at redmine.open-bio.org
Thu Jul 5 14:39:26 UTC 2018


Issue #2819 has been updated by Peter Cock.

Description updated
Status changed from New to Migrated
URL set to https://github.com/biopython/biopython/issues/1725

Migrated to GitHub as https://github.com/biopython/biopython/issues/1725

----------------------------------------
Bug #2819: Bio.SeqIO support for NCBI protein tables (*.ptt files)
https://redmine.open-bio.org/issues/2819#change-15419

* Author: Peter Cock
* Status: Migrated
* Priority: Normal
* Assignee: Biopython Dev Mailing List
* Category: Main Distribution
* Target version: Not Applicable
* URL: https://github.com/biopython/biopython/issues/1725
----------------------------------------
On their FTP site the NCBI provide a range of files for each genome/plasmid/chromosome, e.g.
ftp://ftp.ncbi.nih.gov/genomes/Protozoa/Cryptosporidium_parvum/

The *.ptt files are simple tab-separated tables listing all the proteins.  They correspond to the CDS features in the GenBank file.

This enhancement bug is about adding "ptt" as an input file format in Bio.SeqIO (and potentially as an output format too), where a single ptt file gives a single SeqRecord object containing one SeqFeature object per protein.  The header line gives the sequence length, so an UnknownSeq can be used for the SeqRecord's seq property.
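
To illustrate the parsing step, here is a minimal standard-library sketch.  It is not the attached ProteinTableIO.py and not a Bio.SeqIO API.  It assumes the layout usually seen in ptt files: a title line ending in " - 1..LENGTH", a protein count line, a tab-separated column header (Location, Strand, Length, PID, Gene, Synonym, Code, COG, Product), then one row per protein.  The names parse_ptt and ProteinRow are hypothetical.

```python
import io
import re
from typing import List, NamedTuple, TextIO, Tuple


class ProteinRow(NamedTuple):
    """One protein row from a ptt table (hypothetical helper type)."""
    start: int    # 0-based start, Python slice style
    end: int      # end position, exclusive in slice terms
    strand: int   # +1 or -1
    pid: str
    gene: str
    product: str


def parse_ptt(handle: TextIO) -> Tuple[str, int, List[ProteinRow]]:
    """Return (description, sequence_length, rows) from a ptt file handle.

    Assumes the three-line header described in the lead-in above.
    """
    title = handle.readline().rstrip("\n")
    match = re.search(r"^(.*?)\s+-\s+(\d+)\.\.(\d+)\s*$", title)
    if match is None:
        raise ValueError("Title line does not end in ' - start..end'")
    description = match.group(1)
    seq_length = int(match.group(3))

    handle.readline()  # skip the "N proteins" count line
    columns = handle.readline().rstrip("\n").split("\t")

    rows = []
    for line in handle:
        if not line.strip():
            continue  # tolerate blank lines
        fields = dict(zip(columns, line.rstrip("\n").split("\t")))
        start, end = fields["Location"].split("..")
        rows.append(
            ProteinRow(
                start=int(start) - 1,  # convert 1-based to 0-based
                end=int(end),
                strand=1 if fields["Strand"] == "+" else -1,
                pid=fields["PID"],
                gene=fields["Gene"],
                product=fields["Product"],
            )
        )
    return description, seq_length, rows
```

Each row could then be turned into a SeqFeature of type "CDS" using FeatureLocation(start, end, strand=strand), with the parsed length used for the placeholder sequence.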

One example application of this would be to draw a GenomeDiagram showing the protein locations.  This can be done using the SeqFeature objects from parsing a GenBank file, but using the ptt file will be much faster.

See earlier suggestions on the mailing list (part of the GFF thread):
http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005725.html
http://lists.open-bio.org/pipermail/biopython-dev/2009-April/005745.html

Patch to follow...

---Files--------------------------------
ProteinTableIO.py (8.12 KB)
add_ptt.patch (486 Bytes)
test_ptt.patch (957 Bytes)

