[Biopython] Problem with parsing strand in Homo_sapiens.GRCh37.68 genbank files
Susan Wilson
smwilson at hpc.unm.edu
Tue Aug 14 14:54:24 UTC 2012
Hi Peter,
Thanks for the quick response. I downloaded the files from
ftp://ftp.ensembl.org/pub/release-68/genbank/homo_sapiens/. I have version
1.53 of Biopython. Maybe I should try 1.6? Here are some diagnostics:
$ head Homo_sapiens.GRCh37.68.chromosome.1.dat
LOCUS 1 249250621 bp DNA HTG 14-JUL-2012
DEFINITION Homo sapiens chromosome 1 GRCh37 full sequence 1..249250621
reannotated via EnsEMBL
ACCESSION chromosome:GRCh37:1:1:249250621:1
VERSION 1GRCh37
KEYWORDS .
SOURCE human
ORGANISM Homo sapiens
.
COMMENT This sequence was annotated by the Ensembl system. Please visit the
Output from ipython:
import sys
sys.version_info
Out[3]: (2, 6, 5, 'final', 0)
sys.version
Out[4]: '2.6.5 (r265:79063, Apr 16 2010, 13:57:41) \n[GCC 4.4.3]'
import Bio
print Bio.__version__
1.53
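As a cross-check that does not depend on the Biopython version, something like the rough sketch below could count how many feature locations in the raw file are wrapped in complement(...). The function name count_feature_strands is made up for illustration, and the parsing assumes the usual GenBank flat-file layout (feature key starting in column 6, location starting in column 22). A join(...) that mixes strands would be counted as plus here, so treat the numbers as approximate.

```python
def count_feature_strands(handle):
    """Count feature locations by strand in a GenBank flat file.

    Hypothetical helper, not part of Biopython. A location starting
    with "complement(" counts as minus; anything else counts as plus.
    """
    counts = {"plus": 0, "minus": 0}
    in_features = False
    for line in handle:
        if line.startswith("FEATURES"):
            in_features = True
            continue
        if not in_features:
            continue
        # Any other keyword in column 1 (e.g. ORIGIN, //) ends the table.
        if line[:1] not in (" ", "\n"):
            in_features = False
            continue
        # Feature key lines: five spaces, then the key in column 6.
        # Qualifier and continuation lines have a space in column 6.
        if len(line) > 21 and line[:5] == "     " and line[5] != " ":
            location = line[21:].strip()
            if location.startswith("complement("):
                counts["minus"] += 1
            else:
                counts["plus"] += 1
    return counts


# Possible usage (the filename is the one from this thread):
# with open("Homo_sapiens.GRCh37.68.chromosome.1.dat") as handle:
#     print(count_feature_strands(handle))
```

If the raw file shows plenty of plus-strand locations but the parsed features do not, that would point at the parser rather than the data.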
On 08/14/2012 08:46 AM, Peter Cock wrote:
> On Tue, Aug 14, 2012 at 3:10 PM, Susan Wilson <smwilson at hpc.unm.edu> wrote:
>> Hi,
>>
>> I am parsing the gb files with biopython. My problem is that none of the
>> seqfeature.strand values are returning the plus strand (value == 1).
> That should happen with a protein sequence.
>
>> The commands below are a bit fabricated. (For instance, I have left out the
>> opening and closing of fout.) I have read in
>> Homo_sapiens.GRCh37.68.chromosome.1.dat using SeqIO.read.
> What URL are you getting that file from?
>
> Which version of Biopython are you using? There were some strand
> related changes recently (internally moving it from the SeqFeature to
> the SeqFeature's location object).
>
> Thanks,
>
> Peter
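Since the strand apparently moved from the SeqFeature to its location object between releases, a small version-tolerant accessor might help while comparing results across Biopython versions. This is only a sketch: get_strand is a hypothetical helper, and it assumes that newer releases expose feature.location.strand while older ones such as 1.53 expose only feature.strand.

```python
def get_strand(feature):
    """Return the strand (+1, -1, 0 or None) of a SeqFeature-like object.

    Hypothetical helper: tries the location's strand first (assumed
    newer layout), then falls back to the feature's own strand
    attribute (assumed older layout).
    """
    location = getattr(feature, "location", None)
    strand = getattr(location, "strand", None)
    if strand is None:
        strand = getattr(feature, "strand", None)
    return strand


# Possible usage, assuming Biopython is installed:
# from Bio import SeqIO
# with open("Homo_sapiens.GRCh37.68.chromosome.1.dat") as handle:
#     record = SeqIO.read(handle, "genbank")
# plus = [f for f in record.features if get_strand(f) == 1]
# print(len(plus))
```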