[BioPython] Fasta parser, minor (bug/feature?)
Peter Wilkinson
pwilkinson_m at xbioinformatics.org
Wed Aug 24 16:43:15 EDT 2005
It seems that the FASTA parser retains the OS-specific line endings when it
stores the title and sequence in the Record object. So when I read a file
while working in Windows (eeeeek) and then view the output in a true text
editor like Context, I have to write something like this:
file_out.write(str(cur_record).replace('\r',''))
... because otherwise all the line endings are '\r\n', and the text editor
displays them as two returns, so the text written to file comes out double
spaced instead of single spaced:
>gi|272209|gb|M61959.1| EST00007 Fetal brain, Stratagene (cat#936206) ...
CTTCCCTTTTGTTCCCCTCAGTGTCCCTTTTAATTGCTTCCCTCCATTTTCCTTAGCAGC
ATCCTAGTTGATGGTCTGGGTTATCAGAGGAGCAAAAACATTTAAGTGTCAAATAATGCT
CATTGTCTCCCTGGGATTTCTAAACAGAAAAAATGAAGAAAGAGGCAGAGAAGAGCTTCA
Should the behavior be to allow both a single '\n' and OS-specific line
endings, or just '\n'?
I realise that the Record __str__() method uses os.linesep, but when working
with FASTA files in a true text editor in Windows ... only the '\n' is
needed. Also, I generally work in a mixed environment, where '\r\n' should
be avoided.
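
A minimal sketch of the likely mechanism, written for current Python rather
than the version in use here. It assumes the double spacing comes from text
that already contains '\r\n' being written through a text-mode file, which
on Windows translates each '\n' to '\r\n' again. The record text and the
helper write_like_windows_text_mode are hypothetical, and newline='\r\n' is
used only to reproduce the Windows translation on any platform:

```python
import os
import tempfile

# Hypothetical record text that already uses Windows line endings, as
# str(record) would if its lines were joined with os.linesep on Windows.
record_text = ">seq1 example\r\nACGT\r\n"


def write_like_windows_text_mode(text):
    # Write text through a text-mode file and return the raw bytes on disk.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # newline="\r\n" translates every "\n" written to "\r\n", which is
        # what text mode does by default on Windows.
        with open(path, "w", newline="\r\n") as handle:
            handle.write(text)
        # Binary mode shows the bytes exactly as stored.
        with open(path, "rb") as handle:
            return handle.read()
    finally:
        os.remove(path)


# Each "\r\n" in the input ends up as "\r\r\n" on disk.
raw = write_like_windows_text_mode(record_text)

# Stripping "\r" first, as in the workaround above, leaves a single "\r\n".
stripped = write_like_windows_text_mode(record_text.replace("\r", ""))
```

An editor that treats a lone '\r' as a line break would show the first
result double spaced, which matches the symptom described.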
I am unsure why os.linesep is used here. My vote is to just have a plain
'\n' applied at the end of each line.
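
A rough sketch of what the plain '\n' approach could look like. This is not
the Record class's actual code: normalize_newlines and format_fasta are
hypothetical helpers, and the 60-column wrap is an assumption taken from the
example output above.

```python
def normalize_newlines(text):
    # Convert Windows ("\r\n") and lone "\r" line endings to "\n".
    return text.replace("\r\n", "\n").replace("\r", "\n")


def format_fasta(title, sequence, width=60):
    # Hypothetical formatter: always joins lines with "\n", never os.linesep.
    clean_title = normalize_newlines(title).strip()
    # split() with no arguments drops all whitespace, including stray "\r".
    residues = "".join(sequence.split())
    lines = [">" + clean_title]
    lines.extend(residues[i:i + width] for i in range(0, len(residues), width))
    return "\n".join(lines) + "\n"


# Input that still carries Windows line endings from the parsed file.
text = format_fasta("seq1 example\r\n", "ACGT\r\nTTGA\r\n", width=4)
```

With the string built this way, the platform's text-mode handling decides
the on-disk line ending once, so the output is not doubled.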
Peter