[BioPython] parsing fasta file to list of sequences
Frank Kauff
fkauff at duke.edu
Wed Apr 13 14:22:26 EDT 2005
Hi Faheem,
On Wed, 2005-04-13 at 14:04 -0400, Faheem Mitha wrote:
> Hi,
>
> I need to parse a fasta file into a list of sequences in the following
> fashion.
>
> [['acg'], ['tac'],...]
>
> where each entry is a different sequence. Is this easily possible with the
> current parsing tools? If so, would someone be kind enough to sketch an
> approach? Thanks in advance.
>
[fkauff at osiris align]$ cat fasta
>one
AAAAA
>two
CCCCCC
>three
GGGGGGGG
>>> from Bio import SeqUtils
>>> fasta=SeqUtils.quick_FASTA_reader('fasta')
>>> names,seqs=zip(*fasta)
>>> names
('one', 'two', 'three')
>>> seqs
('AAAAA', 'CCCCCC', 'GGGGGGGG')
Or, to get exactly the nested list you asked for:
>>> seqs2=[[s[1]] for s in fasta]
>>> seqs2
[['AAAAA'], ['CCCCCC'], ['GGGGGGGG']]
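If SeqUtils.quick_FASTA_reader is not available in your Biopython release, the same (name, sequence) pairs can be built by hand. The following is only a minimal sketch using the standard library: read_fasta_pairs is a hypothetical helper name, not part of Biopython, and it assumes a well-formed file in which every sequence follows a ">" header line. The example data is the same three-record file shown above, held in memory instead of on disk.

```python
import io


def read_fasta_pairs(handle):
    """Return a list of (name, sequence) tuples read from a FASTA handle.

    Hypothetical helper, not a Biopython function. The name is the whole
    header line without the leading '>'.
    """
    records = []
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif name is not None:
            # Sequence data may span several lines; collect and join later.
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


# Same content as the 'fasta' file above, kept in memory for the example.
example = io.StringIO(">one\nAAAAA\n>two\nCCCCCC\n>three\nGGGGGGGG\n")
fasta = read_fasta_pairs(example)

# zip(*pairs) transposes the list of pairs into a names tuple and a seqs tuple.
names, seqs = zip(*fasta)

# One single-element list per sequence, as in the original question.
seqs2 = [[seq] for _, seq in fasta]
```

To read from disk instead, pass an open file, e.g. read_fasta_pairs(open('fasta')). With Biopython installed, Bio.SeqIO.parse(handle, "fasta") is another way to iterate over the records.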
Frank
> Faheem.
--
Frank Kauff
Dept. of Biology
Duke University
Box 90338
Durham, NC 27708
USA
Phone 919-660-7382
Fax 919-660-7293
Web http://www.lutzonilab.net/member/frankkauff.shtml