[BioPython] parsing fasta file to list of sequences

Wed Apr 13 14:22:26 EDT 2005

Hi Faheem,

On Wed, 2005-04-13 at 14:04 -0400, Faheem Mitha wrote:
> Hi,
> 
> I need to parse a FASTA file into a list of sequences in the following
> form:
> 
> [['acg'], ['tac'],...]
> 
> where each entry is a different sequence. Is this easily possible with the
> current parsing tools? If so, would someone be kind enough to sketch an
> approach? Thanks in advance.
> 

[fkauff at osiris align]$ cat fasta
>one
AAAAA
>two
CCCCCC
>three
GGGGGGGG

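>>> from Bio import SeqUtils
>>> # quick_FASTA_reader returns a list of (name, sequence) tuples
>>> fasta = SeqUtils.quick_FASTA_reader('fasta')
>>> # zip(*fasta) splits the pairs into a tuple of names and a tuple of sequences
>>> names, seqs = zip(*fasta)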
>>> names
('one', 'two', 'three')
>>> seqs
('AAAAA', 'CCCCCC', 'GGGGGGGG')

Or, to get exactly the list-of-lists format you asked for:

>>> seqs2=[[s[1]] for s in fasta]
>>> seqs2
[['AAAAA'], ['CCCCCC'], ['GGGGGGGG']]
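
The two idioms above (unzipping with zip(*...) and wrapping each sequence in
its own list) can also be tried without Biopython installed. The following is
only a minimal sketch under stated assumptions: read_fasta is a hypothetical
pure-Python helper written for illustration, not Biopython's implementation,
and it assumes simple records where each '>' header line is followed by one or
more sequence lines.

```python
def read_fasta(lines):
    """Return a list of (name, sequence) tuples from FASTA-formatted lines.

    Hypothetical stand-in for illustration; not part of Biopython.
    """
    records = []
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif name is not None:
            # Sequence data may span several lines; collect the pieces.
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


# The same three records as the example file, given inline.
fasta_text = """>one
AAAAA
>two
CCCCCC
>three
GGGGGGGG
"""

fasta = read_fasta(fasta_text.splitlines())
names, seqs = zip(*fasta)            # unzip the pairs into two tuples
seqs2 = [[seq] for _, seq in fasta]  # one single-element list per sequence
```

Note that the zip(*fasta) unpacking raises a ValueError when the file contains
no records, so an empty input needs a separate check.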

Frank

> Faheem.
-- 
Frank Kauff
Dept. of Biology
Duke University
Box 90338
Durham, NC 27708
USA

Phone 919-660-7382
Fax 919-660-7293
Web http://www.lutzonilab.net/member/frankkauff.shtml