[BioPython] what to use for working with fasta sequences and alignments?
Jan Kosinski
kosa at genesilico.pl
Wed Jan 10 16:54:23 UTC 2007
Hi,
Thank you, things are becoming clearer to me. I have just found a nice
explanation here (especially the figures):
http://www.pasteur.fr/recherche/unites/sis/formation/python/ch11s03.html
I appreciate the effort you are putting into extending the capabilities of SeqIO.
And I will stay with Biopython ;-) CoreBio is definitely not as powerful.
Janek
Peter (BioPython List) wrote:
> Jan Kosinski wrote:
>> Hi,
>>
>> I am quite new to BioPython and I am a little confused when
>> trying to use it for working with fasta sequences and alignments.
>>
>> For instance, I can read and parse fasta files with Bio.Fasta, get
>> records back (as the Fasta.record class), iterate over them and so on.
>> But then I move on to the Bio.Fasta.FastaAlign module, which offers the
>> FastaAlignment class (a subclass of the Alignment class). However, this
>> class has very few methods, and get_all_seqs and get_seq_by_num return
>> SeqRecord objects instead of Fasta.record (why??), which makes it hard
>> to combine Bio.Fasta.FastaAlign (with SeqRecord) for alignments with
>> Bio.Fasta (with Fasta.record) for sequences. Maybe I am wrong, but
>> Biopython seems to be full of incompatibilities. Or is one supposed to
>> know which modules and classes should not be used?
>>
>> Could you recommend what I should use for my work with fasta
>> sequences and alignments? Which BioPython modules and classes?
>
> You can use Bio.Fasta to read in files either as Fasta.Record objects,
> or as SeqRecord objects. I would use SeqRecord objects - they are
> more general should you ever want to use a different input file format
> - plus as you have noticed, the alignment object also uses SeqRecord
> objects to hold each (gapped) sequence.
>
> There are other options if you search the code - but Bio.Fasta is the
> best documented and most used.
>
> If you are brave, then you might have a look at the new code in
> Bio.SeqIO which you can get from CVS. This is still in a state of
> flux however... but the Fasta parsing is much faster. See this page
> and the mailing list archives for more:
>
> http://www.biopython.org/wiki/SeqIO
>
>> Or should I use other packages like CoreBio?
>
> You could do - it has the advantage of having started recently from a
> clean slate, and having much less "old code".
>
>> Thank you in advance for any guidelines,
>> Janek Kosinski
>
> Peter
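
A minimal sketch of the SeqRecord-based approach recommended above, assuming a Biopython release in which Bio.SeqIO is available as a stable module. Bio.AlignIO is not mentioned in this thread; it is used here on the assumption that it is installed alongside Bio.SeqIO. The sequence names and data are made up for illustration.

```python
from io import StringIO

from Bio import AlignIO, SeqIO

# Hypothetical example data: two gapped sequences of equal length.
FASTA_TEXT = """>seq1 first example
ACGT-ACGT
>seq2 second example
ACGTTAC-T
"""

# Read the entries one by one; each entry comes back as a SeqRecord.
records = list(SeqIO.parse(StringIO(FASTA_TEXT), "fasta"))

# Read the same text as a single alignment (assumes Bio.AlignIO is present).
# The rows of the alignment are SeqRecord objects as well.
alignment = AlignIO.read(StringIO(FASTA_TEXT), "fasta")

for record in records:
    print(record.id, str(record.seq))

print("rows:", len(alignment))
print("columns:", alignment.get_alignment_length())
```

Because both calls hand back SeqRecord objects, code written for single sequences can also be applied to the rows of an alignment, which addresses the Fasta.record versus SeqRecord mismatch raised in the original question.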