[Bioperl-l] Parsing CAP3 output to Fasta

Chris Fields cjfields at uiuc.edu
Thu Dec 20 05:54:36 UTC 2007


On Dec 19, 2007, at 11:39 PM, Dave Messina wrote:

> Hi Ki,
>
> Hopefully someone who (unlike me) uses these modules regularly will chime
> in, but in the meantime, here are some ideas:
>
> The Bio::Assembly::IO module can read and write ace files, which CAP3 can
> produce as output. I don't think there is an explicit means to dump to a
> multi-fasta file like you want.
>
> But you could probably write a Bio::Assembly::IO::Fasta class which could
> write the multi-Fasta format you want. Then you could use Bio::Assembly::IO
> objects to read in ace files from CAP3 and write out to multi-fasta.
>
> Look at
>
> Bio::Assembly::IO::*
> Bio::Assembly::ScaffoldI
> Bio::Assembly::Contig
> Bio::LocatableSeq
> Bio::AlignIO
>
> Assemblies are made of scaffolds, scaffolds are made of contigs, and contigs
> are made of sequences which can be manipulated like any old seq in BioPerl.
> Bio::AlignIO can read and write multiple sequence alignments and
> multi-fastas, so that should help you to get from Bio::Assembly::IO to your
> desired output format.
>
>
>
> Hope this helps,
> Dave

What would help is to make Bio::Assembly::Contig implement Bio::AlignI
correctly, or make it a subclass of Bio::SimpleAlign.  That way one
could read Scaffolds in via Bio::Assembly::IO and write Contigs out
through Bio::AlignIO directly.  In theory that should work, but IIRC it
doesn't.
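
In the meantime, one workaround is to skip Bio::AlignIO and pull the
consensus out of each contig by hand. Below is a rough, untested sketch.
It assumes next_assembly, all_contigs and get_consensus_sequence behave
as documented in your BioPerl version. The sub names (ungap,
ace_to_fasta) and the two file arguments are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::Assembly::IO;
use Bio::Seq;
use Bio::SeqIO;

# Strip alignment padding from a sequence string. Both '-' and '*' are
# removed, on the assumption that either may show up as a pad character.
sub ungap {
    my ($str) = @_;
    $str =~ s/[-*]//g;
    return $str;
}

# Read an ace file (e.g. from CAP3) and write one FASTA record per
# contig consensus. File names are supplied by the caller.
sub ace_to_fasta {
    my ($ace_file, $fasta_file) = @_;

    my $in  = Bio::Assembly::IO->new(-file => $ace_file, -format => 'ace');
    my $out = Bio::SeqIO->new(-file => ">$fasta_file", -format => 'fasta');

    my $assembly = $in->next_assembly;
    for my $contig ($assembly->all_contigs) {
        # The consensus comes back as a gapped Bio::LocatableSeq, so
        # build a plain, ungapped Bio::Seq before writing it out.
        my $consensus = $contig->get_consensus_sequence;
        my $seq = Bio::Seq->new(
            -id  => $contig->id,
            -seq => ungap($consensus->seq),
        );
        $out->write_seq($seq);
    }
    return;
}

# Only run when given exactly two arguments: <in.ace> <out.fasta>
ace_to_fasta(@ARGV) if @ARGV == 2;
```

If you want the member reads rather than the consensus, each_seq on the
contig should hand back the reads as Bio::LocatableSeq objects, which
can be ungapped and written the same way.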

chris


