[Bioperl-l] ace to msf format?
Jason Stajich
jason at cgt.duhs.duke.edu
Tue Sep 2 11:30:44 EDT 2003
Perhaps it makes sense to instead derive a flushed alignment from a Contig
- i.e. a get_aln() method - which would make a new SimpleAlign object,
padding the individual sequences with the necessary leading and trailing
gap characters?
Wes - if this is something you need, perhaps you could look into trying to
write a method of this sort?
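
The padding step suggested above could be sketched roughly as follows. This is only an illustration in plain Perl, not existing BioPerl code: get_aln() is a proposed method name, and pad_to_flush(), the '-' gap character, and the read names, start columns and sequences are all made up for the example. It also assumes each read is already gapped relative to the contig and that its 1-based start column is known.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): pad one read so that it
# spans every column of the contig.
#   $seq        - the read, already gapped relative to the contig
#   $start      - 1-based contig column where the read begins
#   $contig_len - total number of columns in the contig
#   $gap        - gap character to pad with (assumed '-' by default)
sub pad_to_flush {
    my ($seq, $start, $contig_len, $gap) = @_;
    $gap = '-' unless defined $gap;
    my $lead  = $start - 1;
    my $trail = $contig_len - $lead - length($seq);
    die "read does not fit inside the contig\n" if $lead < 0 || $trail < 0;
    return ($gap x $lead) . $seq . ($gap x $trail);
}

# Made-up reads: [ name, start column, sequence ]
my @reads = (
    [ read1 => 1, 'ACGTAC' ],
    [ read2 => 4, 'TACGGT' ],
    [ read3 => 7, 'GGTA'   ],
);
my $contig_len = 10;

# After padding, every row has the same length, i.e. the rows are flush.
for my $r (@reads) {
    my ($name, $start, $seq) = @$r;
    printf "%-6s %s\n", $name, pad_to_flush($seq, $start, $contig_len);
}
```

In a real get_aln(), each padded string would presumably be wrapped in a Bio::LocatableSeq and added to a new Bio::SimpleAlign with add_seq(). Padding only makes the rows equal in length; it does not address read regions that are not aligned to the consensus, which is the concern raised in the quoted message below.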
-jason
On Tue, 2 Sep 2003, Robson Francisco de Souza wrote:
>
> Hi Wes and Jason,
>
> There are indeed some caveats when trying to use
> Bio::Assembly::Contig objects as Bio::Align::AlignI objects. Not all
> methods defined in this interface are implemented and some are not
> working (checked it yesterday using Wes's code). Most routines that are
> not working can be corrected without much work and some not yet
> implemented are easy to write but I'm not sure we'll ever get full
> compliance with the AlignI interface.
> I'd like to discuss that further but for now let me just clarify
> why I believe there will be no way to print contigs using msf.pm: contigs
> are not flush, i.e. most contigs will be alignments of sequences of
> different lengths and, even worse, sequences in a contig may be only
> locally aligned to each other, which implies that some regions of any
> sequence in the alignment might not be aligned to the contig consensus but
> will get printed to MSF anyway. As far as I understand the AlignI interface,
> such an alignment (a set of local alignments) is not supported.
> I've been considering removing AlignI from @ISA in
> Bio::Assembly::Contig and defining a ContigI interface for it as it seems
> to me that AlignI interface is not generic enough to describe contigs.
> The main problem is that any sequence in a contig is only partially
> aligned to a subsequence of the consensus, which makes some of the methods
> from AlignI nonsensical (e.g. Bio::Align::AlignI::length, which is used by
> msf.pm). I'd like to hear comments from others on this.
> So, do not try to use MSF, CLUSTALW or other multiple global
> alignment formats for printing assemblies; you won't get what you want.
>
> Robson
>
> On Mon, 1 Sep 2003, Wes Barris wrote:
> > Thanks Jason, that makes sense. Perhaps I'm missing something obvious
> > but I am getting an error when treating each contig as a Bio::SimpleAlign
> > object. Here is my code:
> >
> > #!/usr/local/bin/perl -w
> > #
> > use strict;
> > use Bio::Assembly::IO;
> > use Bio::AlignIO;
> > #
> > my $usage = "Usage: $0 <infile.ace>\n";
> > my $infile = shift or die $usage;
> >
> > my $io = new Bio::Assembly::IO(-file=>$infile, -format=>'ace');
> > my $assembly = $io->next_assembly;
> >
> > foreach my $contig ($assembly->all_contigs()) {
> >     my $name = "cn".$contig->id;
> >     print("$name\n");
> >     my $outstream = new Bio::AlignIO(-format=>'msf', -file=>">$name");
> >     $outstream->write_aln($contig);
> >     undef $outstream;
> > }
> >
> > And here is the runtime error:
> >
> > cn1
> > Use of uninitialized value in hash element at
> > /usr/lib/perl5/site_perl/5.6.1/Bio/Assembly/Contig.pm line 1305, <GEN0> line 33990.
> > Use of uninitialized value in hash element at
> > /usr/lib/perl5/site_perl/5.6.1/Bio/Assembly/Contig.pm line 1305, <GEN0> line 33990.
> > Can't call method "alphabet" on an undefined value at
> > /usr/lib/perl5/site_perl/5.6.1/Bio/AlignIO/msf.pm line 180, <GEN0> line 33990.
> >
> > I am using bioperl-1.2.2.
> >
> >
> > >
> > > Your code below is calling it in scalar context which will just have $aln
> > > being set to the length of the returned array.
> > >
> > > -jason
> > >
> > > On Mon, 1 Sep 2003, Wes Barris wrote:
> > >
> > >
> > >>Brian Osborne wrote:
> > >>
> > >>
> > >>>Wes,
> > >>>
> > >>>I don't think this is possible in Bioperl. To put it more generally, AlignIO
> > >>>can't accommodate Assembly objects currently. AlignIO is the module that
> > >>>takes in a variety of alignment formats and interconverts them, analogous to
> > >>>SeqIO. I'll be corrected if I'm wrong.
> > >>>
> > >>>Brian O.
> > >>
> > >>I am kind of new to this so I could be wrong but isn't an Assembly a group
> > >>of alignments? So, from one assembly, a group of alignments could be
> > >>generated?
> > >>
> > >>
> > >>>-----Original Message-----
> > >>>From: bioperl-l-bounces at portal.open-bio.org
> > >>>[mailto:bioperl-l-bounces at portal.open-bio.org]On Behalf Of Wes Barris
> > >>>Sent: Thursday, August 28, 2003 7:58 PM
> > >>>To: Bioperl Mailing List
> > >>>Subject: [Bioperl-l] ace to msf format?
> > >>>
> > >>>Can anyone give me a hint as to how I could use bioperl to read in
> > >>>an ACE assembly and write out an MSF formatted alignment? This shows
> > >>>what I have figured out so far:
> > >>>
> > >>>#!/usr/local/bin/perl -w
> > >>>#
> > >>>use strict;
> > >>>use Bio::Assembly::IO;
> > >>>#
> > >>>my $usage = "Usage: $0 <infile.ace>\n";
> > >>>my $infile = shift or die $usage;
> > >>>
> > >>>my $io = new Bio::Assembly::IO(-file=>$infile, -format=>'ace');
> > >>>my $assembly = $io->next_assembly;
> > >>>
> > >>>my $aln = $assembly->all_contigs();
> > >>>
> > >>>--
> > >>>Wes Barris
> > >>>E-Mail: Wes.Barris at csiro.au
> > >>>
> > >>>
> > >>>_______________________________________________
> > >>>Bioperl-l mailing list
> > >>>Bioperl-l at portal.open-bio.org
> > >>>http://portal.open-bio.org/mailman/listinfo/bioperl-l
> > >>>
> > >>
> > >>
> > >>
> > >
> > > --
> > > Jason Stajich
> > > Duke University
> > > jason at cgt.mc.duke.edu
> >
> >
> > --
> > Wes Barris
> > E-Mail: Wes.Barris at csiro.au
> >
>
--
Jason Stajich
Duke University
jason at cgt.mc.duke.edu