[Bioperl-l] ace.pm
Wes Barris
wes.barris at csiro.au
Mon Feb 16 22:35:49 EST 2004
Jason Stajich wrote:
>
> People write code and modules to support the work they are doing,
> sometimes for a specific data set - so I suspect Robson wrote this to
> support phrap ace format which has a convention of them being ContigXX.
>
> You are welcome to make changes to code on your local system to get it
> working and then post the diffs so they can be incorporated back in. Why
> not try changing the code as you suggest and see if it works? It
> is a collaborative project and these modules are newish, so try
> fixing things and then get feedback on your fixes.
I have modified one line in Bio/Assembly/IO/ace.pm as shown below:
# Loading contig sequence (COntig sequence field)
# (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && do { # New contig found!
(/^CO (\w+) (\d+) (\d+) (\d+) (\w+)/) && do { # New contig found!
With this change, the contigID becomes whatever the second field of the
"CO" line is. For the line "CO CL15Contig1 794 4 0 U", for example, it
would be set to "CL15Contig1".
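
To make the effect of the change easier to check outside of bioperl, here is a
minimal standalone sketch that applies both patterns to sample "CO" lines. The
helper names parse_co_line and matches_original, the hash keys, and the
"CO Contig12 500 3 1 C" sample line are made up for illustration; none of them
come from Bio/Assembly/IO/ace.pm. The field names are my assumption about the
usual ACE "CO" layout.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration; not part of Bio::Assembly::IO::ace.
# Hash keys reflect an assumed reading of the ACE "CO" field layout.
sub parse_co_line {
    my ($line) = @_;
    # Relaxed pattern from the modified line: any \w+ token is the contig ID.
    if ($line =~ /^CO (\w+) (\d+) (\d+) (\d+) (\w+)/) {
        return {
            id          => $1,
            num_bases   => $2,
            num_reads   => $3,
            num_segs    => $4,
            orientation => $5,
        };
    }
    return;    # not a CO line
}

# Original pattern, kept for comparison: only matches names like "Contig12".
sub matches_original {
    my ($line) = @_;
    return $line =~ /^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/ ? 1 : 0;
}

# First line is the tgicl example from this thread; second is a made-up
# phrap-style line.
for my $line ('CO CL15Contig1 794 4 0 U', 'CO Contig12 500 3 1 C') {
    my $co = parse_co_line($line);
    printf "%-26s id=%s original_matches=%d\n",
        $line, $co->{id}, matches_original($line);
}
```

One side effect worth noting: for phrap-style names the relaxed pattern
captures the whole word, so the ID would be "Contig12" rather than "12". Any
code that relies on a purely numeric contigID may need a second look.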
>
> -jason
>
> On Tue, 17 Feb 2004, Wes Barris wrote:
>
> > Hi,
> >
> > ACE files generated by an application called tgicl have "CO"
> > lines of the form:
> >
> > CO CL15Contig2 794 4 0 U
> >
> > This line is not parsed properly by the ace.pm bioperl module.
> > Notice this line from Bio/Assembly/IO/ace.pm .
> >
> > (/^CO Contig(\d+) (\d+) (\d+) (\d+) (\w+)/) && do { # New contig found!
> >
> > Bioperl expects the second "word" in the line to be "Contig\d+" where
> > the number is used as the "contigID". Is there a reason why
> > "contigID" must be a number? Why can't it be the whole second
> > "word" of the "CO" line?
> >
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu
>
--
Wes Barris
E-Mail: Wes.Barris at csiro.au