[Bioperl-l] Assembly::IO write_assembly and remove_seq
Janet Young
jayoung at fhcrc.org
Mon Aug 20 22:09:04 UTC 2007
Hi all,
I realized last week that write_assembly isn't implemented in
Assemble::IO
(see http://bioperl.org/pipermail/bioperl-l/2006-May/021619.html )
I know this has been asked before, but I wondered if anything has
changed - does anyone have any plans to write a write_assembly
method? Alternatively, any suggestions for an alternative solution to
what I'm trying to do?
I'm trying to write a script to make improvements to the assembly
that phredPhrap comes out with - it seems to quite frequently throw
an unrelated sequence into a contig with either no matching sequence
at all, or very little matching sequence. Mysterious. Anyway, my
script can recognize the bad sequences easily enough, and thought I'd
be able to remove them and then write the modified assembly. No joy.
One very inelegant solution I've played with is that I can add some
"markedHighQuality" tags to the discrepant sequences in the ace file,
meaning that next time phredPhrap is run, it sometimes manages not to
assemble the sequences that shouldn't be there. I'm not sure this
will work in all cases, and it seems like quite an unsatisfactory way
to do it.
For the same reason, I'm hoping someone can tell me what remove_seq
does to a contig object? I'm using it and I don't get any error
messages (returns 1), but when I check the contig object afterwards
with get_seq_ids, the sequence I wanted to remove didn't seem to go
away. Also, when I check out the primary_tags for that contig in the
objects returned by get_features_collection, nothing seems to have
changed. So I'm not sure whether the sequence really was removed from
anything at all, and if it was, which object did it get removed
from? (a snippet of my code is below)
my @seqids = $contig->get_seq_ids();
print OUT "seqids @seqids\n";
my $seqobj = $contig->get_seq_by_name($seq);
$contig->remove_seq($seqobj) || die "failed to remove seq\n";
@seqids = $contig->get_seq_ids();
print OUT "seqids @seqids\n";
thanks for any advice,
Janet Young
-------------------------------------------------------------------
Dr. Janet Young (Trask lab)
Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., C3-168,
P.O. Box 19024, Seattle, WA 98109-1024, USA.
tel: (206) 667 1471 fax: (206) 667 6524
email: jayoung at fhcrc.org
http://www.fhcrc.org/labs/trask/
-------------------------------------------------------------------
More information about the Bioperl-l
mailing list