[Bioperl-l] Parsing phred/phrap outputs
nkuipers
nkuipers@uvic.ca
Tue, 12 Nov 2002 10:31:41 -0800
Hi Alberto,
I'm a bioperl newbie too, so I hope I don't end up embarrassing myself with this
response. The .contigs file from PHRAP (at least on my system, when called as
"phrap -ace somefile") is already in fasta format. So it sounds like all you
need is a script that removes the XXX... substrings? If I've misunderstood
your question, stop reading now. :)
You could use Bio::SeqIO for this, although there are bioperl phred/phrap
modules (which I haven't used yet). The following untested script should do
it:
use strict;
use warnings;
use Bio::SeqIO;

my $in  = Bio::SeqIO->new( -file => "<./file.contigs",   -format => 'Fasta' );
my $out = Bio::SeqIO->new( -file => ">>trimmed.contigs", -format => 'Fasta' );

while ( my $seq = $in->next_seq() ) {
    # $seq->seq returns a plain string rather than an lvalue, so the
    # substitution can't be applied to it directly. Copy it to a temp var,
    # strip the X's there, then set $seq->seq to the result. Blech!
    ( my $trimmed = $seq->seq ) =~ s/X//g;
    $seq->seq($trimmed);
    $out->write_seq($seq);
}
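If BioPerl isn't handy, the same idea can be sketched in plain Perl. This is only a rough illustration, not a tested tool: strip_x is a made-up helper name, the records in @lines are invented sample data standing in for file.contigs, and it assumes the vector is masked with uppercase X only.

```perl
use strict;
use warnings;

# strip_x is a hypothetical helper (not part of BioPerl): it deletes the X
# characters from a sequence line and leaves ">" header lines untouched.
sub strip_x {
    my ($line) = @_;
    return $line if $line =~ /^>/;       # header line: keep as-is
    ( my $clean = $line ) =~ tr/X//d;    # copy, then delete every X
    return $clean;
}

# Made-up sample records, standing in for the contents of file.contigs.
my @lines = (
    ">Contig1\n",
    "XXXXACGTACGT\n",
    "XXXXXXXXXXXX\n",
    "GGCCXXXXTTAA\n",
);

for my $line (@lines) {
    my $clean = strip_x($line);
    next unless $clean =~ /\S/;          # skip lines that were all X
    print $clean;
}
```

To run it on a real file, replace @lines with a loop over a filehandle opened on the .contigs file. Unlike the Bio::SeqIO version, it works line by line and does not re-wrap the sequence, so line lengths in the output will be uneven.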
Cheers,
Nathanael Kuipers
---
Center for Biomedical Research,
Dept. of Biology,
University of Victoria
>===== Original Message From "Alberto M. R. Davila" <davila@gene.dbbm.fiocruz.br> =====
>Dear All,
>
>I am new to BioPerl so please be patient with me... :-)
>
>Running phredPhrap I got the seqs containing vectors marked with "XXXX" (in
>the "file_name.contigs" and "file-Name.fasta.screen" output files) .... I
>have been unable to find the complete seq of the pMOS cloning vector on the
>Internet (even at the Amersham site), so I wonder whether anyone could have
>a) any script to parse such "XXXX" and get the seqs in fasta format and/or
>b) the complete seq of the pMOS cloning vector.
>
>Thanks in advance for any help you may provide.
>
>Kind regards,
>
>Alberto