[Bioperl-l] Bio::AlignIO ignores questionmarks?
David Messina
dmessina at wustl.edu
Fri Apr 14 05:14:25 UTC 2006
Hi Kai,
I'm by no means an expert with this module, but I'll take a shot.
Running your code through a debugger, I'm seeing that
Bio::AlignIO::fasta is gobbling the question marks:
line 66: $MATCHPATTERN = '^A-Za-z\.\-';
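and then, where $entry contains a line of sequence from the input file:
line 118: $entry =~ s/[$MATCHPATTERN]//g;
Because of the leading ^, the character class is negated once it sits
inside the brackets, so everything other than letters, '.', and '-' is
deleted, question marks included.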
As far as I can tell, a question mark is not a valid character for
the FASTA format (see http://en.wikipedia.org/wiki/FASTA_format) --
perhaps that's the reason Bio::AlignIO::fasta doesn't permit them?
And then by the time missing_char() is applied, the question marks
are already gone.
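For what it's worth, here is a minimal sketch that reproduces the effect
outside of BioPerl. The pattern is copied from the line quoted above; the
sample sequence string is made up purely for illustration.

```perl
use strict;
use warnings;

# Pattern as quoted from Bio::AlignIO::fasta above. The leading ^ negates
# the character class once it is interpolated between [ and ].
my $MATCHPATTERN = '^A-Za-z\.\-';

# Hypothetical line of aligned sequence with four missing-data marks.
my $entry = 'ACGT????ACGT--';

# Same substitution as the quoted line, applied to a copy so that both
# versions can be printed side by side.
(my $cleaned = $entry) =~ s/[$MATCHPATTERN]//g;

print "before: $entry\n";
print "after:  $cleaned\n";
```

The cleaned string comes out four characters shorter, which would fit the
shifted nucleotides and the trailing gap padding you describe.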
What happens if you read in your sequence with question marks in a
format that explicitly permits question marks?
Dave
On Apr 13, 2006, at 7:38 PM, Kai Müller wrote:
> hi,
>
> I'm very new to BioPerl and have a maybe silly question.
> when using Bio::AlignIO to load a set of sequences, the question marks
> are simply lost (they refer to missing characters as opposed to gap
> characters [-] or ambiguity [N]). I thought that 'missing_char()' might
> help, but it didn't (I probably used it the wrong way).
>
> when $filename contains sequences with ????, the following snippet
> produces an alignment with the ???? lost, the downstream nucleotides
> shifted, and the resulting length differences filled by '---' at the
> 3' end:
>
>
> my $aln_in = Bio::AlignIO->new(-file => "$filename", '-format' => 'fasta');
> my $aln = $aln_in->next_aln();
> $aln->gap_char('-');
> $aln->missing_char('?');
>
> my $testout = Bio::AlignIO->new(-fh => \*STDOUT, '-format' => 'clustalw');
> $testout->write_aln($aln);
>
>
>
> Can somebody give me a hint here?
>
> thanks and all the best,
>
> Kai Müller