[Bioperl-l] new-person question
Clay Shirky
clay@shirky.com
Sun, 24 Sep 2000 20:41:43 -0400 (EDT)
> I have a loop that takes each line, and if the line starts with '>',
> should store that line in an array that will contain only sequence names.
> However, the '>' is causing problems. The coding is:
>
> while (<>)
> {
> $templine = $_;
> if ($templine =~ /\b>/)
> {
> other stuff;
> }
> }
>
> and it works perfectly if I use a character, such as '>', but not with
> '>'.
I'm not sure I understand this last sentence, since you reference >
twice, but > is not a special character in Perl regular expressions.
It _is_ a special character, "STDOUT redirect with file overwrite", in
the Unix shell, however, so if you are testing the program with input
on the command line, you may get problems there.
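Also, a better way to specify FASTA title lines is /^>/, which is to
say "Lines where > is the first character." The \b in your pattern
asserts a word boundary, and there is none between the start of a line
and a >, so /\b>/ cannot match a line that begins with >.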
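To make the difference between the two patterns concrete, here is a
small sketch. The sub names and the sample lines are made up for
illustration; they are not from your script.

```perl
use strict;
use warnings;

# Hypothetical helpers, one per pattern, returning 1 or 0.
sub matches_boundary { my ($line) = @_; return $line =~ /\b>/ ? 1 : 0; }
sub matches_anchor   { my ($line) = @_; return $line =~ /^>/  ? 1 : 0; }

# Made-up sample lines: a FASTA title line, a sequence line,
# and a line where > follows a word character.
my @lines = (">seq1 example", "ACGT", "a>b");

for my $line (@lines) {
    printf "%-15s \\b> = %d, ^> = %d\n",
        $line, matches_boundary($line), matches_anchor($line);
}
```

Only the title line should match /^>/, while /\b>/ should match only
the line where > comes right after a letter.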
If you had a script like
    while (<>) {
        if (/^>/) { # no need to assign temp variable
            print;
        }
    }
it should print the title lines from a FASTA format file and no others.
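Since you wanted the names stored in an array rather than printed,
something along these lines might do. This is only a sketch: the sub
name fasta_names and the sample data are hypothetical, and I am
assuming you want the leading > and the newline stripped.

```perl
use strict;
use warnings;

# Hypothetical helper: return the title lines read from a filehandle,
# with the leading > and the trailing newline removed.
sub fasta_names {
    my ($fh) = @_;
    my @names;
    while (my $line = <$fh>) {
        next unless $line =~ /^>/;      # keep only title lines
        chomp $line;
        push @names, substr($line, 1);  # drop the leading >
    }
    return @names;
}

# Made-up sample data, read through an in-memory filehandle.
my $fasta = ">seq1 first example\nACGT\n>seq2 second example\nGGCC\n";
open my $fh, '<', \$fasta or die "Cannot open in-memory file: $!";
my @names = fasta_names($fh);
close $fh;

print "$_\n" for @names;
```

To read a real file instead, open it by name and pass that filehandle
to the sub in the same way.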
Bioperl obviously provides many more ways to deal with FASTA data, but
I hope this answers your Perl question.
-clay