[Bioperl-l] fasta format
Lincoln Stein
lstein@cshl.org
Mon, 26 Aug 2002 11:02:22 -0400
I apologize for the previous message; I didn't see Bill's response before I
sent it.
As I understand it, FASTA does make a distinction between the ID and the
description (Bill, please confirm). The regular expression to match the two
is:
/^>(\S+)\s+(.*)$/
So, given that Bill has confirmed that empty IDs are valid, if there is a
space after the ">", then what comes afterward should be interpreted as the
description, not the ID.
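
A minimal sketch of that interpretation follows; it is not the bioperl implementation. The helper name parse_fasta_header is hypothetical, and the pattern is a loosened variant of the one above: \S* and \s* are used so that an empty ID and a missing description both still match.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of bioperl): split a FASTA header line
# into (id, description). A missing part comes back as an empty string.
sub parse_fasta_header {
    my ($line) = @_;
    chomp $line;
    # \S* lets the ID be empty when whitespace directly follows ">";
    # \s* lets the description be absent.
    my ($id, $desc) = $line =~ /^>(\S*)\s*(.*)$/
        or return;    # not a header line
    return ($id, $desc);
}

for my $header ('>seq1 first sequence',
                '> Hi I am the header description',
                '>seq2') {
    my ($id, $desc) = parse_fasta_header($header);
    printf "id='%s' desc='%s'\n", $id, $desc;
}
```

With this variant, the header "> Hi I am the header description" yields an empty ID and the whole remaining text as the description, whereas the pattern quoted below from Mat's message takes 'Hi' as the ID because its leading \s* consumes the space first.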
Lincoln
On Friday 23 August 2002 05:38 pm, Wiepert, Mathieu wrote:
> > I have seen many people use the perfectly acceptable
> >
> > > [blanks] description 1
> >
> > asdf
> >
> > > description2
> >
> > qwerty
>
> Thanks for the explanation, that sounds very reasonable, and I think is what
> should be implemented. If there is a space, I would not expect the first
> word of the description to become my id. For instance, given a header like
> this
>
> > Hi I am the header description
>
> asdf
>
> bioperl makes 'Hi' the id. This is because
>
> my ($id,$fulldesc) = $top =~ /^\s*(\S+)\s*(.*)/
>
> parses " Hi I am the header description" that way. I would not expect
> that behavior.
>
> If anyone can recall why this might be, let me know. I saw some threads on
> what to do with a blank sequence, nothing with a blank header, or a header
> missing an id. If people like it the way it is, I can put a comment in the
> code to that effect. However, I would hate not to touch it just because
> people can't remember why it is the way it is.
>
> I'll mess around and execute the test scripts, see if those break with any
> of the changes I was testing.
>
> -Mat
>
> > Bill Pearson
> >
> > _______________________________________________
> > Bioperl-l mailing list
> > Bioperl-l@bioperl.org
> > http://bioperl.org/mailman/listinfo/bioperl-l
--
========================================================================
Lincoln D. Stein Cold Spring Harbor Laboratory
lstein@cshl.org Cold Spring Harbor, NY
========================================================================