[Bioperl-l] fasta format

Hilmar Lapp hlapp@gnf.org
Mon, 26 Aug 2002 10:52:20 -0700


And besides, what does Biojava do? I guess the answer is the event parser.

Which is actually what I propose to do, too. We'd have a

	package Bio::Seq::DescriptionParserI;

	# takes 2 arguments:
	#        - a Bio::PrimarySeqI implementing object
	#        - a string
	#
	# Parses the string and sets attributes of the sequence
	# object according to the content found.
	# Returns the sequence object.
	#
	sub parse_header {
		shift->throw_not_implemented();
	}

And then people can supply whatever implementation they want. We may need this soon enough anyway, to parse the rich annotation that some folks like to abuse the fasta header for.
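
To make that a bit more concrete, here is a rough sketch of what one such implementation might look like. It is only an illustration under assumptions: the class names My::LenientDescriptionParser and My::MockSeq are made up, and the mock merely mimics the display_id() and desc() accessors of a Bio::PrimarySeqI object so that the snippet runs without bioperl installed. It takes the first word after '>' as the ID, ignoring any whitespace in between, i.e. the lenient behaviour discussed in the quoted message below.

```perl
use strict;
use warnings;

# Stand-in for a Bio::PrimarySeqI implementing object (assumption: only
# the display_id() and desc() accessors are mimicked here).
package My::MockSeq;
sub new        { return bless { display_id => undef, desc => undef }, shift }
sub display_id { my $self = shift; $self->{display_id} = shift if @_; return $self->{display_id} }
sub desc       { my $self = shift; $self->{desc}       = shift if @_; return $self->{desc} }

# Hypothetical implementation of the proposed parse_header() method.
package My::LenientDescriptionParser;
sub new { return bless {}, shift }

sub parse_header {
	my ($self, $seq, $header) = @_;
	(my $line = $header) =~ s/^>//;   # drop the '>' marker if present
	$line =~ s/\s+$//;                # drop trailing whitespace
	# split ' ' skips leading whitespace: first word is the ID,
	# whatever remains is the description.
	my ($id, $desc) = split ' ', $line, 2;
	$seq->display_id($id) if defined $id;
	$seq->desc($desc)     if defined $desc;
	return $seq;
}

package main;

my $parser = My::LenientDescriptionParser->new;
my $seq    = $parser->parse_header(My::MockSeq->new, '> AB000123 Homo sapiens mRNA');
printf "id=%s desc=%s\n", $seq->display_id, $seq->desc;
```

A stricter implementation (null ID whenever whitespace follows '>') would be just another class with the same method, so the choice of policy would be left to whoever plugs the parser in.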

	-hilmar

> -----Original Message-----
> From: Hilmar Lapp 
> Sent: Monday, August 26, 2002 10:33 AM
> To: Wiepert, Mathieu; Matthew Pocock; Paul Gordon
> Cc: bioperl-l@bioperl.org
> Subject: RE: [Bioperl-l] fasta format
> 
> 
> 
> 
> > -----Original Message-----
> > From: Wiepert, Mathieu [mailto:Wiepert.Mathieu@mayo.edu]
> [...]
> > 
> > >\s+(.*)
> > is valid, as described by Bill Pearson.  Should have null ID, 
> > then description.
> > 
> 
> I'm concerned about making this change. It radically changes 
> the behaviour of the parser, even if this interpretation is 
> the correct one (Bill, could you clarify?)
> 
> The reason I'm concerned is that I have seen many people 
> putting a space between '>' and the ID when they 
> copy-and-paste sequences, believe it or not, and be it 
> correct or not. My point is that every web-server written 
> using bioperl will break after this change when users enter a 
> space between ID and '>', whereas it handled the situation 
> fine before.
> 
> Looking elsewhere, the EMBOSS seqret (and hence entire EMBOSS 
> I guess) does ignore whitespace between '>' and ID and takes 
> the first word as the ID.
> 
> So that makes my second concern: I don't want bioperl to behave 
> much differently from other sequence analysis toolkits.
> 
> 	-hilmar
> 
> 