[Bioperl-l] Write a fasta file with custom title line.

Chris Fields cjfields at uiuc.edu
Fri Sep 1 15:34:04 UTC 2006


Amir,

What you describe (ID + description) can be done directly in Bioperl with
SeqIO, without building a class.

The fasta writer's preferred_id_type() method lets you choose which ID goes
on the title line: primary ID, accession, accession.version, or display ID.
One way to customize the output is to pass 'display' to preferred_id_type()
and then set display_id() on each sequence to your customized string.  The
description is always appended after the ID.  We could add a parameter to
make appending the description optional if anyone is interested (it should
be fairly straightforward to add).

I posted this previously but here is the demo again:

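use strict;
use warnings;
use Bio::SeqIO;

# Read GenBank records from the file named on the command line
my $seqin = Bio::SeqIO->new(-file   => shift @ARGV,
                            -format => 'genbank');

# Write FASTA records to STDOUT
my $seqout = Bio::SeqIO->new(-fh     => \*STDOUT,
                             -format => 'fasta');

# preferred_id_type() is defined in Bio::SeqIO::fasta;
# 'display' puts the display_id on the title line
$seqout->preferred_id_type('display');

my $ct = 1;

while (my $seq = $seqin->next_seq) {
    # override the regular display_id with your own
    $seq->display_id('foo' . $ct);
    $seqout->write_seq($seq);
    $ct++;
}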
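
Until the description can be switched off in the writer, a fallback is to
print the record by hand, as Amir suggests below. This is only a rough
sketch: format_fasta() is a hypothetical helper, not part of Bioperl, and
the 60-column wrap is an assumed default.

```perl
use strict;
use warnings;

# Hypothetical helper (not a Bioperl method): build one FASTA record by hand.
# The description is optional; the sequence is wrapped at $width columns.
sub format_fasta {
    my ($id, $desc, $sequence, $width) = @_;
    $width ||= 60;    # assumed default line width

    my $title = '>' . $id;
    $title .= ' ' . $desc if defined $desc && length $desc;

    my $body = defined $sequence ? $sequence : '';
    # Append a newline after every run of up to $width characters
    $body =~ s/(.{1,$width})/$1\n/g;

    return $title . "\n" . $body;
}

# Title line without a description
print format_fasta('foo1', undef, 'ACGT' x 20);
```

Inside the loop above you would call it as
print format_fasta($seq->display_id, undef, $seq->seq);
or pass $seq->desc as the second argument to keep the description.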


Chris

> -----Original Message-----
> From: bioperl-l-bounces at lists.open-bio.org [mailto:bioperl-l-
> bounces at lists.open-bio.org] On Behalf Of Amir Karger
> Sent: Friday, September 01, 2006 9:55 AM
> To: bioperl-l
> Subject: Re: [Bioperl-l] Write a fasta file with custom title line.
> 
> > From: Siddhartha Basu [mailto:basu at pharm.sunysb.edu]
> >
> > Staffa, Nick (NIH/NIEHS) [C] wrote:
> > > I would like to construct title lines for the fasta
> > > sequences I want to write to a file.
> > > I don't see in the documentation on-line for SeqIO or
> > > write_seq how to specify this.
> > > Please point the way.
> >
> > Hi Nick,
> >
> > You could use Bio::Seq::BaseSeqProcessor to customize the title line.
> > Write your own title processing class which should inherit from
> > Bio::Seq::BaseSeqProcessor overriding its "process_seq"
> > method.
> 
> I think requiring someone to write a whole class just to get their
> favorite FASTA output is a bit much, don't you? They might as well just
> explicitly print out '>', the ID, a space, the description, a newline,
> and the sequence (while Bioperlers are adding a description setter in
> the FASTA writer, if there isn't one).
> 
> -Amir