[Bioperl-l] Getting entire descriptions from FASTA files

Marc Logghe Marc.Logghe at devgen.com
Wed Jan 12 07:53:21 EST 2005


Hi David,

> I want to get the entire description line in FASTA format (I 
> mean: description, accession number, etc.). I've tried with 
> display_id this way:
> 
> my $seq_inIO     = Bio::SeqIO->new(-file => "$proteasa",
>                          -format => 'Fasta');
> 
> my $seq_in        = $seq_inIO->next_seq();
> 
> my $id_peptid = $seq_in->display_id;
>  
> but I only obtain the gi and gb numbers, not the description line.
> 
> Then, I tried with $seq_in->desc instead of 
> $seq_in->display_id , but then I only obtain the description 
> (or part of it). 
> 
> Is there a way to get the entire description line the same 
> way you see it at the FASTA file?

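The FASTA parser splits the header line at the first whitespace: everything before it becomes the display_id and the rest becomes the description. So you can reconstruct the line by concatenating the id and description: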
my $fasta_line = join ' ', $seq_in->display_id, $seq_in->desc;

Of course, I don't know what the purpose of your script is, but if it is only to fetch the > lines, why not just use plain old grep? Something like:
grep '^>' /your/fastafile | sed "s/^>//"
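
As a rough sketch of the same idea, sed on its own can do both the filtering and the stripping. The file name example.fasta and its two records are made up purely for illustration; substitute your own file:

```shell
# Create a small hypothetical FASTA file (made-up records, for illustration only).
cat > example.fasta <<'EOF'
>gi|0000001|gb|AAA00001.1| hypothetical protease [Example organism]
MKTAYIAKQR
QISFVKSHFS
>gi|0000002|gb|AAA00002.1| hypothetical peptidase [Example organism]
MSEQNNTEMT
EOF

# -n suppresses sed's default output; the trailing p prints only those
# lines where the substitution (dropping the leading ">") succeeded.
headers=$(sed -n 's/^>//p' example.fasta)
printf '%s\n' "$headers"
```

This should print one full header per record, id and description together, without going through a sequence parser at all.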

HTH,
Marc


