[Bioperl-l] Retrieve FASTA seqs with NCBI definition line

Heikki Lehvaslaiho heikki at ebi.ac.uk
Mon Apr 7 14:25:40 EDT 2003


Mikaela,

I do not think there is a shortcut, and I do not think modifying the way
the fasta header is built from the sequence information is a good idea:
too many things depend on it. You have to manually modify the return
values of the methods display_id() and desc().

Assuming you go ahead and do it, could you put the code into a function
(seq2NCBIfasta?) which could be added into Bio::SeqUtils?

Then anyone needing to do the same thing could:

my $out = Bio::SeqIO->new(-format => 'fasta');
$out->write_seq(Bio::SeqUtils->seq2NCBIfasta($seq));
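
For illustration, here is a minimal sketch of what such a function might
look like, written as a plain subroutine. The name seq2NCBIfasta is only
the suggestion above and is hypothetical. The sketch assumes that
primary_id() returns the gi number, that accession_number() returns the
accession without its version, and that the version, if any, is available
through seq_version() (as on Bio::Seq::RichSeq objects). The default 'ref'
database tag is also an assumption taken from the example header in the
question.

```perl
use strict;
use warnings;

# Hypothetical helper (not an existing BioPerl method): rewrite the
# display_id of a sequence object so that a fasta writer emits an
# NCBI-style header such as "gi|4504379|ref|NP_003658.1|".
# Works on any object that provides display_id(), primary_id() and
# accession_number(); seq_version() is used only if it is available.
sub seq2NCBIfasta {
    my ($seq, $db) = @_;
    $db = 'ref' unless defined $db;    # assumption: RefSeq database tag

    my $gi  = $seq->primary_id;        # assumption: holds the gi number
    my $acc = $seq->accession_number;

    # Append ".version" when the object can report one.
    my $ver = $seq->can('seq_version') ? $seq->seq_version : undef;
    $acc .= ".$ver" if defined $ver && length $ver;

    # The object is modified in place and also returned, so the call
    # can be nested inside write_seq().
    $seq->display_id("gi|$gi|$db|$acc|");
    return $seq;
}
```

desc() is left untouched here, since the fasta writer already prints it
after the identifier.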

Cheers,
	-Heikki


On Mon, 2003-04-07 at 12:06, Mikaela Ilinca Gabrielli wrote:
> Dear all, 
> 
> I'd like to retrieve sequences from GenPept that are in fasta format AND
> include the NCBI definition line. I thought this was easy but as I apply
> Bio::DB::GenPept I get only a part of the NCBI definition line - missing gi
> and accession number information.
> 
> ex def-line from NCBI:
> >gi|4504379|ref|NP_003658.1| G protein-coupled receptor 49; orphan G
> protein-coupled receptor HG38; G protein-coupled receptor 67 [Homo sapiens] 
> 
> ex defline retrieved through Bio:
> 
> >GPR49 G protein-coupled receptor 49; orphan G protein-coupled receptor
> HG38; G protein-coupled receptor 67 [Homo sapiens]
> 
> Is there any easy way to get around this or do I have to use
> '$seq->primary_id' and '$seq->accession_number' to "cut&paste" my own fasta
> records that look like those in NCBI ?
> 
> Best regards,
> 
> Mikaela
-- 
______ _/      _/_____________________________________________________
      _/      _/                      http://www.ebi.ac.uk/mutations/
     _/  _/  _/  Heikki Lehvaslaiho          heikki at ebi.ac.uk
    _/_/_/_/_/  EMBL Outstation, European Bioinformatics Institute
   _/  _/  _/  Wellcome Trust Genome Campus, Hinxton
  _/  _/  _/  Cambs. CB10 1SD, United Kingdom
     _/      Phone: +44 (0)1223 494 644   FAX: +44 (0)1223 494 468
___ _/_/_/_/_/________________________________________________________