[Bioperl-l] Trouble retrieving multiple sequences from NCBI in a single list query

jluis.lavin at unavarra.es
Wed Nov 4 08:43:35 UTC 2009


Hello all,

I'm a newbie who is having a terrible time trying to retrieve a list of
multiple sequences from NCBI and write them to a single file in Fasta
format.
The code I've written seems to read my list and retrieve the sequences, but
it overwrites them, so that I only get the last sequence on the list.
I've been told to ask the people on this mailing list for help, since you
may have come across this problem too, or at least will know how to solve
it...

Here is my code, which basically consists of an STDIN prompt for the name
of the list file, which is read into an array, and a loop that retrieves
one sequence per list entry (stopping when the list ends) and writes that
sequence to a fasta file. I only get one sequence back, although it seems
to perform the retrieval for each of the sequences on the list...


#!/usr/bin/perl -w
use strict;
use Bio::DB::GenPept;
use Bio::DB::GenBank;
use Bio::SeqIO;
print "Enter your list name:";
my $archivo=<STDIN>;
chomp $archivo;
die ("Can't open input\n") unless (open(INFILE, $archivo));
my @lista = <INFILE>;
foreach my $seq (@lista) {
    if ($seq eq '') {
        die ("empty list")
    }
    else {
        my $db = new Bio::DB::GenPept("-format" => "Fasta");
        my $seqobj = $db->get_Seq_by_acc($seq);
        my $out = new Bio::SeqIO (-file => ">extracted_seqs.fasta",
                                  -format => 'fasta');
        $out->write_seq($seqobj);
    }
}
exit;
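
A likely cause of the overwriting, offered as an assumption and not a
confirmed diagnosis: the script above opens extracted_seqs.fasta with ">"
inside the loop, and ">" truncates the file each time it is opened. The
sketch below reproduces that effect with plain Perl file handles and no
BioPerl. The scratch file name and the placeholder records are hypothetical.

```perl
use strict;
use warnings;

my $file = 'demo_overwrite.txt';       # hypothetical scratch file
my @ids  = ('seq1', 'seq2', 'seq3');   # placeholder records, not real accessions

# Pattern from the script above: reopen with ">" on every pass.
# Each open truncates the file, so only the last record survives.
foreach my $id (@ids) {
    open(my $fh, '>', $file) or die "Can't open $file: $!\n";
    print {$fh} "$id\n";
    close $fh;
}
open(my $in, '<', $file) or die "Can't open $file: $!\n";
my @reopened = <$in>;
close $in;

# Alternative: open once before the loop, write every record, then close.
open(my $fh, '>', $file) or die "Can't open $file: $!\n";
print {$fh} "$_\n" for @ids;
close $fh;
open($in, '<', $file) or die "Can't open $file: $!\n";
my @opened_once = <$in>;
close $in;
unlink $file;

print scalar(@reopened), " line(s) when reopened each time\n";   # 1 line(s)
print scalar(@opened_once), " line(s) when opened once\n";       # 3 line(s)
```

If the same thing is happening with Bio::SeqIO, creating the writer once
before the loop (or opening it in append mode with ">>") should keep every
sequence.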


An example list of sequences can be this one:

YP_003107578.1
YP_003106103.1
YP_003106552.1
YP_003106560.1
YP_003107053.1
YP_003107450.1
YP_003108000.1
YP_003105023.1
YP_003105264.1
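
For a list like the one above, a minimal sketch of one possible fix
follows. It is an untested assumption, not a confirmed solution: it
creates the Bio::SeqIO writer once, before the loop, and chomps each
accession before the lookup. Taking the list name from the command line
and skipping failed lookups with a warning are illustrative choices that
are not in the original script. Running it needs BioPerl and network
access to NCBI.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::DB::GenPept;
use Bio::SeqIO;

# The list file is taken from the command line here (an illustrative choice).
my $list_file = shift @ARGV or die "Usage: $0 accession_list.txt\n";
my $out_file  = 'extracted_seqs.fasta';

open(my $in, '<', $list_file) or die "Can't open $list_file: $!\n";

# Create the database handle and the writer ONCE, before the loop,
# so the output file is truncated only a single time.
my $db  = Bio::DB::GenPept->new();
my $out = Bio::SeqIO->new(-file => ">$out_file", -format => 'fasta');

while (my $acc = <$in>) {
    chomp $acc;                 # drop the trailing newline from the list entry
    $acc =~ s/^\s+|\s+$//g;     # drop stray whitespace
    next if $acc eq '';         # skip blank lines instead of dying

    # get_Seq_by_acc is the call used in the original script; eval catches
    # an exception if the lookup fails.
    my $seqobj = eval { $db->get_Seq_by_acc($acc) };
    unless ($seqobj) {
        warn "Could not retrieve $acc, skipping\n";
        next;
    }
    $out->write_seq($seqobj);   # appends to the already-open output stream
}
close $in;
```

Creating $db outside the loop also avoids building a new database object
for every accession.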

Thanks in advance for your help ;)

-- 
José Luis Lavín Trueba, PhD

Dpto. de Producción Agraria
Grupo de Genética y Microbiología
Universidad Pública de Navarra
31006 Pamplona
Navarra
SPAIN




