[Bioperl-l] Pulling down data from NCBI

Abhishek Pratap abhishek.vit at gmail.com
Mon Feb 1 21:33:17 UTC 2010


Thanks Chris.

I was looking at the same thing in the cookbook moments ago. Thanks!
-A

On Mon, Feb 1, 2010 at 4:31 PM, Chris Fields <cjfields at illinois.edu> wrote:
> Abhi,
>
> The accession in question is for a record containing a set of sequences, not just one sequence (it's a contig record).  The NCBI web interface performs an esearch on this to get the 34K seqs; the equivalent with EUtilities is:
>
> ================================
> use 5.010;   # enables 'say'
> use Bio::DB::EUtilities;
>
> my $id = 'AAPP01000000[ACCN]';
>
> my $factory = Bio::DB::EUtilities->new  (
>       -eutil =>  'esearch',
>       -db    =>  'nucleotide',
>       -term  =>  $id,
>       -usehistory => 'y');
>
> say $factory->get_count;
>
> # do more here...
>
> ================================
>
> The 'do more here' part is covered in the cookbook, and will require you to retrieve the seqs in chunks.
>
> chris
>
> On Feb 1, 2010, at 2:45 PM, Abhishek Pratap wrote:
>
>> Thank you guys for the very quick responses.  My bad, I trusted my fingers.
>>
>> Now that this is working, the output I am getting is not what I
>> want. I am sure I am missing the correct way of doing it. If I
>> search the Nucleotide db at NCBI for the accession number
>> "AAPP01000000", I see some 34K records. What I need to do is pull down
>> those sequences as FASTA files.
>>
>> I am referring to
>> http://www.bioperl.org/wiki/HOWTO:EUtilities_Cookbook but didn't quite
>> find a similar example.
>>
>> Thanks!
>> -Abhi
>>
>> On Mon, Feb 1, 2010 at 3:39 PM, Kevin Brown <Kevin.M.Brown at asu.edu> wrote:
>>> Looks like you've misspelled one of the parameters. It should be
>>> 'efetch' not 'efecth'
>>>
>>> Kevin Brown
>>> Center for Innovations in Medicine
>>> Biodesign Institute
>>> Arizona State University
>>>
>>>> -----Original Message-----
>>>> From: bioperl-l-bounces at lists.open-bio.org
>>>> [mailto:bioperl-l-bounces at lists.open-bio.org] On Behalf Of
>>>> Abhishek Pratap
>>>> Sent: Monday, February 01, 2010 1:36 PM
>>>> To: bioperl-l at lists.open-bio.org
>>>> Subject: [Bioperl-l] Pulling down data from NCBI
>>>>
>>>> Hi All
>>>>
>>>> I am looking to batch download some 34K nucleotide sequences
>>>> corresponding to an NCBI accession number. I tried the following and
>>>> am getting an error. Does it have anything to do with the recent update
>>>> to the code that Chris was discussing?
>>>>
>>>>
>>>>
>>>> my $factory = Bio::DB::EUtilities->new  (
>>>>                                       -eutil  => 'efecth',
>>>>                                       -db     =>      'nucleotide',
>>>>                                       -rettype =>     'fasta',
>>>>                                       -id             => $id
>>>>                               );
>>>>
>>>>
>>>> ----------- EXCEPTION: Bio::Root::Exception -------------
>>>> MSG: efecth not supported
>>>> STACK: Error::throw
>>>> STACK: Bio::Root::Root::throw
>>>> /usr/lib/perl5/vendor_perl/5.8.8/Bio/Root/Root.pm:357
>>>> STACK: Bio::Tools::EUtilities::EUtilParameters::eutil
>>>> /usr/lib/perl5/vendor_perl/5.8.8/Bio/Tools/EUtilities/EUtilParameters.pm:452
>>>> STACK: Bio::Root::RootI::_set_from_args
>>>> /usr/lib/perl5/vendor_perl/5.8.8/Bio/Root/RootI.pm:546
>>>> STACK: Bio::Tools::EUtilities::EUtilParameters::new
>>>> /usr/lib/perl5/vendor_perl/5.8.8/Bio/Tools/EUtilities/EUtilParameters.pm:193
>>>> STACK: Bio::DB::EUtilities::new
>>>> /usr/lib/perl5/vendor_perl/5.8.8/Bio/DB/EUtilities.pm:74
>>>> STACK: ./getDatafromNCBI.pl:9
>>>>
>>>>
>>>> -Abhi
>>>> _______________________________________________
>>>> Bioperl-l mailing list
>>>> Bioperl-l at lists.open-bio.org
>>>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>>>>
>>>
>>
>
>
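
For anyone landing on this thread later: the chunked retrieval that Chris's reply leaves as 'do more here' might look roughly like the sketch below. It follows the esearch-with-history then efetch pattern from the EUtilities cookbook, but it is only a sketch under assumptions, not the cookbook's exact code. The helper names chunk_starts and fetch_fasta, the output file seqs.fa, the chunk size of 500, and taking the email address from the command line are all made up for illustration. The BioPerl calls used (get_count, next_History, set_parameters, get_Response with a -cb callback) are the ones the cookbook relies on. BioPerl is loaded lazily so the offset arithmetic can be tried without it.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: the retstart offsets needed to cover $count
# records when fetching $retmax records per request.
sub chunk_starts {
    my ($count, $retmax) = @_;
    die "retmax must be positive\n" unless $retmax > 0;
    my @starts;
    for (my $s = 0; $s < $count; $s += $retmax) {
        push @starts, $s;
    }
    return @starts;
}

# Hypothetical helper: esearch with history, then efetch FASTA in
# chunks into $outfile. Needs BioPerl and network access.
sub fetch_fasta {
    my ($term, $email, $outfile, $retmax) = @_;
    require Bio::DB::EUtilities;

    my $factory = Bio::DB::EUtilities->new(
        -eutil      => 'esearch',
        -db         => 'nucleotide',
        -term       => $term,
        -email      => $email,
        -usehistory => 'y',
    );
    my $count = $factory->get_count;
    my $hist  = $factory->next_History or die "No history data returned\n";

    # Switch the same factory to efetch; -db carries over from esearch.
    $factory->set_parameters(
        -eutil   => 'efetch',
        -rettype => 'fasta',
        -history => $hist,
    );

    open my $out, '>', $outfile or die "Can't open $outfile: $!\n";
    for my $retstart (chunk_starts($count, $retmax)) {
        $factory->set_parameters(-retmax => $retmax, -retstart => $retstart);
        # Write each piece of the response to the file as it arrives.
        $factory->get_Response(-cb => sub { my ($data) = @_; print {$out} $data });
        print STDERR "Retrieved records starting at $retstart\n";
    }
    close $out or die "Can't close $outfile: $!\n";
    return $count;
}

# Only contact NCBI when an email address is given on the command line.
if (@ARGV) {
    my $email = shift @ARGV;
    fetch_fasta('AAPP01000000[ACCN]', $email, 'seqs.fa', 500);
}
```

Run it with your email address as the only argument. The sketch has no retry logic, so a server error in any chunk stops the run; the cookbook version wraps the request in an eval and retries a few times, which is worth adding for a set this large.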



