[Bioperl-l] Bio::DB::GenPept server error
Ewan Birney
birney at ebi.ac.uk
Mon Feb 3 07:53:18 EST 2003
On Mon, 3 Feb 2003, Neil Saunders wrote:
> Dear all,
>
> Still having problems with Bio::DB::GenPept, get_Stream_by_id().
>
> I have a test file, containing 3 UIDs separated by commas. If I read in
> this file and assign it to an array:
>
> open IN,'test.file';
> @array=<IN>;
>
> then my code works fine and retrieves what I want using \@array.
>
> Now I move to my real file, which contains about 112 000 UIDs. Same
> procedure and I get:
>
> MSG: WebDBSeqI Request Error:
> 500 (Internal Server Error) short write
>
> Is this because the server doesn't like such a large file, or some other
> problem? Should I even be using this module to retrieve 112 000
> records? I would get them using fastacmd from a local nr database, but
> the required -i option seems to be broken (gives duplicate records).
>
Getting 112 000 records over the web is going to
(a) take a while
(b) be horribly inefficient
(c) do nasty things to the webserver
The right thing to do here is to download the relevant section of
EMBL/GenBank, reformat it to a Fasta file if you only want the sequence
and want to save space, and then index it with Bio::Index::Fasta or
Bio::Index::GenBank, or whichever index module matches the format you
have decided on.
Then you will be able to pull sequences out to your heart's content. Spare
a thought for the NCBI web servers - in no way should they try to honour a
request to pull out 100,000 sequences....
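
The index-then-fetch route could look roughly like the sketch below. It
assumes the sequences are already in a local Fasta file and that the default
Bio::Index::Fasta key (the first word after '>') matches the IDs in your
list. The sub names (read_ids, build_index, fetch_ids) and the command-line
layout are made up for illustration and are not part of BioPerl.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Spec;
use Bio::Index::Fasta;
use Bio::SeqIO;

# Read IDs from a file; accepts commas and/or whitespace as separators,
# so "1,2,3" on one line yields three IDs rather than one string.
sub read_ids {
    my ($id_file) = @_;
    open my $in, '<', $id_file or die "Cannot open $id_file: $!";
    my @ids = grep { length } map { split /[\s,]+/ } <$in>;
    close $in;
    return @ids;
}

# Build an index over one or more local Fasta files (done once).
sub build_index {
    my ($index_file, @fasta_files) = @_;
    my $inx = Bio::Index::Fasta->new(
        -filename   => $index_file,
        -write_flag => 1,
    );
    # Absolute paths, so the index still works from another directory.
    $inx->make_index(map { File::Spec->rel2abs($_) } @fasta_files);
    return;
}

# Fetch each ID from the index, write hits as Fasta, return the misses.
sub fetch_ids {
    my ($index_file, $ids, $out_file) = @_;
    my $inx = Bio::Index::Fasta->new(-filename => $index_file);
    my $out = Bio::SeqIO->new(-file => ">$out_file", -format => 'fasta');
    my @missing;
    for my $id (@$ids) {
        # eval guards against a fetch that throws instead of returning undef
        my $seq = eval { $inx->fetch($id) };
        if ($seq) { $out->write_seq($seq) }
        else      { push @missing, $id }
    }
    $out->close;
    return \@missing;
}

# Hypothetical usage: script.pl seqs.fa seqs.idx ids.txt out.fa
unless (caller) {
    my ($fasta, $index, $id_file, $out_file) = @ARGV;
    die "usage: $0 seqs.fa seqs.idx ids.txt out.fa\n" unless defined $out_file;
    build_index($index, $fasta) unless -e $index;
    my $missing = fetch_ids($index, [ read_ids($id_file) ], $out_file);
    warn scalar(@$missing), " IDs not found in index\n" if @$missing;
}
```

If the Fasta headers are NCBI-style (gi|...|ref|...), the default key will
be that whole first word, so a custom parser set through
$inx->id_parser(\&some_sub) before make_index is probably needed to index
by the bare UID.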
> thanks for any pointers,
> Neil
> --
> School of Biotechnology and Biomolecular Sciences,
> The University of New South Wales,
> Sydney 2052,
> Australia
>
> http://psychro.bioinformatics.unsw.edu.au/neil/index.php
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at bioperl.org
> http://bioperl.org/mailman/listinfo/bioperl-l
>