[Bioperl-l] HOWTO Beginners: formatdb program/indexing a database sequence file?

Ryan Golhar golharam at umdnj.edu
Mon Oct 17 21:34:34 EDT 2005


NCBI is pretty good at answering questions.  Try emailing NCBI.  I
forget their email address, but they give it on the web page.

Also, I've built an RPM for the NCBI toolkit available at
http://serine.umdnj.edu/~golharam/biorpms that installs the toolkit in
/usr/local/ncbi.  It places a shell script in /etc/profile.d to
automatically set necessary environment variables.  You can download the
RPM and install it, then put your database files in /usr/local/ncbi/db.
That should be it.
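
For reference, the formatdb and blastall steps discussed in this thread can be sketched as shell commands.  This is a minimal sketch, not the contents of the RPM: the helper names formatdb_command and blastall_command are hypothetical, the db directory defaults to the /usr/local/ncbi/db path mentioned above unless BLASTDB is already set, and the commands are only printed so they can be checked before being run by hand.  The flags are the legacy NCBI ones: -i names the input FASTA file, -p F marks it as nucleotide, and -o T asks formatdb to parse sequence IDs.

```shell
#!/bin/sh
# Sketch only: prints the legacy NCBI BLAST commands instead of running them.
# DB_DIR and the helper names are illustrative assumptions.
DB_DIR="${BLASTDB:-/usr/local/ncbi/db}"

# Print the formatdb command that indexes a nucleotide FASTA file.
#   -i  input FASTA file
#   -p  F = nucleotide (T = protein)
#   -o  T = parse sequence IDs and build the ID indexes
formatdb_command() {
    printf 'formatdb -i %s -p F -o T\n' "$1"
}

# Print a blastn search against that database, using the same
# flags as the blastall call quoted later in this thread.
blastall_command() {
    printf 'blastall -p blastn -d %s -i %s -o %s\n' "$1" "$2" "$3"
}

formatdb_command "$DB_DIR/ecoli.nt"
blastall_command "$DB_DIR/ecoli.nt" test.txt test.out
```

Paths containing spaces would need quoting before the printed commands are pasted into a shell.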

Ryan


-----Original Message-----
From: bioperl-l-bounces at portal.open-bio.org
[mailto:bioperl-l-bounces at portal.open-bio.org] On Behalf Of Barry Moore
Sent: Monday, October 17, 2005 7:01 PM
To: Sam Al-Droubi; bioperl-l at portal.open-bio.org
Subject: RE: [Bioperl-l] HOWTO Beginners: formatdb program/indexing
a database sequence file?


Sam-

I appreciate your plight.  I've been there myself many times.  I wish I
could offer more help, but I don't really know what the answer to your
problem is.  The reason I am trying to keep these questions on the
bioperl list and on the bioperl topic is that those of us answering
questions on this list have to balance our time between answering
questions and getting our own work done.  I actually tried to recreate
your error message on my system, and the only way I could do it was to
remove the read permissions on the file.  I can see from your directory
listing that that isn't your problem.  Here are some things to try:

- Try specifying the full path to your ecoli.nt in your -d flag.
- Set the environment variable BLASTDB to point to your db directory.
- If you have root access, change the ownership of those files to you.
  If you don't have root access, talk to your SA.
- Write to the NCBI help desk (they're sometimes slow, but they should
  answer).
- Try some of the bioinformatics forums at bioinformatics.org.  They are
  more general and might be able to help.
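
The first few suggestions above can be checked from the shell.  The following is a minimal sketch under stated assumptions: the function name check_blast_db is made up for illustration and is not part of BLAST, and the extensions it looks for (.nhr, .nin, .nsq) are the nucleotide index files that appear in the directory listing quoted below.  It reports which index files are missing or unreadable for the current user.

```shell
#!/bin/sh
# Sketch: verify that the formatdb index files for a nucleotide database
# exist and are readable by the current user.
# check_blast_db is a hypothetical helper, not part of BLAST.
check_blast_db() {
    db_path="$1"      # full path to the database, e.g. /path/to/db/ecoli.nt
    status=0
    for ext in nhr nin nsq; do
        file="$db_path.$ext"
        if [ ! -e "$file" ]; then
            echo "missing: $file"
            status=1
        elif [ ! -r "$file" ]; then
            echo "not readable: $file"
            status=1
        fi
    done
    return "$status"
}

# Example: point BLAST at the db directory, then check the database there.
# The directory is the one from the listing in this thread.
BLASTDB="/usr/src/blast/blast-2.2.12/data"
export BLASTDB
check_blast_db "$BLASTDB/ecoli.nt" || echo "fix the problems above before running blastall"
```

Note that the readability test is always true when run as root, so it is best run as the user who will actually call blastall.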

Barry

> -----Original Message-----
> From: Sam Al-Droubi [mailto:saldroubi at yahoo.com]
> Sent: Monday, October 17, 2005 4:00 PM
> To: Barry Moore
> Subject: RE: [Bioperl-l] HOWTO Beginners: formatdb program/indexing a
> database sequence file?
> 
> 
> Barry,
> 
> I did check the file and it is there and the
> permissions are read for all and I even made them
> owned by oracle (the unix user).  The reason I
> installed blast is to use it via bioperl.  I don't
> know who else I can turn to for help.  I tried google
> but it didn't come up with anything useful.  So I
> really appreciate any help.
> 
> 
> The directory list is below:
>
> m1:/usr/src/blast/blast-2.2.12/data # ls -l
> total 6707
> drwxr-xr-x  2   3755  5333    1096 Oct 17 22:57 .
> drwxr-xr-x  5   3755  5333     144 Aug 28 10:38 ..
> -rw-r--r--  1 oracle users    2122 Aug 28 10:38 BLOSUM45
> -rw-r--r--  1 oracle users    2122 Aug 28 10:38 BLOSUM62
> -rw-r--r--  1 oracle users    2124 Aug 28 10:38 BLOSUM80
> -rw-r--r--  1 oracle users    1521 Aug 28 10:38 KSat.flt
> -rw-r--r--  1 oracle users     736 Aug 28 10:38 KSchoth.flt
> -rw-r--r--  1 oracle users    1521 Aug 28 10:38 KSgc.flt
> -rw-r--r--  1 oracle users     852 Aug 28 10:38 KShopp.flt
> -rw-r--r--  1 oracle users     940 Aug 28 10:38 KSkyte.flt
> -rw-r--r--  1 oracle users    1320 Aug 28 10:38 KSpcc.mat
> -rw-r--r--  1 oracle users    1521 Aug 28 10:38 KSpur.flt
> -rw-r--r--  1 oracle users    1521 Aug 28 10:38 KSpyr.flt
> -rw-r--r--  1 oracle users    2666 Aug 28 10:38 PAM30
> -rw-r--r--  1 oracle users    2666 Aug 28 10:38 PAM70
> -rw-r--r--  1 oracle users   72720 Aug 28 10:38 asn2ff.prt
> -rw-r--r--  1 oracle users  148763 Aug 28 10:38 bstdt.val
> -rw-r--r--  1 oracle users 4763013 Oct 17 22:55 ecoli.nt
> -rw-r--r--  1 oracle users   60292 Oct 17 22:57 ecoli.nt.nhr
> -rw-r--r--  1 oracle users    4876 Oct 17 22:57 ecoli.nt.nin
> -rw-r--r--  1 oracle users    3200 Oct 17 22:57 ecoli.nt.nnd
> -rw-r--r--  1 oracle users      60 Oct 17 22:57 ecoli.nt.nni
> -rw-r--r--  1 oracle users   58320 Oct 17 22:57 ecoli.nt.nsd
> -rw-r--r--  1 oracle users    1264 Oct 17 22:57 ecoli.nt.nsi
> -rw-r--r--  1 oracle users 1165813 Oct 17 22:57 ecoli.nt.nsq
> -rw-r--r--  1 oracle users    7708 Aug 28 10:38 featdef.val
> -rw-r--r--  1 oracle users     173 Oct 17 22:57 formatdb.log
> -rw-r--r--  1 oracle users    3297 Aug 28 10:38 gc.val
> -rw-r--r--  1 oracle users   50078 Aug 28 10:38 humrep.fsa
> -rw-r--r--  1 oracle users  109993 Aug 28 10:38 lineages.txt
> -rw-r--r--  1 oracle users   64663 Aug 28 10:38 makerpt.prt
> -rw-r--r--  1 oracle users   49810 Aug 28 10:38 objprt.prt
> -rw-r--r--  1 oracle users    1112 Aug 28 10:38 pubkey.enc
> -rw-r--r--  1 oracle users    6034 Aug 28 10:38 seqcode.val
> -rw-r--r--  1 oracle users  154962 Aug 28 10:38 sequin.hlp
> -rw-r--r--  1 oracle users     480 Aug 28 10:38 sgmlbb.ent
> -rw-r--r--  1 oracle users   33461 Aug 28 10:38 taxlist.txt
> 
> 
> 
> --- Barry Moore <bmoore at genetics.utah.edu> wrote:
> 
> > Sam-
> >
> > We're getting a bit off topic here for the bioperl list.  Have you
> > checked that ecoli.nt.nin is there and has the correct permissions?
> >
> > Barry
> >
> > > -----Original Message-----
> > > From: Sam Al-Droubi [mailto:saldroubi at yahoo.com]
> > > Sent: Monday, October 17, 2005 3:38 PM
> > > To: Barry Moore
> > > Subject: RE: [Bioperl-l] HOWTO Beginners: formatdb program/indexing a
> > > database sequence file?
> > >
> > > Hi Barry,
> > >
> > > Thank you very very much for this info.  I downloaded
> > > blast and installed it following the directions.
> > >
> > > I am trying to run the ecoli test as suggested but it
> > > does not seem that blastall knows where the data directory is even
> > > though it is specified in the .ncbirc file as follows:
> > >
> > > [NCBI]
> > > Data="/usr/src/blast/blast-2.2.12/data/"
> > >
> > > I am getting this error:
> > >
> > > blastall -p blastn -d ecoli.nt -i test.txt -o test.out
> > > [blastall] WARNING: Test: Unable to open
> > >
> > > Any idea what could be wrong?
> > >
> > > Thank you again.
> > >
> > > --- Barry Moore <bmoore at genetics.utah.edu> wrote:
> > >
> > > > Sam-
> > > >
> > > > You can run blast on your own computer.  formatdb is a program that
> > > > comes with the blast distribution that formats sequences in a
> > > > fasta file into a blast database for use in blast searches.  If
> > > > you have installed BLAST locally, then you have formatdb.
> > > > Documentation for formatdb is found with man formatdb on a linux
> > > > system, or here for an online source:
> > > > ftp://ftp.ncbi.nih.gov/blast/documents/
> > > >
> > > > The db.fa file is any fasta file that you want to create a blast
> > > > database out of with formatdb.
> > > >
> > > > A couple of your questions suggest that you would greatly benefit
> > > > from exploring the UCSC genome browser at http://genome.ucsc.edu/.
> > > > In this case you have a gene sequence that you want to align to a
> > > > genome.  BLAST will get you started, but you're better off with
> > > > BLAT for that job.  If you're doing this for one of the many genomes
> > > > covered in the UCSC genome browser, then they've probably already
> > > > done this for you.  If you're doing it for something new or for a
> > > > lot of genes then you can run BLAT locally, and Bioperl can help
> > > > you with that:
> > > >
> > > > Bio::Tools::Run::Alignment::Blat.html
> > > > Bio::Tools::Blat.html
> > > >
> > > > Barry
> > > >
> > > > -----Original Message-----
> > > > From: bioperl-l-bounces at portal.open-bio.org
> > > > [mailto:bioperl-l-bounces at portal.open-bio.org] On Behalf Of Sam
> > > > Al-Droubi
> > > > Sent: Monday, October 17, 2005 1:41 PM
> > > > To: bioperl-l at portal.open-bio.org
> > > > Subject: [Bioperl-l] HOWTO Beginners: formatdb program/indexing
> > > > a database sequence file?
> > > >
> > > > All,
> > > >
> > > > Under the BLAST section, the Beginners HOWTO
> > > > (http://bioperl.org/HOWTOs/html/Beginners.html) says
> > > >
> > > > "The example code assumes that you used the formatdb
> > > > program to index the database sequence file "db.fa"."
> > > >
> > > > Could someone tell me how to go about creating the
> > > > db.fa file?  I looked on the website but there was no
> > > > reference to a formatdb program.
> > > >
> > > > My goal is:  I have a gene sequence and I want to
> > > > align it against the chromosome sequence; both are in fasta
> > > > format.  Can I do this with the blast program on my own computer?
> > > >
> > > >
> > > > Thank you in advance.
> > > >
> > > >
> > > >
> > > > Sincerely,
> > > > Sam Al-Droubi, M.S.
> > > > saldroubi at yahoo.com 
> > > >
> > >
> > >
> > > Sincerely,
> > > Sam Al-Droubi, M.S.
> > > saldroubi at yahoo.com
> >
> 
> 
> Sincerely,
> Sam Al-Droubi, M.S.
> saldroubi at yahoo.com

_______________________________________________
Bioperl-l mailing list
Bioperl-l at portal.open-bio.org
http://portal.open-bio.org/mailman/listinfo/bioperl-l


