[Bioperl-l] Re: load_gff.pl question
Scott Cain
cain at cshl.org
Wed Aug 6 15:21:17 EDT 2003
Shin,
The problem you are running into is not really with load_gff.pl, but
with the database schema. Assuming you are using MySQL, the table
create statement for fdata looks like this:
create table fdata (
    fid           int not null auto_increment,
    fref          varchar(100) not null,
    fstart        int unsigned not null,
    fstop         int unsigned not null,
    fbin          double(20,6) not null,
    ftypeid       int not null,
    fscore        float,
    fstrand       enum('+','-'),
    fphase        enum('0','1','2'),
    gid           int not null,
    ftarget_start int unsigned,
    ftarget_stop  int unsigned,
    primary key(fid),
    unique index(fref,fbin,fstart,fstop,ftypeid,gid),
    index(ftypeid),
    index(gid)
)
The problem is the unique index on
(fref,fbin,fstart,fstop,ftypeid,gid). It conflicts with your data
because the similar lines are being assigned the same gid (group id),
since they look like the same thing; with the same reference, position
and type as well, they collide on every column of the index. The quick
fix is to remove the 'unique' from the index declaration, which can be
found in Bio/DB/GFF/Adaptor/dbi/mysql.pm, and then run load_gff.pl as
usual. The longer fix is to look at your data, figure out why the lines
are all getting assigned the same group id, and make them sufficiently
different so that they don't.
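For the longer route, a quick scan of the GFF file can show which lines
would collide. The following is only a rough sketch, not part of
load_gff.pl or Bio::DB::GFF, and it rests on several assumptions: the
file is 9-column, tab-delimited GFF; ftypeid is derived from the
source and method columns; gid is derived from the group in column 9;
and the first ';'-separated entry of column 9 is a good enough stand-in
for that group. fbin is left out because, as far as I know, it is
computed from fstart and fstop. The function names are made up for
illustration.

```python
from collections import defaultdict


def find_index_collisions(lines):
    """Return {key: [line numbers]} for GFF lines that share one key.

    The key approximates the unique index on fdata:
    (ref, source, method, start, stop, group).
    ASSUMPTION: source+method stand in for ftypeid and the first
    attribute of column 9 stands in for gid.
    """
    seen = defaultdict(list)
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and comment/directive lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 9:
            continue  # not a full feature line; ignore it in this sketch
        ref, source, method, start, stop = cols[0:5]
        # Use only the first attribute of column 9 as the group.
        group = cols[8].split(";")[0].strip()
        seen[(ref, source, method, start, stop, group)].append(lineno)
    # Keep only the keys that occur on more than one line.
    return {key: nums for key, nums in seen.items() if len(nums) > 1}


def report(path):
    """Print each colliding key with the line numbers that share it."""
    with open(path) as handle:
        for key, nums in find_index_collisions(handle).items():
            print("\t".join(key), "lines:", ", ".join(map(str, nums)))
```

Calling report() on the path of your GFF file lists each set of
colliding lines. If the lines it reports really are distinct features,
giving each one its own group name in column 9 should keep them from
sharing a gid.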
Hope that helps,
Scott
On Wed, 2003-08-06 at 13:31, bioperl-l-request at portal.open-bio.org
wrote:
> Where do I start to customize this script to allow loading of large
> number of similar entities?
--
------------------------------------------------------------------------
Scott Cain, Ph. D. cain at cshl.org
GMOD Coordinator (http://www.gmod.org/) 216-392-3087
Cold Spring Harbor Laboratory