[Bioperl-l] NCBI's seq_gene.md file

Tue Mar 7 04:24:01 UTC 2006

Hello, Ryan,

I wrote a script to load the seq_gene.md file into MySQL a couple of years ago. Hopefully it still works.

Wenwu Cui, PhD
NCI/NIH

#!/usr/bin/perl

use strict;
use warnings;

# Make connection with MySQL database

use DBI;

my $database = 'hsgenome';
my $server = 'localhost';           #your server IP
my $user = 'root';                  #your username
my $passwd = 'mysql';               #your password

my $hsgenome = DBI->connect("dbi:mysql:$database:$server", $user, $passwd)
    or die "Could not connect to $database: $DBI::errstr";

# SQL statement that creates the genedb table (task1)

my $task1 =qq/ create table genedb  
    (
     taxid int(5),   
     chromosome  char(3),     
     chrStart  int(10),       
     chrEnd  int(10),
     orientation  char(2),   
     contig char(15),
     cnt_start int,
     cnt_stop  int,
     cnt_orient  char(2),
     featureName char(10),
     featureId   char(15),
     featureType char(10),    
     groupLabel  char(10),    
     transcript  char(10),    
     weight      char(2) 
     )
    /;

$hsgenome->do($task1) or die "Could not create table genedb: " . $hsgenome->errstr;

# Load the seq_gene.md data into genedb

# Change the path below to the location of your seq_gene.md file
$hsgenome->do("LOAD DATA local infile '/root/seq_gene.md' into table genedb ignore 1 lines")
    or die "Could not load seq_gene.md: " . $hsgenome->errstr;

# Break connection with MySQL database

$hsgenome->disconnect;

exit;
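
If you would rather skip MySQL, here is a minimal sketch of parsing one line of the file directly in Perl. It assumes the file is tab-separated and that its columns come in the same order as the genedb table above, which is what the LOAD DATA step relies on. The sub name parse_seq_gene_line and the example line are made up for illustration; check the header line of your own seq_gene.md before trusting the column order.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Column names copied from the genedb table definition above.
# Assumption: seq_gene.md has the same tab-separated columns in this order.
my @columns = qw(
    taxid chromosome chrStart chrEnd orientation
    contig cnt_start cnt_stop cnt_orient
    featureName featureId featureType groupLabel transcript weight
);

# Turn one tab-separated line into a hash reference keyed by column name.
sub parse_seq_gene_line {
    my ($line) = @_;
    chomp $line;
    my @fields = split /\t/, $line, -1;    # -1 keeps trailing empty fields
    my %record;
    @record{@columns} = @fields;           # hash slice: pair names with values
    return \%record;
}

# Made-up example line, for illustration only (not real NCBI data).
my $example = join "\t", 9606, 1, 1000, 2000, '+', 'NT_000001.1',
    10, 1010, '+', 'GENE1', 'GeneID:1', 'GENE', 'reference', '-', 1;

my $rec = parse_seq_gene_line($example);
print "$rec->{featureName} on chr$rec->{chromosome}: ",
      "$rec->{chrStart}-$rec->{chrEnd}\n";
```

To process a whole file, open it, skip the first (header) line as the LOAD DATA statement does, and call the sub on each remaining line.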

________________________________

From: Ryan Golhar [mailto:golharam at umdnj.edu]
Sent: Mon 3/6/2006 9:58 AM
To: 'bioperl-l'
Subject: [Bioperl-l] NCBI's seq_gene.md file

NCBI has a file for every organism called seq_gene.md.  It
contains a list of all the gene names, chromosome locations, exons,
introns, proteins, strands, contigs, etc.

I can parse this easily, but I was wondering whether there is a BioPerl
module for this.

Ryan

_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l