[Bioperl-l] Extracting gi no from refseq record
Siddhartha Basu
basu at pharm.sunysb.edu
Thu Apr 3 18:13:17 EST 2003
Hi hilmar,
Hilmar Lapp wrote:
> RefSeq does feature this line, at least the last time I checked. We also
> do test for this being parsed correctly, namely in t/SeqIO.t. It's not
> always as bad as it sometimes seems. I ran an entire download of RefSeq
> a couple weeks ago and the GI# was parsed out of every single record.
>
> Siddharta, do you run at least bioperl 1.2?
Yes i am running bioperl 1.2
If you do, can you run
>
> $ make test_SeqIO
>
> from the root directory of the bioperl distribution? Can you please mail
> the output?
No tests failed it seems. Here is the output.
==========================================================================
PERL_DL_NONLAZY=1 /usr/bin/perl -Iblib/arch -Iblib/lib
-I/usr/lib/perl5/5.8.0/i386-linux-thread-multi -I/usr/lib/perl5/5.8.0 -e
'use Test::Harness qw(&runtests $verbose); $verbose=0; runtests @ARGV;'
t/SeqIO.t
t/SeqIO....ok
3/146 skipped:
All tests successful, 3 subtests skipped.
Files=1, Tests=146, 1 wallclock secs ( 0.88 cusr + 0.03 csys = 0.91 CPU)
===========================================================================
If a test fails, could you please run
>
> $ make test_SeqIO TEST_VERBOSE=1
>
> and mail the output for the failed test(s)?
>
> If no test fails, could you email the accession# of the sequence you
> retrieved and for which there was no GI#?
The accession no i have tried is NP_031416.
This is a mouse refseq NP no. which i have choosen randomly from mouse
refseq flat files. I have downloaded the refseq flat files that ends
with gbff and gnp and then used the following code to index all of them.
=============================================================================
#!/usr/bin/perl -w
use strict;
use Bio::Index::GenBank;
my $IdFile = "$ENV{BIOPERL_INDEX}/genebankall";
my $Idx = Bio::Index::GenBank->new( -filename => $IdFile, -write_flag=>
'WRITE');
my $Folder = "/home/basu/scriptanalysis/Genepept";
opendir(FL,$Folder) || die "Can't open folder:$!";
my @Files = grep (!/^\.\.?$/,readdir(FL));
my @FullPath = ();
foreach (@Files) { push (@FullPath, "$Folder/$_"); }
$Idx->make_index(@FullPath);
print "Indexed successfully\n";
closedir(FL);
exit;
=================================================================================
Then used a simple script for fetching ......
==================================================================================
#!/usr/bin/perl -w
#
use Bio::Index::GenBank;
use strict;
my $GenIndName = "$ENV{BIOPERL_INDEX}/genebankall";
my $Idx = Bio::Index::GenBank->new(-filename => $GenIndName);
my $Seq = $Idx->get_Seq_by_acc('NP_031416');
print $Seq->primary_id(),"\n";
exit;
===================================================================================
The output is Chrna7. My wish is to get the gi no that is 6671503.
So how do i get that value.
-siddhartha
>
> -hilmar
>
More information about the Bioperl-l
mailing list