[Bioperl-l] FileCache.pm error

Marcelino Suzuki suzuki at cbl.umces.edu
Mon Jun 21 11:33:24 EDT 2004


	Thanks Jason.  That worked.

	I have another question. The script works well, but I was wondering
whether I can get the same CDS sequences in GenBank format. Using sed
and awk, I was able to create an HTML file from a BLAST search
containing links to all 400 such sequences from the proteins I am
working with, e.g.:

	http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?val=34112904&itemID=36&view=gbwithparts
	
	I could get each sequence individually using the browser, but is
there a way to batch those requests using BioPerl?
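
	For illustration, a batch request of this kind might be sketched as
follows. This is only a sketch under stated assumptions, not a tested
recipe: it assumes the numbers after "val=" in the links are nucleotide
GIs that Bio::DB::GenBank can resolve, the helper names extract_gis and
fetch_genbank are made up for this example, and the batch size of 50 is
an arbitrary choice. get_Stream_by_id and Bio::SeqIO are existing BioPerl
calls; the records fetched are the full GenBank entries behind the links.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: collect the number from every "val=<digits>"
# parameter in the HTML text, in first-seen order, without duplicates.
sub extract_gis {
    my ($html) = @_;
    my %seen;
    return grep { !$seen{$_}++ } $html =~ /\bval=(\d+)/g;
}

# Hypothetical helper: fetch the records in batches and append each one
# to a single GenBank-format file. Returns the number of records written.
# BioPerl is loaded here so the parsing helper works without it.
sub fetch_genbank {
    my ($out_file, @gis) = @_;
    require Bio::DB::GenBank;
    require Bio::SeqIO;

    my $db  = Bio::DB::GenBank->new;
    my $out = Bio::SeqIO->new(-file => ">$out_file", -format => 'genbank');

    my $count = 0;
    # Batch size of 50 is an assumption, chosen only to keep requests small.
    while (my @batch = splice(@gis, 0, 50)) {
        my $stream = $db->get_Stream_by_id(\@batch);
        while (my $seq = $stream->next_seq) {
            $out->write_seq($seq);
            $count++;
        }
    }
    return $count;
}

# Run only when given: <html file with links> <output GenBank file>
if (@ARGV == 2) {
    my ($html_file, $out_file) = @ARGV;
    open(my $fh, '<', $html_file) or die "cannot open $html_file: $!";
    my $html = do { local $/; <$fh> };
    close $fh;

    my @gis = extract_gis($html);
    my $n   = fetch_genbank($out_file, @gis);
    print STDERR "wrote $n of ", scalar(@gis), " records to $out_file\n";
}
```

	It would be run as, say, "perl fetch_gb.pl links.html records.gb"
(both file names are placeholders). Parsing the IDs out of the existing
HTML file avoids redoing the sed and awk step, and writing through
Bio::SeqIO keeps the output in GenBank format.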

	Thanks

	Marcelino
On Jun 21, 2004, at 1:22 AM, Jason Stajich wrote:

> Did you make the directory
> /tmp/cache
> on your machine?
>
> The FileCache stuff may be overkill, depending on what you want to do.
>
> You can also leave it out by just saying:
>
> my $cachent = $ntdb;
> my $cachepep= $pepdb;
>
> -jason
> On Sun, 20 Jun 2004, Marcelino Suzuki wrote:
>
>> 	I am trying to run the script below, written by Jason Stajich for
>> getting CDS sequences out of GenBank, which I saved as test2.pl. I get
>> the following error message, which I believe is caused by my BioPerl
>> configuration (I just installed BioPerl on Mac OS X):
>>
>> 	------------- EXCEPTION  -------------
>> MSG: Could not open primary index file
>> STACK Bio::DB::FileCache::_open_database
>> /Library/Perl/5.8.1/Bio/DB/FileCache.pm:321
>> STACK Bio::DB::FileCache::new
>> /Library/Perl/5.8.1/Bio/DB/FileCache.pm:127
>> STACK toplevel test2.pl:14
>>
>> 	Does anyone have any idea why I get this error?
>>
>> 	Thanks
>>
>> 	Marcelino
>>
>>
>> #!/usr/bin/perl -w
>> use strict;
>> use Bio::DB::GenBank;
>> use Bio::DB::GenPept;
>> use Bio::DB::FileCache;
>> use Bio::Factory::FTLocationFactory;
>> use Bio::SeqFeature::Generic;
>>
>> my $ntdb = new Bio::DB::GenBank;
>> my $pepdb= new Bio::DB::GenPept;
>>
>> # do some caching in the event you're pulling up the same
>> # chromosome and/or you are debugging
>> my $cachent = new Bio::DB::FileCache(-keep => 1,
>>                                       -file => '/tmp/cache/nt.idx',
>>                                       -seqdb => $ntdb);
>>
>> my $cachepep = new Bio::DB::FileCache(-keep => 1,
>>                                        -file => '/tmp/cache/pep.idx',
>>                                        -seqdb => $pepdb);
>>
>> # obj to turn strings into Bio::Location object
>> my $locfactory = new Bio::Factory::FTLocationFactory;
>>
>> # you might get these from a file (and they can be accessions too)
>> my @protgis = (10956263);
>>
>> foreach my $gi ( @protgis ) {
>>    my $protseq = $cachepep->get_Seq_by_id($gi);
>>    if( ! $protseq ) {
>>       print STDERR "could not find a seq for gi:$gi\n";
>>       next;
>>    }
>>    foreach my $cds (  grep { $_->primary_tag eq 'CDS' }
>>                            $protseq->get_SeqFeatures() )
>>    {
>>       # skip CDSes with no coded_by
>>       next unless( $cds->has_tag('coded_by') );
>>       my ($codedby) = $cds->each_tag_value('coded_by');
>>       my ($ntacc,$loc) = split(/\:/, $codedby);
>>       # genbank wants an accession not a versioned one
>>       $ntacc =~ s/(\.\d+)//;
>>       my $cdslocation = $locfactory->from_string($loc);
>>       my $cdsfeature = new Bio::SeqFeature::Generic(-location => $cdslocation);
>>       my $ntseq = $cachent->get_Seq_by_acc($ntacc);
>>       next unless $ntseq;
>>       # locate the feature on a seq
>>       $ntseq->add_SeqFeature($cdsfeature);
>>       my $cdsseq = $cdsfeature->spliced_seq();
>>       print "cds seq is ", $cdsseq->seq(), "\n";
>>    }
>> }
>>
>>
>>
>> ============================================================================
>>              oOOOOo           Marcelino Suzuki, Assistant Professor
>>            oOOO            Chesapeake Biological Lab - Univ of Maryland Center Environm Science
>>         oOOOOOo.          PO Box 38, One Williams St Solomons, MD 20688
>>      .oOOOOOOOOOo.                      suzuki at cbl.umces.edu  -  http://cbl.umces.edu
>>    .oOOOOOOOOOOOOOOooo..     Ph 410-326-7291   FAX 410-326-7341
>> 0000000000000000000000000000000000000000000000000000000000000000000000000000
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at portal.open-bio.org
>> http://portal.open-bio.org/mailman/listinfo/bioperl-l
>>
>
> --
> Jason Stajich
> Duke University
> jason at cgt.mc.duke.edu