[Bioperl-l] a problem when using the Bio::DB::Fasta

Florent Angly florent.angly at gmail.com
Tue Aug 24 05:06:21 UTC 2010


  Hi Guifeng,

From the Bio::DB::Fasta documentation:
>        $db = Bio::DB::Fasta->new($fasta_path [,%options])
>          Create a new Bio::DB::Fasta object from the Fasta file or files
>          indicated by $fasta_path.  Indexing will be performed automatically
>          if needed.  If successful, new() will return the database accessor
>          object.  Otherwise it will return undef.

Hence, after you create the database object $db, you should check that 
it was successful, e.g.:
> my $db = Bio::DB::Fasta->new( '/home/wgf/elegans190.dna/' );
> if (not defined $db) {
>   die "There was a problem creating the database\n";
> }
If new() returned undef, then calling $db->seq(...) on it would produce 
exactly the 'Can't call method "seq" on an undefined value' message you get.

If the FASTA files in the directory you gave as input do not end in 
.fa, .fasta, .fast, .FA, .FASTA, .FAST or .dna (for instance, if they are 
named just chrI, chrII, ... with no extension), the default pattern will 
not find them and you should use the -glob option when constructing your 
database object. From the documentation:
>           -glob         Glob expression to use for searching for Fasta
>                         files in directories.
>                         Default: *.{fa,fasta,fast,FA,FASTA,FAST,dna}
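
As a quick check before building the index, you can list which files a 
given glob pattern actually picks up in the directory. The sketch below is 
only an illustration: fasta_files_matching is a hypothetical helper (not 
part of BioPerl), the default pattern is copied from the documentation, 
and 'chr*' is a guess based on the file names mentioned in the comment in 
your script.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Default pattern, copied from the Bio::DB::Fasta documentation.
use constant DEFAULT_GLOB => '*.{fa,fasta,fast,FA,FASTA,FAST,dna}';

# Hypothetical helper (not part of BioPerl): return the plain files in
# $dir whose names match $pattern (the default pattern if none is given).
sub fasta_files_matching {
    my ($dir, $pattern) = @_;
    $pattern = DEFAULT_GLOB unless defined $pattern;
    $dir =~ s{/+$}{};    # drop trailing slashes so the joined path is clean
    return grep { -f $_ } bsd_glob("$dir/$pattern");
}

# Example use (directory from the original script; 'chr*' is an assumption):
#   my $dir     = '/home/wgf/elegans190.dna/';
#   my @default = fasta_files_matching($dir);
#   my @chr     = fasta_files_matching($dir, 'chr*');
#   print scalar(@default), " file(s) match the default pattern\n";
#   print scalar(@chr), " file(s) match chr*\n";
#   my $db = Bio::DB::Fasta->new($dir, -glob => 'chr*');
```

If the default pattern matches nothing but 'chr*' matches your six 
chromosome files, then passing -glob => 'chr*' to new(), as in the last 
commented line, should let the indexing find them.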


Florent



On 24/08/10 12:44, Guifeng Wei wrote:
> Hi,
>
> I came across a problem when using the Bio::DB::Fasta module of
> BioPerl. My aim is to extract subsequences according to the *.bed
> files, which hold the C. elegans genomic sequence annotation.
>
> When I ran the script I wrote, I got the following error message:
>
> Can't call method "seq" on an undefined value at bed_to_fasta.pl line 28,
> <IN>  line 1.
>
> Could you please help me solve this problem?
> Here is my Perl script.
>
> #!/usr/bin/perl -w
> # Purpose: extract sequences from genomic sequences
> use strict;
> use Bio::DB::Fasta;
> open(IN, '<', $ARGV[0]) || die "sorry, the program cannot open the .bed file, please check it. \n";
> my $db = Bio::DB::Fasta->new( '/home/wgf/elegans190.dna/' );
> # The dir ...../elegans190.dna/ includes 6 files: chrI,chrII,chrIII,chrIV,chrV,chrX,
> # each stands for the sequences from the corresponding chromosome.
>
> while(<IN>){
>          chomp $_;
>          my @bed=split(/\s+/, $_ );
>
>          my $chr_id=$bed[0];
>          my $start=$bed[1];
>          my $end=$bed[2];
>          my $seq_name=$bed[3];
>          my $strand=$bed[5];
>
>          my $segment =  $db->seq( $chr_id, $start=>$end );
>
>          print ">",$seq_name,"_",$chr_id,":",$start,"-",$end,"\n";
>          print "$segment\n";
>
> }
>
> close(IN);



