[Bioperl-l] a problem when using the Bio::DB::Fasta

Chris Fields cjfields at illinois.edu
Tue Aug 24 13:38:45 UTC 2010


Guifeng,

Did you follow Jason's advice from yesterday about reformatting the FASTA files so that the sequence lines have a consistent length?  Or about checking the database itself?  Florent and Peter both repeated these suggestions.

From Jason's last response:

-------------------------
Wei -

Please ask your questions on the bioperl mailing list; I cannot answer every request directly.
Your problem has been answered by me on the list before so I urge you to use the list archives as a starting point.

The sequence lines in your FASTA file are not all the same length.

You need to run this:
bp_sreformat -if fasta -of fasta -i ORIGINAL -o NEW
mv NEW ORIGINAL

or, with sreformat:
sreformat fasta ORIGINAL > NEW
mv NEW ORIGINAL
-------------------------
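
To see which lines are uneven before reformatting, something like the following could help. It is only a rough sketch in Python, not part of BioPerl: the name find_uneven_lines is made up for illustration, and it assumes the rule stated in the error message (within one FASTA entry, every sequence line has the same length except the last). Lengths are counted without the line terminator, so they may differ from the numbers Bio::DB::Fasta prints.

```python
import sys


def find_uneven_lines(handle):
    """Yield (line_number, length, expected_length) for each sequence line
    that breaks the 'same length except the last' rule within its entry.

    Hypothetical helper for illustration; blank lines count as length 0.
    """
    expected = None  # length of the first sequence line of the current entry
    pending = None   # odd-length line; only a problem if more sequence follows
    for number, raw in enumerate(handle, start=1):
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            # New entry: an odd-length line just before it was the last
            # line of the previous entry, which is allowed.
            expected, pending = None, None
            continue
        if pending is not None:
            # Another sequence line follows, so the odd line was not last.
            yield pending
            pending = None
        if expected is None:
            expected = len(line)
        elif len(line) != expected:
            pending = (number, len(line), expected)


if __name__ == "__main__":
    # Usage: python check_fasta.py FILE [FILE ...]
    for path in sys.argv[1:]:
        with open(path) as handle:
            for number, length, expected in find_uneven_lines(handle):
                print(f"{path}: line {number} is {length} != {expected} chars")
```

If this prints nothing for a file, its line lengths are consistent under the assumption above; any file it does report is a candidate for the reformatting commands in Jason's message.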

chris


On Aug 24, 2010, at 6:28 AM, Guifeng Wei wrote:

> Hi,
> 
> I have revised my scripts according to the previous email from Florent.
> However, there are still some errors, which is very frustrating.
> 
> The errors are as follows:
> 
> ------------- EXCEPTION: Bio::Root::Exception -------------
> MSG: Each line of the fasta entry must be the same length except the last.
>   Line above #301451 '
> ..' is 22 != 51 chars.
> STACK: Error::throw
> STACK: Bio::Root::Root::throw
> /usr/lib/perl5/site_perl/5.8.8/Bio/Root/Root.pm:368
> STACK: Bio::DB::Fasta::calculate_offsets
> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Fasta.pm:770
> STACK: Bio::DB::Fasta::index_dir
> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Fasta.pm:593
> STACK: Bio::DB::Fasta::new
> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Fasta.pm:488
> STACK: bed2fasta.pl:13
> -----------------------------------------------------------
> indexing was interrupted, so unlinking
> /home/wgf/elegans190.dna//directory.index at
> /usr/lib/perl5/site_perl/5.8.8/Bio/DB/Fasta.pm line 1053
> 
> However, the directory /home/wgf/elegans190.dna/ contains 6 files, each
> holding the complete sequence of a single chromosome in FASTA format.
> The extension of the FASTA files is .fa. Every file starts with a
> ">chromosomeXXX" header line followed by thousands of sequence lines.
> 
> Even so, it warns me that "Each line of the fasta entry must be the
> same length except the last" and "indexing was interrupted, so unlinking
> /home/wgf/elegans190.dna//directory".
> 
> I am quite confused by this, so I am asking for help.
> 
> Wei Guifeng
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at lists.open-bio.org
> http://lists.open-bio.org/mailman/listinfo/bioperl-l




