[Bioperl-l] Extract contigs
Antony03
antony.vincent.1 at ulaval.ca
Sun Apr 27 17:04:49 UTC 2014
Hi,
I wrote this short script:
#!/usr/bin/perl
#By Antony Vincent#
use strict;
use warnings;
use diagnostics;
use Bio::Perl;
use Bio::SeqIO;
use IO::String;
use Bio::SearchIO;
use Getopt::Long;
my $filename;
my $help;
GetOptions(
    'file=s' => \$filename,
    'help!'  => \$help,
) or die "Incorrect usage! Try perl new_db.pl -help for exhaustive help.\n";
if ($help) {
    print " **********\n";
    print " ***HELP***\n";
    print " **********\n\n";
    print "One option is required:\n\n";
    print " -file: Your file in multi-fasta\n\n";
    exit;
}
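# Read the contig names to extract, one per line, from the file "test".
my @taxa_name;
open(my $names_fh, "<", "test") or die "Cannot open test: $!\n";
while (<$names_fh>) {
    chomp;
    push(@taxa_name, $_);
}
close($names_fh);
print @taxa_name;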
mkdir 'new_db';
my $gb = Bio::SeqIO->new(-file   => "<$filename",
                         -format => "fasta");
my $fa = Bio::SeqIO->new(-file   => ">new_db/$filename",
                         -format => "fasta",
                         -flush  => 0);
SEQ:
while (my $seq = $gb->next_seq) {
    my $id_and_desc = $seq->id . " " . $seq->desc;
    foreach my $str (@taxa_name) {
        if ($id_and_desc =~ /\Q$str\E/) {
            $fa->write_seq($seq);
            next SEQ;
        }
    }
}
It extracts contigs from a multi-FASTA file. The problem is that when I
try to extract contig-1, it also extracts contig-10, contig-11, and so on.
How can I change my code to extract only the contigs whose names match exactly?
Thanks
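
For illustration, here is a minimal sketch of two ways the exact-name test
could be written. It is not a tested fix for the script above. The helper names
is_wanted_id and contains_exact_name are hypothetical, and the sketch assumes
that a contig name is made of word characters and hyphens, and that the names
file holds one full ID per line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true only when the sequence ID is exactly one of the
# wanted names (hash lookup, so "contig-1" never matches "contig-10").
sub is_wanted_id {
    my ($id, $wanted) = @_;
    return exists $wanted->{$id} ? 1 : 0;
}

# Hypothetical helper: true when $name occurs in $text as a whole token,
# i.e. not directly preceded or followed by a word character or a hyphen.
# Assumption: contig names consist only of word characters and hyphens.
sub contains_exact_name {
    my ($text, $name) = @_;
    return $text =~ /(?<![\w-])\Q$name\E(?![\w-])/ ? 1 : 0;
}

# Example names, standing in for the lines read from the names file.
my %wanted = map { $_ => 1 } ('contig-1', 'contig-7');

# Report which of a few example IDs would be kept.
for my $id ('contig-1', 'contig-10', 'contig-11', 'contig-7') {
    printf "%-10s %s\n", $id, is_wanted_id($id, \%wanted) ? 'keep' : 'skip';
}
```

In the SEQ loop, the regex test could then be replaced by a lookup such as
exists $wanted{$seq->id}, which also removes the inner foreach. The
contains_exact_name variant is only needed if the name may appear in the
description rather than in the ID.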