[Bioperl-l] Can't figure out restriction analysis
Maxim
deeepersound at googlemail.com
Fri Jan 7 21:46:15 UTC 2011
Hi,
I'm desperately trying to annotate all HindIII restriction sites for
different genomes. The only solution I was able to figure out (after
complicated self-made approaches using awk and grep) is shown below. I'm not
a programmer, so please excuse the bad coding (strict, warnings etc, this is
dedicated to be a "standalone" script). Below code reads the 25 fasta files
for human genome one after another. Then restriction analysis is performed,
the only value I was able to get returned so far is the fragment length. Am
I right, that the fragment length values returned by the @fragments array
can be mapped onto the reference genome in a linear fashion, i.e. the
fragment length values are ordered according to their location on the
reference sequence? If so, I can of course get the positions (coordinates)
by simply adding the fragment length values.
More questions:
I've seen the _make_cuts function, would this be more appropriate in order
to directly retrieve coordinates of cutting positions? Unfortunately I can't
get it work like
my $analysis = _make_cuts($seq->seq,'HindIII',0)
this returns:can't call method "_make_cuts" on an undefined value
Or are both approaches just nonsense and I overlooked another obvious
function?
Maxim
use Bio::Perl;
use Bio::SeqIO;
use Bio::Restriction::Analysis;
use Bio::Restriction::EnzymeCollection;
use Cwd;
$dir = getcwd;
$fasta_dir = "/data1/Genomes/HG18/";
opendir(DIR, $fasta_dir) or die "$!";
@chroms = grep {/chr/} readdir DIR;
print "found the following files:", join ("\n", map {$_} @chroms), "\n";
#foreach $chrom (@chroms) {print "$chrom\n";}
foreach $file (@chroms)
{
$outname = $file;
@outname = split (/.fa/, $outname);
$outname = $dir . "/" . @outname[0] . "_fragments";
print "Output filename: $outname\n";
#exit;
open (OUT, ">$outname");
$filelink = $fasta_dir. "/" . $file;
my $seqio = Bio::SeqIO->new(-file => $filelink, '-format' => 'Fasta');
while(my $seq = $seqio->next_seq)
{
my $string = $seq->seq;
#print $string,"\n";
my $analysis = Bio::Restriction::Analysis->new(-seq => $seq);
my @fragments = $analysis->fragments('HindIII');
print OUT join("\n", map {length $_} @fragments), "\n";
}
close (OUT);
print "chromosome file: $file is done!\n";
}
More information about the Bioperl-l
mailing list