[Bioperl-l] Calculating a bunch of SNPs
Albert Vilella
avilella at ub.edu
Thu Jan 19 18:31:01 UTC 2006
On Thu, 19 Jan 2006 at 12:15 -0500, Amir Karger wrote:
> I have 96 files. The first is a reference sequence. The other 95 are
> sequences from different genotypes, with minor SNPs compared to the first
> one. I want to generate a list of all the SNPs for each sequence compared to
> the reference sequence. Output format doesn't really matter.
Dear Amir,
If the sequences are simply instances of genotypes/haplotypes, so that
each position already corresponds across all 96 sequences, then one
possibility would be to build a single alignment object (for example a
Bio::SimpleAlign) by adding each of them to it.
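To illustrate what "each position already correlates" buys you, here is a minimal sketch in core Perl that lists the differences between one sequence and the reference by direct position-by-position comparison. It is not a BioPerl API: the helper name snps_vs_reference and the sample sequences are hypothetical, and it assumes ungapped sequences of equal length.

```perl
use strict;
use warnings;

# Hypothetical helper: list the positions where $seq differs from $ref.
# Assumes both strings have equal length and are already position-matched.
sub snps_vs_reference {
    my ($ref, $seq) = @_;
    die "sequences differ in length\n" unless length($ref) == length($seq);
    my @snps;
    for my $i (0 .. length($ref) - 1) {
        my ($r, $s) = (substr($ref, $i, 1), substr($seq, $i, 1));
        # Record 1-based position, reference base, observed base.
        push @snps, [ $i + 1, $r, $s ] if uc $r ne uc $s;
    }
    return @snps;
}

# Made-up example data, not taken from the thread.
my $ref       = 'ACGTACGTAC';
my %genotypes = (sample01 => 'ACGTTCGTAC', sample02 => 'ACGAACGTAG');

for my $name (sort keys %genotypes) {
    for my $snp (snps_vs_reference($ref, $genotypes{$name})) {
        print join("\t", $name, @$snp), "\n";
    }
}
```

If the sequences contain indels, this simple comparison no longer applies and a real alignment is needed first.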
Once you have your alignment, you can get the marker information with
the aln_to_population method of Bio::PopGen::Utilities.
Usage : my $pop = Bio::PopGen::Utilities->aln_to_population($aln);
Function: Turn an alignment into a set of L<Bio::PopGen::Individual>
objects grouped in a L<Bio::PopGen::Population> object
You will see some example output files in t/data/.
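A rough sketch of how this could fit together is shown here. It assumes one FASTA file per sequence (reference first), ungapped sequences of equal length, and file names passed on the command line; none of these details come from the original question. By default aln_to_population should report only the polymorphic columns, so treat the printed output as a starting point and check it against the module documentation for your BioPerl version.

```perl
use strict;
use warnings;
use Bio::SeqIO;
use Bio::LocatableSeq;
use Bio::SimpleAlign;
use Bio::PopGen::Utilities;

# Assumption: each file in @ARGV holds one FASTA sequence, reference first.
my $aln = Bio::SimpleAlign->new();
for my $file (@ARGV) {
    my $seq = Bio::SeqIO->new(-file => $file, -format => 'fasta')->next_seq;
    # Alignments hold LocatableSeq objects, so wrap each sequence.
    $aln->add_seq(Bio::LocatableSeq->new(
        -id    => $seq->display_id,
        -seq   => $seq->seq,
        -start => 1,
        -end   => $seq->length,
    ));
}

# Each sequence becomes an individual; variable columns become markers.
my $pop = Bio::PopGen::Utilities->aln_to_population(-alignment => $aln);

for my $ind ($pop->get_Individuals) {
    for my $marker ($ind->get_marker_names) {
        my ($genotype) = $ind->get_Genotypes(-marker => $marker);
        print join("\t", $ind->unique_id, $marker, $genotype->get_Alleles), "\n";
    }
}
```

Comparing each individual's alleles with those of the reference individual at the same marker then gives the per-sequence SNP list.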
There may be other (better or different) ways to do what you need with
Bioperl,
Albert.
> I was told I could run EMBOSS diffseq on each of the 95 pairs, and parse the
> output to get my list. I'm wondering if there's a Bioperl tool that will do
> what diffseq does, though - presumably outputting Bio::Align objects of some
> kind, or is it Bio::Variation? - rather than parsing 95*N output files.