[Bioperl-l] Hit length using length_aln()
Ken
kjgraham@ucdavis.edu
Mon, 15 Jul 2002 13:29:12 -0700
A little background on what I'm trying to do will make things clearer.
Fortunately, for this specific application I am working with bacterial genomes
(two different strains of the same species), but I want to keep everything as
general as possible (human genome applications are right around the corner).
The overall goal is to design PCR primers that will amplify a gene in both
strains (if it is present in both strains) and will not amplify other genes.
For every gene, I have results from one strain BLASTed against itself and
against the other strain. Obviously the first hit in each report is the gene
itself, followed by the same gene in the other strain (if it is present
there), and finally other genes in either strain that produce a hit.
I need to know how closely, and where, the other strain matches on a gene by
gene basis.
I've got the basic code working, more or less in this order: design primers
for a gene in strain A, read in the BLAST report, find mismatches in strain B,
find matches on other genes (which must not be amplified), and list the primers
that are in both strains but not in other genes. This works if everything is
simple.
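
The last step of that pipeline can be illustrated with a minimal sketch in
Python. This is not the code described above: the function names are
hypothetical, and it assumes exact string matching only, ignoring mismatches
and reverse-complement binding sites that a real primer check would need.

```python
def primer_sites(primer, sequence):
    """Return the 0-based start offsets of exact matches of primer in sequence."""
    sites = []
    start = sequence.find(primer)
    while start != -1:
        sites.append(start)
        # Step by one so that overlapping matches are also found.
        start = sequence.find(primer, start + 1)
    return sites


def select_primers(candidates, gene_b, other_genes):
    """Keep primers that match gene_b and match none of other_genes.

    candidates  -- primers designed against the gene in strain A
    gene_b      -- sequence of the same gene in strain B
    other_genes -- sequences that must not be amplified
    """
    kept = []
    for primer in candidates:
        if not primer_sites(primer, gene_b):
            continue  # primer is not conserved in strain B
        if any(primer_sites(primer, gene) for gene in other_genes):
            continue  # primer would also bind an off-target gene
        kept.append(primer)
    return kept
```

For example, select_primers(["ACGT", "GGCC"], "AAACGTGGCCAA", ["TTGGCCTT"])
keeps only "ACGT", because "GGCC" also occurs in the off-target sequence.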
But now I'm working on the gory details, such as a hit (possibly the gene in
question) in the other strain whose alignment is broken by gaps and reported
as separate HSPs. I want to treat the individual HSPs as one entity for the
purposes of this application.
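
One way to picture "HSPs as one entity" is to merge the query-coordinate
ranges of all HSPs belonging to a hit, then report the overall extent, the
covered length, and the uncovered stretches in between. The sketch below is
in Python rather than BioPerl; the HSP record and both function names are
hypothetical, and coordinates are assumed to be 1-based and inclusive, as in
a BLAST report.

```python
from collections import namedtuple

# Hypothetical minimal HSP record: query coordinates only, 1-based inclusive.
HSP = namedtuple("HSP", "query_start query_end")


def merge_intervals(intervals):
    """Merge overlapping or directly adjacent (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or abuts the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


def summarize_hit(hsps):
    """Treat all HSPs of one hit as a single entity on the query."""
    if not hsps:
        raise ValueError("a hit needs at least one HSP")
    covered = merge_intervals((h.query_start, h.query_end) for h in hsps)
    return {
        # Outermost query positions touched by any HSP.
        "extent": (covered[0][0], covered[-1][1]),
        # Non-overlapping blocks of the query that are aligned.
        "blocks": covered,
        # Number of query positions covered, counting overlaps once.
        "covered_length": sum(end - start + 1 for start, end in covered),
        # Query stretches between blocks that no HSP covers.
        "gaps": [(a[1] + 1, b[0] - 1) for a, b in zip(covered, covered[1:])],
    }
```

The same merging could be repeated on the hit coordinates (after normalizing
start and end for minus-strand HSPs) to find where the match lies on the
subject sequence.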
In response to Brian's question about whether I want the entire hit or just
the sequence that matches the query: for well-annotated bacterial genomes I
can work with the hit. However, your point about a hit potentially being 5 Mb
is right. I'm not at that point yet, but I probably will be sooner than I
expect. We're starting to look at some human gene families, and I'm just now
getting used to the annotations and biology of the human genome (I'm used to
S. cerevisiae and bacteria).
So I guess my question is: what is the easiest/best way to work with the HSPs
of a hit as a single entity? What objects would you recommend I use?
Sorry for the length of this post.
Thanks again,
Ken