[Bioperl-l] Mapping Coordinates
Navin Elango
navin_elango at yahoo.co.uk
Tue Apr 26 20:32:27 EDT 2005
Hi All,
I am trying to map coordinates in the query of a blast
alignment to that of the subject. I am using
Bio::Coordinate::utils. I am having problems when the
match is in the minus strand of the subject. I used
blast2seq in NCBI to blast the following 2 sequences
sequence 1(query)
AAAAAcaaacttttatgctctgcttcccttttaaacctaaattccaatttcagatcatctctctcaagttcatagttccaTTTTTTTTTT
sequence2(subject)
TTTTTTTTTTTTTTTtggaactatgaacttgagagagatgatctgaaattggaatttaggtttaaaagggaagcagagcataaaagtttgAAAAAAAAAA
The lower case letters in the second sequence is the
reverse complement of the lower case letters in the
first sequence.
The BLAST output is as follows. Note, I have given
some dummy name to the subject.
=======================================
BLASTN 2.2.10 [Oct-19-2004]
Reference: Altschul, Stephen F., Thomas L. Madden,
Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of
protein database search
programs", Nucleic Acids Res. 25:3389-3402.
RID: 1114363842-27559-121522116853.BLASTQ2
Query=
(90 letters)
Database: All GenBank+EMBL+DDBJ+PDB sequences (but no
EST, STS,
GSS,environmental samples or phase 0, 1 or 2 HTGS
sequences)
3,067,237 sequences; 13,823,778,485 total
letters
Score E
Sequences producing significant alignments:
(bits) Value
ref|NT_005120.15|Hs2_5277 Homo sapiens chromosome 2
genomic contig 2.103e+04 0.0
ALIGNMENTS
>ref|NT_005120.15|Hs2_5277 Homo sapiens chromosome 2
genomic contig
Length = 100
Score = 144 bits (75), Expect = 0.0
Identities = 75/75 (100%)
Strand = Plus / Minus
Query: 6
caaacttttatgctctgcttcccttttaaacctaaattccaatttcagatcatctctctc
65
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 90
caaacttttatgctctgcttcccttttaaacctaaattccaatttcagatcatctctctc
31
Query: 66 aagttcatagttcca 80
|||||||||||||||
Sbjct: 30 aagttcatagttcca 16
CPU time: 0.01 user secs. 0.01 sys. secs
0.02 total secs.
Lambda K H
1.33 0.621 1.12
Gapped
Lambda K H
1.33 0.621 1.12
===========================================
I want the region in the subject sequence which
corresponds to region 10 - 20 in the query sequence.
When I mapped 10 and 20 , using the coordinate mapper
(code given below), I get 20 and 30 respectively,
which seems to be wrong.
How do I get the region in suject corresponding to
region 10-20 ie cttttatgctc in query.
The code I am using is as follows
#building a mapper object, with the query sequence as
the reference sequence.
my $mapper =
Bio::Coordinate::utils->from_align($hspAlignment,1);
my $startPositionObject = Bio::Location::Simple->new
(-start => $queryStartPosition, -end =>
$queryStartPosition );
my $res = $mapper->map($startPositionObject);
my @returnArray;
$subjectStart = $res->match->start;
my $endPositionObject = Bio::Location::Simple->new
(-start => $queryEndPosition, -end =>
$queryEndPosition );
my $endRes = $mapper->map($endPositionObject);
$subjectEnd = $endRes->match->start;
Please let me know if there is any other method to
approach this problem. I see that there is a function
for getting the position in the alignment based on
position in the sequence. Is there a function to do
the vice versa ?
Thanks,
Navin.
Send instant messages to your online friends http://uk.messenger.yahoo.com
More information about the Bioperl-l
mailing list