[Bioperl-l] Mapping Coordinates

Navin Elango navin_elango at yahoo.co.uk
Tue Apr 26 20:32:27 EDT 2005


 Hi All,

I am trying to map coordinates in the query of a blast
alignment to that of the subject. I am using
Bio::Coordinate::utils. I am having problems when the
match is in the minus strand of the subject. I used
blast2seq in NCBI to blast the following 2 sequences

sequence 1(query)
AAAAAcaaacttttatgctctgcttcccttttaaacctaaattccaatttcagatcatctctctcaagttcatagttccaTTTTTTTTTT

sequence2(subject)
TTTTTTTTTTTTTTTtggaactatgaacttgagagagatgatctgaaattggaatttaggtttaaaagggaagcagagcataaaagtttgAAAAAAAAAA

The lower case letters in the second sequence is the
reverse complement of the lower case letters in the
first sequence. 

The BLAST output is as follows. Note, I have given
some dummy name to the subject.
=======================================
BLASTN 2.2.10 [Oct-19-2004]


Reference: Altschul, Stephen F., Thomas L. Madden,
Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of
protein database search
programs",  Nucleic Acids Res. 25:3389-3402.
RID: 1114363842-27559-121522116853.BLASTQ2
Query= 
         (90 letters)

Database: All GenBank+EMBL+DDBJ+PDB sequences (but no
EST, STS,
GSS,environmental samples or phase 0, 1 or 2 HTGS
sequences) 
           3,067,237 sequences; 13,823,778,485 total
letters



                                                      
          Score    E
Sequences producing significant alignments:           
          (bits) Value

ref|NT_005120.15|Hs2_5277 Homo sapiens chromosome 2
genomic contig   2.103e+04   0.0  


ALIGNMENTS
>ref|NT_005120.15|Hs2_5277 Homo sapiens chromosome 2
genomic contig
          Length = 100
                                                      
               
Score =  144 bits (75), Expect = 0.0
Identities = 75/75 (100%)
 Strand = Plus / Minus
                                                      
   
Query: 6 
caaacttttatgctctgcttcccttttaaacctaaattccaatttcagatcatctctctc
65
         
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 90
caaacttttatgctctgcttcccttttaaacctaaattccaatttcagatcatctctctc
31

                         
Query: 66 aagttcatagttcca 80
          |||||||||||||||
Sbjct: 30 aagttcatagttcca 16


CPU time:     0.01 user secs.       0.01 sys. secs    
     0.02 total secs.

Lambda     K      H
    1.33    0.621     1.12 

Gapped
Lambda     K      H
    1.33    0.621     1.12 

===========================================


I want the region in the subject sequence which
corresponds to region 10 - 20 in the query sequence.
When I mapped 10 and 20 , using the coordinate mapper
(code given below), I get 20 and 30 respectively,
which seems to be wrong.

How do I get the region in suject corresponding to
region 10-20 ie cttttatgctc in query.

The code I am using is as follows

#building a mapper object, with the query sequence as
the reference sequence.
	my $mapper =
Bio::Coordinate::utils->from_align($hspAlignment,1);
	
	my $startPositionObject = Bio::Location::Simple->new
(-start => $queryStartPosition, -end =>
$queryStartPosition );
  	my $res = $mapper->map($startPositionObject);
  	
  
  	my @returnArray;
  	$subjectStart = $res->match->start;
 

  	
  	my $endPositionObject = Bio::Location::Simple->new
(-start => $queryEndPosition, -end =>
$queryEndPosition );
  	
  	my $endRes = $mapper->map($endPositionObject);
  	$subjectEnd = $endRes->match->start;
    	


Please let me know if there is any other method to
approach this problem. I see that there is a function
for getting the position in the alignment based on
position in the sequence. Is there a function to do
the vice versa ?

Thanks,
Navin.




Send instant messages to your online friends http://uk.messenger.yahoo.com 


More information about the Bioperl-l mailing list