[Bioperl-l] Alignment from blast report

Mark A. Jensen maj at fortinbras.us
Tue Mar 2 14:51:12 UTC 2010


This might a good method to have for Bio::Search::Tiling--
you want to stitch together all the hsps and have the
concatenated alignment returned as a Bio::SimpleAlign,
correct? Tiling would create the right set of hsps from
which to generate the composite alignment. I can
try to get something working, but it may take a while-
MAJ
----- Original Message ----- 
From: "Paolo Pavan" <paolo.pavan at gmail.com>
To: "Chris Fields" <cjfields at illinois.edu>
Cc: <bioperl-l at lists.open-bio.org>
Sent: Tuesday, March 02, 2010 9:37 AM
Subject: Re: [Bioperl-l] Alignment from blast report


Hi Chris,
Thank you for your reply. So I have to understand that since the
get_aln method returns the HSP alignment, there is no way to retrieve
the whole alignment as in the example pasted, isn't it?
Basically I'm trying to use megablast as kind of multiple local
alignment engine and actually I'm not pretty sure this is a good idea
but in my particular case could be suitable. I mean that the example
below reports only the portions of the sequences that align loosing
the portions that does not, I'm not sure I gave the idea. What do you
think about? Can you give me your opinion?
If there isn't any module written yet, I can try to write a parser, it
could be of any interest?

Thank you,
Paolo

2010/3/2 Chris Fields <cjfields at illinois.edu>:
> Paolo,
>
> You can get a Bio::SimpleAlign from the HSP object. The first code example in 
> this section in the HOWTO demonstrates this:
>
> http://www.bioperl.org/wiki/HOWTO:SearchIO#Using_the_methods
>
> chris
>
> On Mar 1, 2010, at 5:07 PM, Paolo Pavan wrote:
>
>> Dear all,
>> Sorry for pushing up my post but, please does anyone have an hint for me?
>> Maybe have I to send attached the report to the mailing list? I don't
>> know attachment policies of the list, if it is allowed and is needed I
>> can do that.
>>
>> Thank you,
>> Paolo
>>
>> 2010/2/26 Paolo Pavan <paolo.pavan at gmail.com>:
>>> Sorry,
>>> Maybe I forgot to add this is the megablast -m 5 output.
>>>
>>> Thank you again,
>>> Paolo
>>>
>>> 2010/2/26 Paolo Pavan <paolo.pavan at gmail.com>:
>>>> Hi all,
>>>> I have just a brief question: I've got some megablast reports such the
>>>> one I've pasted below.
>>>> I'm aware of the existence of the Bio::Search::IO::megablast and the
>>>> Bio::Search::HSP::BlastHSP::get_aln but, is there a way to get the
>>>> entire alignment represented as a Bio::SimpleAlign object or
>>>> Bio::Align::AlignI implementing one?
>>>>
>>>> Thank you all,
>>>> Paolo
>>>>
>>>>
>>>> MEGABLAST 2.2.16 [Mar-25-2007]
>>>>
>>>>
>>>> Reference: Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb Miller 
>>>> (2000),
>>>> "A greedy algorithm for aligning DNA sequences",
>>>> J Comput Biol 2000; 7(1-2):203-14.
>>>>
>>>> Database: 00038-00053.fasta
>>>> 2 sequences; 2001 total letters
>>>>
>>>> Searching..................................................done
>>>>
>>>> Query= 00038-00053
>>>> (802 letters)
>>>>
>>>>
>>>>
>>>> Score E
>>>> Sequences producing significant alignments: (bits) Value
>>>>
>>>> ______00038
>>>> 226 1e-62
>>>> ______00053
>>>> 115 3e-29
>>>>
>>>> 1_0 472
>>>> ccgacaataattcttgttggaatcttcggcagttttttgtacaggagccagtagttcaaa 531
>>>> ______00038 883
>>>> ccgacaataattcttgttggaatcttcggcagttttttgtacaggagccagtagttcaaa 942
>>>> ______00053 ------------------------------------------------------------
>>>>
>>>> 1_0 532
>>>> aagaaagcgatcaataaaa-taaaaatcacaaaaaaattaccaaaaacatatttataaat 590
>>>> ______00038 943
>>>> aagaaagcgatcaataaaaataaaaatcacaaaaaaattaccaaaaacatatttataaa- 1001
>>>> ______00053 ------------------------------------------------------------
>>>>
>>>> 1_0 591
>>>> attggcaaaaaaattgccaacaattcccaaacggaaaattcccaaaacaaagagagcgtc 650
>>>> ______00038 1000
>>>> ------------------------------------------------------------ 1001
>>>> ______00053 ------------------------------------------------------------
>>>>
>>>> 1_0 651
>>>> gataaccaatatcaaaatagtttttgaatttattttttgtgtttttttagtttttcttct 710
>>>> ______00038 1000
>>>> ------------------------------------------------------------ 1001
>>>> ______00053 ------------------------------------------------------------
>>>>
>>>> 1_0 711
>>>> acgtcgtgttgccatttatccagcattaagtctataaaaaaaaacggtcagataaaaatg 770
>>>> ______00038 1000
>>>> ------------------------------------------------------------ 1001
>>>> ______00053 1 -------------------------ttaagtctataaaaaaaa-cggtcagataaaaatg 
>>>> 34
>>>>
>>>> 1_0 771 ccttaagtatttactttaacttgtcttgatca 802
>>>> ______00038 1000 -------------------------------- 1001
>>>> ______00053 35 ccttaagtatt-actttaacttgtcttgatca 65
>>>> Database: 00038-00053.fasta
>>>> Posted date: Feb 25, 2010 4:47 PM
>>>> Number of letters in database: 2001
>>>> Number of sequences in database: 2
>>>>
>>>> Lambda K H
>>>> 1.37 0.711 1.31
>>>>
>>>> Gapped
>>>> Lambda K H
>>>> 1.37 0.711 1.31
>>>>
>>>>
>>>> Matrix: blastn matrix:1 -3
>>>> Gap Penalties: Existence: 0, Extension: 0
>>>> Number of Sequences: 2
>>>> Number of Hits to DB: 17
>>>> Number of extensions: 3
>>>> Number of successful extensions: 3
>>>> Number of sequences better than 10.0: 2
>>>> Number of HSP's gapped: 2
>>>> Number of HSP's successfully gapped: 2
>>>> Length of query: 802
>>>> Length of database: 2001
>>>> Length adjustment: 10
>>>> Effective length of query: 792
>>>> Effective length of database: 1981
>>>> Effective search space: 1568952
>>>> Effective search space used: 1568952
>>>> X1: 9 (17.8 bits)
>>>> X2: 20 (39.6 bits)
>>>> X3: 51 (101.1 bits)
>>>> S1: 9 (18.3 bits)
>>>> S2: 9 (18.3 bits)
>>>>
>>>
>>
>> _______________________________________________
>> Bioperl-l mailing list
>> Bioperl-l at lists.open-bio.org
>> http://lists.open-bio.org/mailman/listinfo/bioperl-l
>
>

_______________________________________________
Bioperl-l mailing list
Bioperl-l at lists.open-bio.org
http://lists.open-bio.org/mailman/listinfo/bioperl-l





More information about the Bioperl-l mailing list