[Bioperl-l] Aggressive aggregation?

Lincoln Stein lstein at cshl.edu
Wed Mar 9 15:47:29 EST 2005


Each of the multiple hits should have its own unique Target name.  You 
can do this by appending .01, .02, etc. to the end of the Target 
name.
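
As a rough sketch of that renaming (not a tested tool), the Python below
rewrites the Target attribute so that each match line, together with the
HSP lines listed before it, gets its own numbered name. The function name
uniquify_targets is hypothetical, and the code assumes the layout of the
sample GFF in the original message: the HSP lines of one cluster come
first and the match line closes the cluster. None of this is part of
bp_search2gff.pl or GBrowse.

```python
import re
from collections import defaultdict

# Matches the Target attribute as written in the sample GFF,
# e.g.  Target "Sequence:chad1"
TARGET_RE = re.compile(r'Target "([^":]+):([^"]+)"')


def uniquify_targets(gff_lines):
    """Append .01, .02, ... to the Target name of each cluster.

    Assumption: the HSP lines of a cluster precede its 'match' line,
    and all lines of a cluster share one Target name.
    """
    counters = defaultdict(int)  # clusters seen so far, per Target name
    cluster_open = False         # True while inside an unfinished cluster
    out = []
    for line in gff_lines:
        m = TARGET_RE.search(line)
        if m is None:
            # Blank lines, comments, features without a Target: unchanged.
            out.append(line)
            continue
        target_class, name = m.group(1), m.group(2)
        if not cluster_open:
            # First line of a new cluster: move to the next suffix.
            counters[name] += 1
            cluster_open = True
        new_target = 'Target "%s:%s.%02d"' % (target_class, name, counters[name])
        out.append(line[:m.start()] + new_target + line[m.end():])
        if line.split()[2] == "match":
            # The match line closes the current cluster.
            cluster_open = False
    return out


sample = [
    'chr1 aafcest HSP   1   75  . + . Target "Sequence:chad1" 1 75',
    'chr1 aafcest HSP   100 150 . + . Target "Sequence:chad1" 100 150',
    'chr1 aafcest match 1   150 . + . Target "Sequence:chad1" 1 150',
    'chr1 aafcest HSP   200 275 . - . Target "Sequence:chad1" 200 275',
    'chr1 aafcest HSP   300 450 . - . Target "Sequence:chad1" 300 450',
    'chr1 aafcest match 200 450 . - . Target "Sequence:chad1" 200 450',
]

for renamed in uniquify_targets(sample):
    print(renamed)
```

With the sample from the original message, the first cluster ends up as
Sequence:chad1.01 and the second as Sequence:chad1.02. Because the Target
name is also what ties an alignment to its sequence, the renamed targets
may need matching sequence entries for base-level alignment views; the
sketch does not handle that.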

Lincoln


On Tuesday 08 March 2005 10:15 pm, Chad Matsalla wrote:
> Subject: Aggressive Aggregators
>
> Greetings all,
>
> I'm looking for help in presenting Blast hits in GBrowse.
>
> I blasted Brassica EST sequences against the Arabidopsis
> pseudochromosome assemblies in order to store them in a
> Bio::DB::GFF database. I used a tool based on bp_search2gff.pl to
> `convert' blast reports into gff. A sample of that gff is below[1].
>
> My problem is partly based on a peculiarity of Blast and partly
> based on the behavior of the aggregators in GBrowse and I'm
> wondering if someone else has seen this.
>
> Arabidopsis has five chromosomes. In order to get the coordinates
> necessary to place ESTs on the chromosomes I created a blast
> database containing 5 sequences - chr1, chr2, chr3, chr4,
> chr5.
>
> My problem presents itself when an EST hits at more than one place
> on a chromosome.  Let us say that on chr1 there is a cluster of
> HSPs for the est chad1 at position 1000, a second cluster at
> position 10,000 and a third cluster at 50,000. Blast will indicate
> a SINGLE hit on chr1.
>
> So, I manually find clusters of HSPs and create GFF that resembles
> that below[1]. Yes, I know that WU-BLAST has an option to prevent
> that behavior.
>
> The problem is that the `match' aggregator joins all of the
> `matches' together.  I understand that it's because all of the
> matches have the same Target - that's necessary to have the proper
> sequence appear while viewing base-base alignments.
>
> HSPs:        <-->  <-->  <-->                 <-->  <-->  <-->
> matches:     <-------------->                 <-------------->
>
> What I get : <-->--<-->--<-->-----------------<-->--<-->--<-->
> What I want: <-->--<-->--<-->                 <-->--<-->--<-->
>
> How do I get what I want? In my gbrowse.conf I tried the standard
> `match' aggregator and a custom aggregator:
> csmmatch{csmhsp/csmmatch}
>
>
> Chad Matsalla
>
>
> [1]
> chr1 aafcest     HSP   1     75    .     +     .     Target "Sequence:chad1" 1 75
> chr1 aafcest     HSP   100   150   .     +     .     Target "Sequence:chad1" 100 150
> chr1 aafcest     match 1     150   .     +     .     Target "Sequence:chad1" 1 150
>
> chr1 aafcest     HSP   200   275   .     -     .     Target "Sequence:chad1" 200 275
> chr1 aafcest     HSP   300   450   .     -     .     Target "Sequence:chad1" 300 450
> chr1 aafcest     match 200   450   .     -     .     Target "Sequence:chad1" 200 450
>
>
> _______________________________________________
> Bioperl-l mailing list
> Bioperl-l at portal.open-bio.org
> http://portal.open-bio.org/mailman/listinfo/bioperl-l

-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724

NOTE: Please copy Sandra Michelsen <michelse at cshl.edu> on
all emails regarding scheduling and other time-critical topics.