[Bioperl-l] Aggressive aggregation?
Lincoln Stein
lstein at cshl.edu
Wed Mar 9 15:47:29 EST 2005
Each of the multiple hits should have its own unique Target name. You
can do this by appending a suffix (.01, .02, etc.) to the end of the
Target name.
Lincoln
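
A minimal sketch of the renaming suggested above, in Python rather than Perl. It assumes the layout of the sample GFF quoted below: nine whitespace- or tab-separated columns, a ninth column of the form Target "Class:name" start end, and one `match' line per cluster whose span covers its HSPs. The function names (parse_gff_line, target_name, uniquify_targets) are hypothetical helpers, not part of BioPerl or GBrowse.

```python
import re
from collections import defaultdict

# Matches the quoted name in an attribute such as: Target "Sequence:chad1" 1 75
TARGET_RE = re.compile(r'Target "([^"]+)"')

def parse_gff_line(line):
    """Split one GFF line into its nine columns (tabs or spaces)."""
    f = line.split(None, 8)
    return {
        "cols": f[:8],
        "seqid": f[0], "type": f[2],
        "start": int(f[3]), "end": int(f[4]), "strand": f[6],
        "attrs": f[8].strip(),
    }

def target_name(rec):
    """Return the Target name of a record, or None if it has none."""
    m = TARGET_RE.search(rec["attrs"])
    return m.group(1) if m else None

def uniquify_targets(lines, match_type="match"):
    """Append .01, .02, ... to the Target name of each match and its HSPs."""
    records = [parse_gff_line(l) for l in lines if l.strip()]

    # Pass 1: number the match features per original Target name.
    counts = defaultdict(int)
    matches = []
    for rec in records:
        base = target_name(rec)
        if rec["type"] == match_type and base:
            counts[base] += 1
            rec["new_name"] = "%s.%02d" % (base, counts[base])
            matches.append(rec)

    # Pass 2: give each HSP the name of the match that contains it
    # (same reference sequence, strand and original Target name).
    out = []
    for rec in records:
        base = target_name(rec)
        if base and "new_name" not in rec:
            for m in matches:
                if (m["seqid"] == rec["seqid"]
                        and m["strand"] == rec["strand"]
                        and target_name(m) == base
                        and m["start"] <= rec["start"]
                        and rec["end"] <= m["end"]):
                    rec["new_name"] = m["new_name"]
                    break
        attrs = rec["attrs"]
        if "new_name" in rec:
            attrs = attrs.replace('"%s"' % base, '"%s"' % rec["new_name"], 1)
        out.append("\t".join(rec["cols"] + [attrs]))
    return out
```

Run over the sample below, this would rename the first cluster's Target to "Sequence:chad1.01" and the second cluster's to "Sequence:chad1.02". HSPs that fall inside no match line are left unchanged.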
On Tuesday 08 March 2005 10:15 pm, Chad Matsalla wrote:
> Subject: Aggressive Aggregators
>
> Greetings all,
>
> I'm looking for help in presenting Blast hits in GBrowse.
>
> I blasted Brassica EST sequences against the Arabidopsis
> pseudochromosome assemblies in order to store them in a
> Bio::DB::GFF database. I used a tool based on bp_search2gff.pl to
> `convert' blast reports into gff. A sample of that gff is below[1].
>
> My problem is partly based on a peculiarity of Blast and partly
> based on the behavior of the aggregators in GBrowse and I'm
> wondering if someone else has seen this.
>
> Arabidopsis has five chromosomes. In order to get the coordinates
> necessary to place ESTs on the chromosomes I created a blast
> database containing 5 query sequences - chr1, chr2, chr3, chr4,
> chr5.
>
> My problem presents itself when an EST hits at more than one place
> on a chromosome. Let us say that on chr1 there is a cluster of
> HSPs for the est chad1 at position 1000, a second cluster at
> position 10,000 and a third cluster at 50,000. Blast will indicate
> a SINGLE hit on chr1.
>
> So I manually find clusters of HSPs and create GFF that resembles
> the sample below[1]. Yes, I know that wublast has an option to
> prevent that behavior.
>
> The problem is that the `match' aggregator joins all of the
> `matches' together. I understand that it's because all of the
> matches have the same Target - that's necessary to have the proper
> sequence appear while viewing base-base alignments.
>
> HSPs:        <-->  <-->  <-->                 <-->  <-->  <-->
> matches:     <-------------->                 <-------------->
>
> What I get : <-->--<-->--<-->-----------------<-->--<-->--<-->
> What I want: <-->--<-->--<-->                 <-->--<-->--<-->
>
> How do I get what I want? In my gbrowse.conf I tried the standard
> `match' aggregator and a custom aggregator:
> csmmatch{csmhsp/csmmatch}
>
>
> Chad Matsalla
>
>
> [1]
> chr1 aafcest HSP   1   75  . + . Target "Sequence:chad1" 1 75
> chr1 aafcest HSP   100 150 . + . Target "Sequence:chad1" 100 150
> chr1 aafcest match 1   150 . + . Target "Sequence:chad1" 1 150
>
> chr1 aafcest HSP   200 275 . - . Target "Sequence:chad1" 200 275
> chr1 aafcest HSP   300 450 . - . Target "Sequence:chad1" 300 450
> chr1 aafcest match 200 450 . - . Target "Sequence:chad1" 200 450
>
>
--
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
NOTE: Please copy Sandra Michelsen <michelse at cshl.edu> on
all emails regarding scheduling and other time-critical topics.