[Bioperl-l] [Gmod-gbrowse] scores in Bio::DB::BigBed

Sun Jul 3 20:10:25 UTC 2011

Hi Daniel,

You are correct: the bin and summary functions of the BigBed adaptor work
only with the number of features, not with the individual scores.
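
To make the distinction concrete, here is a toy sketch in Python. It is
purely conceptual and is not the adaptor's code; the function name, the
feature tuples, and the bin size are all made up for illustration. It shows
how a count-based summary and a score-based summary differ for the same
features.

```python
# Toy illustration only; this is NOT the Bio::DB::BigBed implementation.
from collections import defaultdict


def bin_features(features, bin_size):
    """Group (start, end, score) features into fixed-width bins by start.

    Returns {bin_index: (feature_count, mean_score)}.
    """
    scores_by_bin = defaultdict(list)
    for start, _end, score in features:
        scores_by_bin[start // bin_size].append(score)
    return {
        b: (len(scores), sum(scores) / len(scores))
        for b, scores in sorted(scores_by_bin.items())
    }


# Hypothetical features: (start, end, score)
features = [(0, 50, 100.0), (60, 90, 300.0), (120, 180, 900.0)]
summary = bin_features(features, bin_size=100)
for b, (count, mean_score) in summary.items():
    print(f"bin {b}: count={count} mean_score={mean_score}")
```

A count-based summary only reports the first number of each pair; what you
were expecting corresponds to the second.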

There is a workaround, although it is not as efficient as the summary
statistics. In the conf stanza, set the feature to region and use the
older xyplot glyph. This glyph iterates through all the bed features,
calls the score method on each, and draws an xyplot from the collected
scores. Be sure to set the group_on option so that they are all tied into
one graph. Here is an example:

[bigbed_score_graph]
database     = bigbed_db
feature      = region
glyph        = xyplot
graph_type   = line
group_on     = type

As for the BED format, per the format definition from UCSC, the first
three columns (chromosome, start, stop) are required, and any additional
higher-numbered column requires all lower-numbered columns to be filled.
So to include a score (5th column), you also need to fill the name (4th)
column.
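
If your input only has chromosome, start, stop, and score, one way to get a
valid five-column file is to generate a placeholder name for each row. The
Python sketch below assumes tab-separated input; the function name and the
"feature_N" naming scheme are my own invention, not anything the UCSC tools
require.

```python
def add_placeholder_names(lines, prefix="feature"):
    """Turn 'chrom<TAB>start<TAB>end<TAB>score' rows into BED5 rows.

    A generated name is inserted as column 4 so the score lands in column 5.
    The UCSC BED definition describes the score as a value from 0 to 1000.
    """
    out = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, score = line.split("\t")
        name = f"{prefix}_{len(out) + 1}"  # hypothetical naming scheme
        out.append("\t".join([chrom, start, end, name, score]))
    return out


rows = ["chr1\t0\t50\t100\n", "chr1\t60\t90\t300\n"]
for bed_line in add_placeholder_names(rows):
    print(bed_line)
```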

If your features don't have names, then I would recommend using the BigWig
format instead. You can load a bedGraph file (chromosome, start, stop,
score) into a BigWig database. You'll also have access to the fast
statistical summary functions, which do work on the scores.
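
As a rough sketch of the conversion, the Python snippet below drops the name
column from five-column BED rows and sorts the result by chromosome and
start. The function name is hypothetical and tab-separated input is assumed.
As far as I know, UCSC's bedGraphToBigWig expects sorted input without
overlapping intervals, so overlapping features would need to be resolved
separately; this sketch does not do that.

```python
def bed_to_bedgraph(lines):
    """Convert BED5 rows (chrom, start, end, name, score) to bedGraph rows."""
    rows = []
    for line in lines:
        # Skip header and comment lines.
        if line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # no score column, nothing to convert
        chrom, start, end, _name, score = fields[:5]
        rows.append((chrom, int(start), int(end), score))
    # Sort by chromosome name, then numerically by start position.
    rows.sort(key=lambda r: (r[0], r[1]))
    return [f"{c}\t{s}\t{e}\t{sc}" for c, s, e, sc in rows]


bed = ["chr1\t60\t90\tb\t300\n", "chr1\t0\t50\ta\t100\n"]
for bedgraph_line in bed_to_bedgraph(bed):
    print(bedgraph_line)
```

The output can then be written to a file and passed to bedGraphToBigWig
together with a chromosome sizes file.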

Hope that helps.
Tim

On 7/3/11 3:48 AM, "Daniel Lang" <Daniel.Lang at biologie.uni-freiburg.de>
wrote:

>Hi,
>
>quick question about the BigBed adaptor: Is it correct that the bin and
>summary functions only return statistics about the number of features in
>the defined intervals?
>I was expecting them to deliver statistics about the score if the
>respective bb file has a defined score field.
>If this is true, does this also mean that I cannot plot the distribution
>of scores in BigBed files in gbrowse?
>
>This is the first time I'm using BigBed, maybe I'm doing something
>wrong...
>
>I had some trouble formatting the bed files correctly in order to see
>the score in the features returned by the Bio::DB::BigBed::features()
>routine. It seems the bigbed entries will only have a correctly assigned
>score field if you also provide a non-empty name field. Initially I
>thought that the order of columns is irrelevant if you use an .as file
>in the bedToBigBed call, but that doesn't seem to be the case.
>
>Best,
>Daniel
>-- 
>
>Dr. Daniel Lang
>University of Freiburg, Plant Biotechnology
>Schaenzlestr. 1, D-79104 Freiburg
>fax:        +49 761 203 6945
>phone:      +49 761 203 6989
>homepage:   http://www.plant-biotech.net/
>            http://www.cosmoss.org/
>e-mail:     daniel.lang at biologie.uni-freiburg.de
>
>#################################################
>My software never has bugs.
>It just develops random features.
>#################################################
>
>_______________________________________________
>Gmod-gbrowse mailing list
>Gmod-gbrowse at lists.sourceforge.net
>https://lists.sourceforge.net/lists/listinfo/gmod-gbrowse