Hacking Highcharter: observations per group in boxplots
Highcharts has long been a favourite visualisation library of mine, and I’ve written before about Highcharter, my preferred way to use Highcharts in R.
Highcharter has a nice, simple function, hcboxplot(), to generate boxplots. I recently generated some for a project at work and was asked: can we see how many observations make up the distribution for each category? This is a common issue with boxplots and there are a few solutions: overlay the box on a jitter plot to get some idea of the number of points, or try a violin plot or a so-called bee-swarm plot. In Highcharts, I figured there should be a method to get the number of observations, which could then be displayed in a tooltip on mouse-over.
There wasn’t, so I wrote one like this.
First, you’ll need to install highcharter from GitHub to make it work with the latest dplyr.
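A GitHub install might look something like the sketch below. It assumes you have the remotes package available and that the development version lives in the jbkunst/highcharter repository; both are assumptions on my part, so adjust as needed.

```r
# Assumption: remotes is installed; if not, uncomment the next line.
# install.packages("remotes")

# Assumption: the development version of highcharter is at jbkunst/highcharter.
remotes::install_github("jbkunst/highcharter")
```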
Next, we generate a reproducible dataset using the wakefield package. For some reason, we want to look at age by gender, but only for redheads:
library(dplyr)
library(tidyr)
library(highcharter)
library(wakefield)
library(tibble)
set.seed(1001)
sample_data <- r_data_frame(
  n = 1000,
  age(x = 10:90),
  gender,
  hair
) %>%
  # wakefield capitalises column names (Age, Gender, Hair)
  filter(Hair == "Red")
sample_data %>%
  count(Gender)
## # A tibble: 2 x 2
##   Gender     n
##   <fctr> <int>
## 1   Male    62
## 2 Female    48
This gives us 62 male and 48 female redheads. The tibble package is required because later on, our boxplot function ...
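To illustrate the general idea (this is not the function from this post), here is a minimal base R sketch that collects the boxplot statistics together with the number of observations for each group. The toy data and the names toy, by_group and box_summary are made up for the example; a column like obs is the sort of value that could then be shown in a tooltip.

```r
# Hypothetical toy data; column names mirror those in sample_data.
toy <- data.frame(
  Gender = factor(c("Male", "Male", "Male", "Female", "Female"),
                  levels = c("Male", "Female")),
  Age    = c(20, 30, 40, 25, 35)
)

# Split ages by group, then gather the five-number summary and the count.
by_group <- split(toy$Age, toy$Gender)
box_summary <- do.call(rbind, lapply(names(by_group), function(g) {
  s <- boxplot.stats(by_group[[g]])
  data.frame(
    group  = g,
    low    = s$stats[1],
    q1     = s$stats[2],
    median = s$stats[3],
    q3     = s$stats[4],
    high   = s$stats[5],
    obs    = s$n  # number of non-missing observations in the group
  )
}))
box_summary
```

boxplot.stats() already returns the count as n, so no separate tally is needed; the same numbers could equally be produced with dplyr's group_by() and summarise().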
Source: What You're Doing Is Rather Desperate - Category: Bioinformatics Authors: nsaunders Tags: R statistics Source Type: blogs