Getting Started
D3.js is a popular JavaScript library for creating extremely flexible SVG graphics. However, D3 has (at least in my experience) a pretty steep learning curve, and understanding its core concepts requires some basics in HTML, CSS and JavaScript. ddplot aims to simplify the process with a set of functions that render several kinds of graphics through a simple R API. Finally, ddplot is built upon the amazing r2d3 package, which makes it a breeze to interface D3.js with R, so a big thanks to its developers.
scatterPlot()
Let’s work with the mpg data frame from the ggplot2 package.
library(ddplot)
library(ggplot2) # needed for the mpg data frame
scatterPlot(
data = mpg,
x = "hwy",
y = "cty",
xtitle = "hwy variable",
ytitle = "cty variable",
title = "cty and hwy relationship",
titleFontSize = 20
)
In comparison to ggplot2, graphics customization in ddplot is limited. Nonetheless, you get a fully vectorized SVG, which is cool.
scatterPlot(
data = mpg,
x = "displ",
y = "cty",
col = "tomato",
bgcol = "pink",
size = 3,
stroke = "royalblue",
strokeWidth = 1,
xtitle = "displ variable",
ytitle = "cty variable",
xticks = 3,
yticks = 3)
histogram()
The [histogram()](../reference/histogram.html) function allows you to visualize the distribution of a vector of data:
histogram(
x = mpg$hwy,
bins = 20,
fill = "crimson",
stroke = "white",
strokeWidth = 1,
title = "Distribution of the hwy variable",
width = "20",
height = "10"
)
animatedHistogram()
This function allows you to create a one-click histogram animation. Useful for presentation purposes. Click on the following empty plot and see what happens:
animatedHistogram(
x = mpg$hwy,
duration = 2000,
delay = 100,
fill = "lime",
stroke = "white",
bgcol = "white"
)
Note that you can customize the animation using the two parameters duration and delay.
barChart()
The barChart() function allows you to create bar charts; however, you need to aggregate the data beforehand. In the following example, we plot the average cty for each manufacturer using the dplyr package.
library(dplyr)
mpg %>% group_by(manufacturer) %>%
summarise(mean_cty = mean(cty)) %>%
barChart(
x = "manufacturer",
y = "mean_cty",
xFontSize = 10,
yFontSize = 10,
fill = "orange",
strokeWidth = 2,
ytitle = "average cty value",
title = "Average City Miles per Gallon by manufacturer"
)
The bars can easily be sorted in ascending or descending order using the sort parameter:
mpg %>% group_by(manufacturer) %>%
summarise(mean_cty = mean(cty)) %>%
barChart(
x = "manufacturer",
y = "mean_cty",
sort = "ascending",
xFontSize = 10,
yFontSize = 10,
fill = "orange",
strokeWidth = 1,
ytitle = "average cty value",
title = "Average City Miles per Gallon by manufacturer",
titleFontSize = 16
)
horzBarChart()
If you have many categories, it might be a good idea to go for a horizontal bar chart. It has the same parameters as the [barChart()](../reference/barChart.html) function, except that the x-axis parameter is named value and the y-axis parameter is named label. This naming convention aims to avoid confusion about which axis holds which variable.
If we want to replicate the above graphic in a horizontal way, we can do:
mpg %>% group_by(manufacturer) %>%
summarise(mean_cty = mean(cty)) %>%
horzBarChart(
label = "manufacturer",
value = "mean_cty",
sort = "ascending",
labelFontSize = 10,
valueFontSize = 10,
fill = "orange",
stroke = "crimson",
strokeWidth = 1,
valueTitle = "average cty value",
title = "Average City Miles per Gallon by manufacturer",
titleFontSize = 16
)
As in [barChart()](../reference/barChart.html), we can also sort in descending order:
mpg %>% group_by(manufacturer) %>%
summarise(mean_cty = mean(cty)) %>%
horzBarChart(
label = "manufacturer",
value = "mean_cty",
sort = "descending",
labelFontSize = 10,
valueFontSize = 10,
bgcol = "black",
axisCol = "white",
fill = "white",
stroke = "white",
strokeWidth = 1,
valueTitle = "average cty value",
labelTitle = "Manufacturers",
title = "Average City Miles per Gallon by manufacturer",
titleFontSize = 16
)
lollipopChart()
A lollipop chart follows the same behavior as a bar chart, but instead of bars you get lollipops, hence the name. Below is an example of a lollipop chart with ddplot:
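A minimal sketch of such a chart, reusing the per-manufacturer aggregation from the bar chart section. The x, y, sort, circleFill and circleRadius arguments all appear elsewhere on this page, but the assumption that sort = "ascending" behaves as it does in barChart(), and the specific values chosen, are illustrative only:

```r
library(dplyr)
library(ggplot2) # for the mpg data frame

# Aggregate beforehand, as with barChart(): average cty per manufacturer
mean_cty_by_manufacturer <- mpg %>%
  group_by(manufacturer) %>%
  summarise(mean_cty = mean(cty))

# Guarded so the aggregation above still runs when ddplot is not installed
if (requireNamespace("ddplot", quietly = TRUE)) {
  mean_cty_by_manufacturer %>%
    ddplot::lollipopChart(
      x = "manufacturer",
      y = "mean_cty",
      sort = "ascending", # assumed to work as in barChart()
      circleFill = "orange",
      circleRadius = 5
    )
}
```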
It’s possible to grasp the distribution of some variable according to a specific categorical variable using the same function:
mpg %>% filter(year == 2008) %>%
lollipopChart(
x = "manufacturer",
y = "hwy",
circleFill = 'red',
circleStroke = 'orange',
circleRadius = 5,
sort = "none",
xFontSize = 10
)
From the above, it’s quite easy to notice that although Toyota has two cars with high highway miles per gallon (hwy), it also produces many other vehicles with poor hwy.
horzLollipop()
As with bar charts, if you have a variable with many categories, you can work with the horizontal version of [lollipopChart()](../reference/lollipopChart.html), which is [horzLollipop()](../reference/horzLollipop.html):
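As a hedged illustration, the sketch below counts the vehicles in each class of mpg and passes the counts to horzLollipop(). The label and value arguments are the ones used in the next example on this page; the choice of class counts and of the styling values is an assumption made only for this sketch:

```r
library(dplyr)
library(ggplot2) # for the mpg data frame

# One row per vehicle class, with the number of vehicles in column n
vehicles_per_class <- mpg %>%
  count(class)

# Guarded so the counting step still runs when ddplot is not installed
if (requireNamespace("ddplot", quietly = TRUE)) {
  vehicles_per_class %>%
    ddplot::horzLollipop(
      label = "class",
      value = "n",
      circleFill = "red",
      circleRadius = 5,
      sort = "none"
    )
}
```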
You can also do:
mpg %>% filter(year == 2008) %>%
horzLollipop(
label = "manufacturer",
value = "hwy",
circleFill = 'red',
circleStroke = 'orange',
circleRadius = 5,
sort = "none"
)
pieChart()
Pie charts and donut charts are pretty straightforward to set up. We’ll use a sample from the starwars data frame to plot a simple pie chart.
# starwars is a data frame shipped with the dplyr package
mini_starwars <- starwars %>% tidyr::drop_na(mass) %>%
sample_n(size = 5) # getting 5 random values
pieChart(
data = mini_starwars,
value = "mass",
label = "name"
)
Using the padRadius, padAngle and cornerRadius parameters, one can get fancier pie charts:
pieChart(
data = mini_starwars,
value = "mass",
label = "name",
padRadius = 200,
padAngle = 0.1,
cornerRadius = 50,
innerRadius = 10
)
If you need a donut chart, you just need to play with the innerRadius parameter:
pieChart(
data = mini_starwars,
value = "mass",
label = "name",
innerRadius = 120,
cornerRadius = 20,
title = "5 Starwars characters ranked by their mass",
titleFontSize = 16,
bgcol = "yellow"
)
lineChart()
The [lineChart()](../reference/lineChart.html) function is used to plot time series data. The user must provide a date variable in yyyy-mm-dd format. In the following example, we’ll use the built-in AirPassengers ts data and convert it to a classical data frame:
# 1. converting AirPassengers to a tidy data frame
airpassengers <- data.frame(
passengers = as.matrix(AirPassengers),
date = zoo::as.Date(time(AirPassengers))
)
# 2. plotting the line chart
lineChart(
data = airpassengers,
x = "date",
y = "passengers"
)
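If you would rather not depend on zoo for the date conversion, the same yyyy-mm-dd column can be built with base R. This is only a sketch: it relies on AirPassengers being a monthly series that starts in January 1949, and the name airpassengers_base is introduced here purely for illustration:

```r
# Build one date per observation: monthly steps starting at 1949-01-01
airpassengers_base <- data.frame(
  passengers = as.numeric(AirPassengers),
  date = seq(
    from = as.Date("1949-01-01"),
    by = "month",
    length.out = length(AirPassengers)
  )
)

# Date columns print in yyyy-mm-dd format by default
head(airpassengers_base, 3)
```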
You can modify the line interpolation using the curve parameter:
lineChart(
data = airpassengers,
x = "date",
y = "passengers",
curve = "curveStep"
)
lineChart(
data = airpassengers,
x = "date",
y = "passengers",
curve = "curveCardinal"
)
lineChart(
data = airpassengers,
x = "date",
y = "passengers",
curve = "curveBasis"
)
animLineChart()
Heavily inspired by Jure Stabuc’s example, the [animLineChart()](../reference/animLineChart.html) function creates an empty SVG; each time you click on it, a line chart animation starts. Note that the line remains visible after the animation ends. Go ahead, click on the empty graphic below:
animLineChart(
data = airpassengers,
x = "date",
y = "passengers",
duration = 10000, # in milliseconds (10 seconds)
curve = "curveCardinal"
)
areaChart()
[areaChart()](../reference/areaChart.html) works similarly, except that instead of a line you get an area.
# 1. converting AirPassengers to a tidy data frame
airpassengers <- data.frame(
passengers = as.matrix(AirPassengers),
date = zoo::as.Date(time(AirPassengers))
)
# 2. plotting the area chart
areaChart(
data = airpassengers,
x = "date",
y = "passengers",
fill = "purple",
bgcol = "white"
)
areaBand()
[areaBand()](../reference/areaBand.html) lets you plot a filled area between two y-values. For the sake of the example, let’s create an additional column passengers_upper that adds 40 passengers to each observation:
airpassengers <- data.frame(
passengers_lower = as.matrix(AirPassengers),
passengers_upper = as.matrix(AirPassengers) + 40,
date = zoo::as.Date(time(AirPassengers))
)
areaBand(
data = airpassengers,
x = "date",
yLower = "passengers_lower",
yUpper = "passengers_upper",
fill = "yellow",
stroke = "black"
)
stackedAreaChart()
This function allows you to create a stacked area chart. You need two components:
- A data frame in wide format (see an example below). If it’s in long format, you can use pivot_wider() from the tidyr package to make it wider.
- A date variable in yyyy-mm-dd format that will be plotted on the x-axis.
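The long-to-wide conversion mentioned in the list above can be sketched as follows. The long_data frame and its sector and value column names are hypothetical and exist only for this illustration:

```r
library(tidyr)

# Hypothetical long-format data: one row per (date, sector) pair
long_data <- data.frame(
  date = rep(c("2000-01-01", "2000-02-01"), each = 2),
  sector = rep(c("Trade", "Leisure"), times = 2),
  value = c(2000, 1782, 1023, 1779)
)

# Spread the sectors into one column each, keeping one row per date
wide_data <- pivot_wider(
  long_data,
  names_from = sector,
  values_from = value
)

wide_data
```

The result has one date column plus one numeric column per sector, which is the shape the stacked area chart example below uses.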
Let’s work with the following data frame (shortened) provided by Mike Bostock in his stacked area chart example:
data <- data.frame(
date = c(
"2000-01-01", "2000-02-01", "2000-03-01", "2000-04-01",
"2000-05-01", "2000-06-01", "2000-07-01",
"2000-08-01", "2000-09-01", "2000-10-01"
),
Trade = c(
2000,1023, 983, 2793, 1821, 1837, 1792, 1853, 791, 739
),
Manufacturing = c(
734, 694, 739, 736, 685, 621, 708, 685, 667, 693
),
Leisure = c(
1782, 1779, 1789, 658, 675, 833, 786, 675, 636, 691
),
Agriculture = c(
655, 587,623, 517, 561, 2545, 636, 584, 559, 2504
)
)
data
#> date Trade Manufacturing Leisure Agriculture
#> 1 2000-01-01 2000 734 1782 655
#> 2 2000-02-01 1023 694 1779 587
#> 3 2000-03-01 983 739 1789 623
#> 4 2000-04-01 2793 736 658 517
#> 5 2000-05-01 1821 685 675 561
#> 6 2000-06-01 1837 621 833 2545
#> 7 2000-07-01 1792 708 786 636
#> 8 2000-08-01 1853 685 675 584
#> 9 2000-09-01 791 667 636 559
#> 10 2000-10-01 739 693 691 2504
Note that when running [stackedAreaChart()](../reference/stackedAreaChart.html), all the variables available in the data frame will be plotted. If you want to restrict the plotting to specific variables, just drop the unneeded columns before calling the function.
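Dropping columns can be sketched with base R subsetting. The data_demo frame below is a three-row stand-in for the data frame shown above (same column names, first three rows), and the chart arguments are copied from the calls used later in this section:

```r
# Stand-in for the data frame above: first three rows only
data_demo <- data.frame(
  date = c("2000-01-01", "2000-02-01", "2000-03-01"),
  Trade = c(2000, 1023, 983),
  Manufacturing = c(734, 694, 739),
  Leisure = c(1782, 1779, 1789),
  Agriculture = c(655, 587, 623)
)

# Keep the date column plus only the series you want to plot
data_subset <- data_demo[, c("date", "Trade", "Leisure")]

# Guarded so the subsetting step still runs when ddplot is not installed
if (requireNamespace("ddplot", quietly = TRUE)) {
  ddplot::stackedAreaChart(
    data = data_subset,
    x = "date",
    legendTextSize = 14,
    curve = "curveCardinal",
    colorCategory = "Accent"
  )
}
```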
You can modify the color scheme using the colorCategory parameter:
stackedAreaChart(
data = data,
x = "date",
legendTextSize = 14,
curve = "curveCardinal",
colorCategory = "Accent",
bgcol = "white",
stroke = "black",
strokeWidth = 1
)
stackedAreaChart(
data = data,
x = "date",
legendTextSize = 14,
curve = "curveBasis",
colorCategory = "Set3",
bgcol = "black",
axisCol = "white",
xticks = 4,
stroke = "black"
)
You can find the list of D3 categorical color schemes in the d3-scale-chromatic documentation.
Finally, if you hover over the chart, you’ll notice a tooltip that identifies the different area categories.
barChartRace()
This function allows you to create an animated bar chart race. [barChartRace()](../reference/barChartRace.html) is similar to [barChart()](../reference/barChart.html) but takes a third variable mapped to the time dimension, with options for styling transitions.
Let’s make a bar chart race of population growth among various countries using a subset of the gapminder dataset from the {gapminder} package:
gapminder_subset <- gapminder::gapminder %>%
select(country, year, pop) %>%
filter(country %in% c("Japan", "Mexico", "Germany", "Brazil", "Philippines", "Vietnam")) %>%
mutate(pop = pop/1e6)
gapminder_subset %>%
slice_sample(n = 10)
#> year pop country
#> 1 2007 91.07729 Philippines
#> 2 1997 76.04900 Vietnam
#> 3 1972 107.18827 Japan
#> 4 1967 39.46391 Vietnam
#> 5 1952 30.14432 Mexico
#> 6 1987 142.93808 Brazil
#> 7 1997 168.54672 Brazil
#> 8 1962 41.12148 Mexico
#> 9 1952 69.14595 Germany
#> 10 1957 91.56301 Japan
In this example, we simply call [barChartRace()](../reference/barChartRace.html) like [barChart()](../reference/barChart.html), but with an additional variable mapped to the time dimension, specified with time = "year":
gapminder_subset %>%
barChartRace(
x = "pop",
y = "country",
time = "year",
ytitle = "Country",
xtitle = "Population (in millions)",
title = "Bar chart race of country populations"
)
You can also stylize transitions with the frameDur, transitionDur, and ease arguments. For example, setting the time spent pausing on each frame to zero with frameDur = 0 will create a smooth animation:
gapminder_subset %>%
barChartRace(
x = "pop",
y = "country",
time = "year",
transitionDur = 1000,
frameDur = 0,
ytitle = "Country",
xtitle = "Population (in millions)",
title = "Bar chart race of country populations"
)
As you might have noticed, the value of the column passed to the time argument is automatically displayed as a label at the bottom-right corner of the plot panel. We can stylize this with a list of options passed to the timeLabelOpts argument (or turn it off with timeLabel = FALSE). We also give the bars a little bounce here with ease = "BackInOut" for fun.
gapminder_subset %>%
barChartRace(
x = "pop",
y = "country",
time = "year",
ease = "BackInOut",
ytitle = "Country",
xtitle = "Population (in millions)",
title = "Bar chart race of country populations",
timeLabelOpts = list(
size = 40,
prefix = "Year: ",
xOffset = 0.2
)
)