Title: | Extension to 'ggplot2' for Plotting Stats |
---|---|
Description: | Provides new statistics, new geometries and new positions for 'ggplot2' and a suite of functions to facilitate the creation of statistical plots. |
Authors: | Joseph Larmarange [aut, cre] |
Maintainer: | Joseph Larmarange <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.7.0.9000 |
Built: | 2024-11-10 06:18:55 UTC |
Source: | https://github.com/larmarange/ggstats |
Augment a chi-squared test and compute phi coefficients
augment_chisq_add_phi(x)
x |
a chi-squared test as returned by |
Phi coefficients are a measure of the degree of association between two binary variables.
A value between -1.0 and -0.7 indicates a strong negative association.
A value between -0.7 and -0.3 indicates a weak negative association.
A value between -0.3 and +0.3 indicates little or no association.
A value between +0.3 and +0.7 indicates a weak positive association.
A value between +0.7 and +1.0 indicates a strong positive association.
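As a rough illustration of the quantity being interpreted above, the following base R sketch computes the phi coefficient of a single 2 x 2 table with the textbook formula. The helper name `phi_2x2` and the toy counts are hypothetical, and this is not necessarily how augment_chisq_add_phi() derives its per-cell values.

```r
# Hypothetical helper: phi coefficient of a 2 x 2 contingency table,
# using (n11 * n22 - n12 * n21) / sqrt(product of the four margins).
phi_2x2 <- function(tab) {
  stopifnot(all(dim(tab) == c(2, 2)))
  n11 <- tab[1, 1]
  n12 <- tab[1, 2]
  n21 <- tab[2, 1]
  n22 <- tab[2, 2]
  (n11 * n22 - n12 * n21) /
    sqrt((n11 + n12) * (n21 + n22) * (n11 + n21) * (n12 + n22))
}

# Toy counts (made up for illustration)
tab <- matrix(c(30, 10, 10, 30), nrow = 2)
phi <- phi_2x2(tab)
phi
```

With these toy counts the result falls in the +0.3 to +0.7 band, i.e. a weak positive association according to the scale above.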
A tibble.
stat_cross(), GDAtools::phi.table() or psych::phi()
tab <- xtabs(Freq ~ Sex + Class, data = as.data.frame(Titanic))
augment_chisq_add_phi(chisq.test(tab))
These geometries are variations of ggplot2::geom_bar() and ggplot2::geom_text() but provide different sets of default values.
geom_diverging(
  mapping = NULL,
  data = NULL,
  position = "diverging",
  ...,
  complete = "fill",
  default_by = "total"
)

geom_likert(
  mapping = NULL,
  data = NULL,
  position = "likert",
  ...,
  complete = "fill",
  default_by = "x"
)

geom_pyramid(
  mapping = NULL,
  data = NULL,
  position = "diverging",
  ...,
  complete = NULL,
  default_by = "total"
)

geom_diverging_text(
  mapping = ggplot2::aes(!!!auto_contrast),
  data = NULL,
  position = position_diverging(0.5),
  ...,
  complete = "fill",
  default_by = "total"
)

geom_likert_text(
  mapping = ggplot2::aes(!!!auto_contrast),
  data = NULL,
  position = position_likert(0.5),
  ...,
  complete = "fill",
  default_by = "x"
)

geom_pyramid_text(
  mapping = ggplot2::aes(!!!auto_contrast),
  data = NULL,
  position = position_diverging(0.5),
  ...,
  complete = NULL,
  default_by = "total"
)
mapping |
Optional set of aesthetic mappings. |
data |
The data to be displayed in this layer. |
position |
A position adjustment to use on the data for this layer. |
... |
Other arguments passed on to |
complete |
An aesthetic for which unobserved values should be completed,
see |
default_by |
Name of an aesthetic determining denominators by default,
see |
geom_diverging() is designed for stacked diverging bar plots, using position_diverging().
geom_likert() is designed for Likert-type items, using position_likert() (each bar sums to 100%).
geom_pyramid() is similar to geom_diverging() but uses proportions of the total instead of counts.
To add labels on the bar plots, simply use geom_diverging_text(), geom_likert_text(), or geom_pyramid_text().
All these geometries rely on stat_prop().
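To make the idea of a diverging stack more concrete, here is a minimal base R sketch that turns the values of one bar into lower and upper bounds centred on zero. The helper `diverging_bounds` is hypothetical, and the interpretation of `cutoff` (number of categories placed on the negative side, possibly fractional, defaulting to half of them) is an assumption about position_diverging() and position_likert(), not something stated on this page.

```r
# Hypothetical helper: stack values, then shift them so that the first
# `cutoff` categories sit below zero. A fractional cutoff splits one category.
diverging_bounds <- function(values, cutoff = length(values) / 2) {
  upper <- cumsum(values)   # bounds when stacking from zero
  lower <- upper - values
  full <- floor(cutoff)     # categories entirely on the negative side
  frac <- cutoff - full     # share of the next category on the negative side
  offset <- sum(values[seq_len(full)])
  if (frac > 0) {
    offset <- offset + frac * values[full + 1]
  }
  data.frame(lower = lower - offset, upper = upper - offset)
}

# Five answer categories whose proportions sum to 1 (toy values)
props <- c(0.1, 0.2, 0.4, 0.2, 0.1)
bounds <- diverging_bounds(props)
bounds
```

With an odd number of categories and the default cutoff, the middle category straddles zero, which is the usual layout for a centred Likert plot.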
library(ggplot2)

ggplot(diamonds) +
  aes(x = clarity, fill = cut) +
  geom_diverging()

ggplot(diamonds) +
  aes(x = clarity, fill = cut) +
  geom_diverging(position = position_diverging(cutoff = 4))

ggplot(diamonds) +
  aes(y = clarity, fill = cut) +
  geom_likert() +
  geom_likert_text()

ggplot(diamonds) +
  aes(y = clarity, fill = cut) +
  geom_likert() +
  geom_likert_text(
    aes(
      label = label_percent_abs(accuracy = 1, hide_below = .10)(
        after_stat(prop)
      ),
      colour = after_scale(hex_bw(.data$fill))
    )
  )

d <- Titanic |> as.data.frame()

ggplot(d) +
  aes(y = Class, fill = Sex, weight = Freq) +
  geom_diverging() +
  geom_diverging_text()

ggplot(d) +
  aes(y = Class, fill = Sex, weight = Freq) +
  geom_pyramid() +
  geom_pyramid_text()
geom_prop_bar() and geom_prop_text() are variations of ggplot2::geom_bar() and ggplot2::geom_text() using stat_prop(), with custom default aesthetics: after_stat(prop) for x or y, and scales::percent(after_stat(prop)) for label.
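The proportions behind these defaults can be approximated by hand. The sketch below uses base R only and assumes the denominator is the weighted total within each level of the grouping variable (here Class); it illustrates the idea and is not the internal code of stat_prop().

```r
# Weighted counts per Class x Survived cell
d <- as.data.frame(Titanic)
counts <- aggregate(Freq ~ Class + Survived, data = d, FUN = sum)

# Proportion of each cell within its Class (the assumed denominator)
counts$prop <- counts$Freq / ave(counts$Freq, counts$Class, FUN = sum)

# A percentage label similar in spirit to scales::percent()
counts$label <- sprintf("%.0f%%", 100 * counts$prop)
counts
```

Each Class then sums to 100%, which is what a stacked proportion bar displays.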
geom_prop_bar(
  mapping = NULL,
  data = NULL,
  position = "stack",
  ...,
  complete = NULL,
  default_by = "x"
)

geom_prop_text(
  mapping = ggplot2::aes(!!!auto_contrast),
  data = NULL,
  position = ggplot2::position_stack(0.5),
  ...,
  complete = NULL,
  default_by = "x"
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
complete |
Name (character) of an aesthetic for which statistics should be completed for unobserved values (see example). |
default_by |
If the by aesthetic is not available, name of another
aesthetic that will be used to determine the denominators (e.g. |
library(ggplot2)

d <- as.data.frame(Titanic)

ggplot(d) +
  aes(y = Class, fill = Survived, weight = Freq) +
  geom_prop_bar() +
  geom_prop_text()

ggplot(d) +
  aes(
    y = Class,
    fill = Survived,
    weight = Freq,
    x = after_stat(count),
    label = after_stat(count)
  ) +
  geom_prop_bar() +
  geom_prop_text()
Add alternating background color along the y-axis. The geom takes default aesthetics odd and even that receive color codes.
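Conceptually, each strip is a rectangle centred on a row position and filled with the odd or even colour in turn. The base R sketch below illustrates that geometry; the helper `strip_bounds` is hypothetical, and the default colour codes are borrowed from the strips_odd and strips_even defaults of ggcoef_plot() shown further down this page, which may differ from the defaults of geom_stripped_rows() itself.

```r
# Hypothetical helper: bounds and fill of alternating strips for n_rows
# discrete positions (1, 2, 3, ...) along the y-axis.
strip_bounds <- function(n_rows, width = 1, nudge_y = 0,
                         odd = "#11111111", even = "#00000000") {
  row <- seq_len(n_rows)
  centre <- row + nudge_y
  data.frame(
    ymin = centre - width / 2,
    ymax = centre + width / 2,
    fill = ifelse(row %% 2 == 1, odd, even),
    stringsAsFactors = FALSE
  )
}

strips <- strip_bounds(4)
strips
```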
geom_stripped_rows(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  show.legend = NA,
  inherit.aes = TRUE,
  xfrom = -Inf,
  xto = Inf,
  width = 1,
  nudge_y = 0
)

geom_stripped_cols(
  mapping = NULL,
  data = NULL,
  stat = "identity",
  position = "identity",
  ...,
  show.legend = NA,
  inherit.aes = TRUE,
  yfrom = -Inf,
  yto = Inf,
  width = 1,
  nudge_x = 0
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
xfrom, xto |
limits of the strips along the x-axis |
width |
width of the strips |
yfrom, yto |
limits of the strips along the y-axis |
nudge_x, nudge_y |
horizontal or vertical adjustment to nudge strips by |
A ggplot2 plot with the added geometry.
data(tips, package = "reshape")

library(ggplot2)

p <- ggplot(tips) +
  aes(x = time, y = day) +
  geom_count() +
  theme_light()

p
p + geom_stripped_rows()
p + geom_stripped_cols()
p + geom_stripped_rows() + geom_stripped_cols()

p <- ggplot(tips) +
  aes(x = total_bill, y = day) +
  geom_count() +
  theme_light()

p
p + geom_stripped_rows()
p + geom_stripped_rows() + scale_y_discrete(expand = expansion(0, 0.5))
p + geom_stripped_rows(xfrom = 10, xto = 35)
p + geom_stripped_rows(odd = "blue", even = "yellow")
p + geom_stripped_rows(odd = "blue", even = "yellow", alpha = .1)
p + geom_stripped_rows(odd = "#00FF0022", even = "#FF000022")

p + geom_stripped_cols()
p + geom_stripped_cols(width = 10)
p + geom_stripped_cols(width = 10, nudge_x = 5)
ggcascade(
  .data,
  ...,
  .weights = NULL,
  .by = NULL,
  .nrow = NULL,
  .ncol = NULL,
  .add_n = TRUE,
  .text_size = 4,
  .arrows = TRUE
)

compute_cascade(.data, ..., .weights = NULL, .by = NULL)

plot_cascade(
  .data,
  .by = NULL,
  .nrow = NULL,
  .ncol = NULL,
  .add_n = TRUE,
  .text_size = 4,
  .arrows = TRUE
)
.data |
A data frame, or data frame extension (e.g. a tibble). For
|
... |
< |
.weights |
< |
.by |
< |
.nrow, .ncol |
Number of rows and columns, for faceted plots. |
.add_n |
Display the number of observations? |
.text_size |
Size of the labels, passed to |
.arrows |
Display arrows between statuses? |
ggcascade() calls compute_cascade() to generate a data set passed to plot_cascade(). Use compute_cascade() and plot_cascade() for more control.
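The kind of summary a cascade is built from can be sketched in base R: one logical condition per status, and the number of observations meeting each. The example below uses the built-in mtcars data and made-up conditions; the column names `status`, `n` and `prop` are illustrative assumptions and need not match the tibble returned by compute_cascade().

```r
# One logical vector per status, evaluated on the same data set
conditions <- list(
  all = rep(TRUE, nrow(mtcars)),
  "4 cylinders" = mtcars$cyl == 4,
  "4 cylinders & light" = mtcars$cyl == 4 & mtcars$wt < 2.5
)

# Number of observations satisfying each condition
cascade <- data.frame(
  status = names(conditions),
  n = vapply(conditions, sum, integer(1)),
  row.names = NULL,
  stringsAsFactors = FALSE
)

# Share of the first status retained at each step
cascade$prop <- cascade$n / cascade$n[1]
cascade
```

Because each condition is a subset of the previous one, the counts can only decrease from one status to the next, which is what gives the plot its cascade shape.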
A ggplot2 plot or a tibble.
ggplot2::diamonds |>
  ggcascade(
    all = TRUE,
    big = carat > .5,
    "big & ideal" = carat > .5 & cut == "Ideal"
  )

ggplot2::mpg |>
  ggcascade(
    all = TRUE,
    recent = year > 2000,
    "recent & economic" = year > 2000 & displ < 3,
    .by = cyl,
    .ncol = 3,
    .arrows = FALSE,
    .text_size = 3
  )

ggplot2::mpg |>
  ggcascade(
    all = TRUE,
    recent = year > 2000,
    "recent & economic" = year > 2000 & displ < 3,
    .by = pick(cyl, drv),
    .add_n = FALSE,
    .text_size = 2
  )
ggcoef_model(), ggcoef_table(), ggcoef_multinom(), ggcoef_multicomponents() and ggcoef_compare() use broom.helpers::tidy_plus_plus() to obtain a tibble of the model coefficients, apply additional data transformation and then pass the produced tibble to ggcoef_plot() to generate the plot.
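For readers unfamiliar with tidy coefficient tables, the base R sketch below assembles a comparable table by hand: one row per term with its estimate, confidence interval and p-value. It is only an approximation of the idea; broom.helpers::tidy_plus_plus() adds further columns (labels, reference rows and so on) that are not reproduced here, and the `significant` column is an illustrative assumption.

```r
mod <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris)

# 95% confidence intervals, matching the default conf.level = 0.95
ci <- confint(mod, level = 0.95)

coefs <- data.frame(
  term = names(coef(mod)),
  estimate = unname(coef(mod)),
  conf.low = unname(ci[, 1]),
  conf.high = unname(ci[, 2]),
  p.value = unname(coef(summary(mod))[, "Pr(>|t|)"]),
  stringsAsFactors = FALSE
)

# Flag terms below the significance threshold (1 - conf.level)
coefs$significant <- coefs$p.value <= 0.05

# Drop the intercept, mirroring the default intercept = FALSE
coefs <- coefs[coefs$term != "(Intercept)", ]
coefs
```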
ggcoef_model(
  model,
  tidy_fun = broom.helpers::tidy_with_broom_or_parameters,
  tidy_args = NULL,
  conf.int = TRUE,
  conf.level = 0.95,
  exponentiate = FALSE,
  variable_labels = NULL,
  term_labels = NULL,
  interaction_sep = " * ",
  categorical_terms_pattern = "{level}",
  add_reference_rows = TRUE,
  no_reference_row = NULL,
  intercept = FALSE,
  include = dplyr::everything(),
  add_pairwise_contrasts = FALSE,
  pairwise_variables = broom.helpers::all_categorical(),
  keep_model_terms = FALSE,
  pairwise_reverse = TRUE,
  emmeans_args = list(),
  significance = 1 - conf.level,
  significance_labels = NULL,
  show_p_values = TRUE,
  signif_stars = TRUE,
  return_data = FALSE,
  ...
)

ggcoef_table(
  model,
  tidy_fun = broom.helpers::tidy_with_broom_or_parameters,
  tidy_args = NULL,
  conf.int = TRUE,
  conf.level = 0.95,
  exponentiate = FALSE,
  variable_labels = NULL,
  term_labels = NULL,
  interaction_sep = " * ",
  categorical_terms_pattern = "{level}",
  add_reference_rows = TRUE,
  no_reference_row = NULL,
  intercept = FALSE,
  include = dplyr::everything(),
  add_pairwise_contrasts = FALSE,
  pairwise_variables = broom.helpers::all_categorical(),
  keep_model_terms = FALSE,
  pairwise_reverse = TRUE,
  emmeans_args = list(),
  significance = 1 - conf.level,
  significance_labels = NULL,
  show_p_values = FALSE,
  signif_stars = FALSE,
  table_stat = c("estimate", "ci", "p.value"),
  table_header = NULL,
  table_text_size = 3,
  table_stat_label = NULL,
  ci_pattern = "{conf.low}, {conf.high}",
  table_witdhs = c(3, 2),
  plot_title = NULL,
  ...
)

ggcoef_compare(
  models,
  type = c("dodged", "faceted"),
  tidy_fun = broom.helpers::tidy_with_broom_or_parameters,
  tidy_args = NULL,
  conf.int = TRUE,
  conf.level = 0.95,
  exponentiate = FALSE,
  variable_labels = NULL,
  term_labels = NULL,
  interaction_sep = " * ",
  categorical_terms_pattern = "{level}",
  add_reference_rows = TRUE,
  no_reference_row = NULL,
  intercept = FALSE,
  include = dplyr::everything(),
  add_pairwise_contrasts = FALSE,
  pairwise_variables = broom.helpers::all_categorical(),
  keep_model_terms = FALSE,
  pairwise_reverse = TRUE,
  emmeans_args = list(),
  significance = 1 - conf.level,
  significance_labels = NULL,
  return_data = FALSE,
  ...
)

ggcoef_multinom(
  model,
  type = c("dodged", "faceted", "table"),
  y.level_label = NULL,
  tidy_fun = broom.helpers::tidy_with_broom_or_parameters,
  tidy_args = NULL,
  conf.int = TRUE,
  conf.level = 0.95,
  exponentiate = FALSE,
  variable_labels = NULL,
  term_labels = NULL,
  interaction_sep = " * ",
  categorical_terms_pattern = "{level}",
  add_reference_rows = TRUE,
  no_reference_row = NULL,
  intercept = FALSE,
  include = dplyr::everything(),
  significance = 1 - conf.level,
  significance_labels = NULL,
  return_data = FALSE,
  table_stat = c("estimate", "ci", "p.value"),
  table_header = NULL,
  table_text_size = 3,
  table_stat_label = NULL,
  ci_pattern = "{conf.low}, {conf.high}",
  table_witdhs = c(3, 2),
  ...
)

ggcoef_multicomponents(
  model,
  type = c("dodged", "faceted", "table"),
  component_col = "component",
  component_label = NULL,
  tidy_fun = broom.helpers::tidy_with_broom_or_parameters,
  tidy_args = NULL,
  conf.int = TRUE,
  conf.level = 0.95,
  exponentiate = FALSE,
  variable_labels = NULL,
  term_labels = NULL,
  interaction_sep = " * ",
  categorical_terms_pattern = "{level}",
  add_reference_rows = TRUE,
  no_reference_row = NULL,
  intercept = FALSE,
  include = dplyr::everything(),
  significance = 1 - conf.level,
  significance_labels = NULL,
  return_data = FALSE,
  table_stat = c("estimate", "ci", "p.value"),
  table_header = NULL,
  table_text_size = 3,
  table_stat_label = NULL,
  ci_pattern = "{conf.low}, {conf.high}",
  table_witdhs = c(3, 2),
  ...
)

ggcoef_plot(
  data,
  x = "estimate",
  y = "label",
  exponentiate = FALSE,
  point_size = 2,
  point_stroke = 2,
  point_fill = "white",
  colour = NULL,
  colour_guide = TRUE,
  colour_lab = "",
  colour_labels = ggplot2::waiver(),
  shape = "significance",
  shape_values = c(16, 21),
  shape_guide = TRUE,
  shape_lab = "",
  errorbar = TRUE,
  errorbar_height = 0.1,
  errorbar_coloured = FALSE,
  stripped_rows = TRUE,
  strips_odd = "#11111111",
  strips_even = "#00000000",
  vline = TRUE,
  vline_colour = "grey50",
  dodged = FALSE,
  dodged_width = 0.8,
  facet_row = "var_label",
  facet_col = NULL,
  facet_labeller = "label_value"
)
model |
a regression model object |
tidy_fun |
( |
tidy_args |
Additional arguments passed to
|
conf.int |
( |
conf.level |
the confidence level to use for the confidence
interval if |
exponentiate |
if |
variable_labels |
( |
term_labels |
( |
interaction_sep |
( |
categorical_terms_pattern |
( |
add_reference_rows |
( |
no_reference_row |
( |
intercept |
( |
include |
( |
add_pairwise_contrasts |
( |
pairwise_variables |
( |
keep_model_terms |
( |
pairwise_reverse |
( |
emmeans_args |
( |
significance |
level (between 0 and 1) below which a
coefficient is considered to be significantly different from 0
(or 1 if |
significance_labels |
optional vector with custom labels for significance variable |
show_p_values |
if |
signif_stars |
if |
return_data |
if |
... |
parameters passed to |
table_stat |
statistics to display in the table, use any column name
returned by the tidier or |
table_header |
optional custom headers for the table |
table_text_size |
text size for the table |
table_stat_label |
optional named list of labeller functions for the displayed statistic (see examples) |
ci_pattern |
glue pattern for confidence intervals in the table |
table_witdhs |
relative widths of the forest plot and the coefficients table |
plot_title |
an optional plot title |
models |
named list of models |
type |
a dodged plot, a faceted plot or multiple table plots? |
y.level_label |
an optional named vector for labeling |
component_col |
name of the component column |
component_label |
an optional named vector for labeling components |
data |
a data frame containing data to be plotted,
typically the output of |
x, y |
variables mapped to the x and y axes |
point_size |
size of the points |
point_stroke |
thickness of the points |
point_fill |
fill colour for the points |
colour |
optional variable name to be mapped to colour aesthetic |
colour_guide |
should colour guide be displayed in the legend? |
colour_lab |
label of the colour aesthetic in the legend |
colour_labels |
labels argument passed to
|
shape |
optional variable name to be mapped to the shape aesthetic |
shape_values |
values of the different shapes to use in
|
shape_guide |
should shape guide be displayed in the legend? |
shape_lab |
label of the shape aesthetic in the legend |
errorbar |
should error bars be plotted? |
errorbar_height |
height of error bars |
errorbar_coloured |
should error bars be colored as the points? |
stripped_rows |
should stripped rows be displayed in the background? |
strips_odd |
color of the odd rows |
strips_even |
color of the even rows |
vline |
should a vertical line be drawn at 0 (or 1 if
|
vline_colour |
colour of vertical line |
dodged |
should points be dodged (according to the colour aesthetic)? |
dodged_width |
width value for |
facet_row |
variable name to be used for row facets |
facet_col |
optional variable name to be used for column facets |
facet_labeller |
labeller function to be used for labeling facets;
if labels are too long, you can use |
For more control, you can use the argument return_data = TRUE to get the produced tibble, apply any transformation of your own and then pass your customized tibble to ggcoef_plot().
A ggplot2 plot or a tibble if return_data = TRUE.
ggcoef_table(): a variation of ggcoef_model() adding a table with estimates, confidence intervals and p-values.
ggcoef_compare(): designed for displaying several models on the same plot.
ggcoef_multinom(): a variation of ggcoef_model() adapted to multinomial logistic regressions performed with nnet::multinom().
ggcoef_multicomponents(): a variation of ggcoef_model() adapted to multi-component models such as zero-inflated models or beta regressions. ggcoef_multicomponents() has been tested with pscl::zeroinfl(), pscl::hurdle() and betareg::betareg().
ggcoef_plot(): plot a tidy tibble of coefficients.
vignette("ggcoef_model")
mod <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris)
ggcoef_model(mod)

ggcoef_table(mod)
ggcoef_table(mod, table_stat = c("estimate", "ci"))
ggcoef_table(
  mod,
  table_stat_label = list(
    estimate = scales::label_number(.001)
  )
)
ggcoef_table(mod, table_text_size = 5, table_witdhs = c(1, 1))

# a logistic regression example
d_titanic <- as.data.frame(Titanic)
d_titanic$Survived <- factor(d_titanic$Survived, c("No", "Yes"))
mod_titanic <- glm(
  Survived ~ Sex * Age + Class,
  weights = Freq,
  data = d_titanic,
  family = binomial
)

# use 'exponentiate = TRUE' to get the Odds Ratio
ggcoef_model(mod_titanic, exponentiate = TRUE)
ggcoef_table(mod_titanic, exponentiate = TRUE)

# display intercepts
ggcoef_model(mod_titanic, exponentiate = TRUE, intercept = TRUE)

# customize terms labels
ggcoef_model(
  mod_titanic,
  exponentiate = TRUE,
  show_p_values = FALSE,
  signif_stars = FALSE,
  add_reference_rows = FALSE,
  categorical_terms_pattern = "{level} (ref: {reference_level})",
  interaction_sep = " x "
) +
  ggplot2::scale_y_discrete(labels = scales::label_wrap(15))

# display only a subset of terms
ggcoef_model(mod_titanic, exponentiate = TRUE, include = c("Age", "Class"))

# do not change points' shape based on significance
ggcoef_model(mod_titanic, exponentiate = TRUE, significance = NULL)

# a black and white version
ggcoef_model(
  mod_titanic,
  exponentiate = TRUE,
  colour = NULL,
  stripped_rows = FALSE
)

# show dichotomous terms on one row
ggcoef_model(
  mod_titanic,
  exponentiate = TRUE,
  no_reference_row = broom.helpers::all_dichotomous(),
  categorical_terms_pattern =
    "{ifelse(dichotomous, paste0(level, ' / ', reference_level), level)}",
  show_p_values = FALSE
)

data(tips, package = "reshape")
mod_simple <- lm(tip ~ day + time + total_bill, data = tips)
ggcoef_model(mod_simple)

# custom variable labels
# you can use the labelled package to define variable labels
# before computing model
if (requireNamespace("labelled")) {
  tips_labelled <- tips |>
    labelled::set_variable_labels(
      day = "Day of the week",
      time = "Lunch or Dinner",
      total_bill = "Bill's total"
    )
  mod_labelled <- lm(tip ~ day + time + total_bill, data = tips_labelled)
  ggcoef_model(mod_labelled)
}

# you can provide custom variable labels with 'variable_labels'
ggcoef_model(
  mod_simple,
  variable_labels = c(
    day = "Week day",
    time = "Time (lunch or dinner ?)",
    total_bill = "Total of the bill"
  )
)

# if labels are too long, you can use 'facet_labeller' to wrap them
ggcoef_model(
  mod_simple,
  variable_labels = c(
    day = "Week day",
    time = "Time (lunch or dinner ?)",
    total_bill = "Total of the bill"
  ),
  facet_labeller = ggplot2::label_wrap_gen(10)
)

# do not display variable facets but add colour guide
ggcoef_model(mod_simple, facet_row = NULL, colour_guide = TRUE)

# also works with polynomial terms
mod_poly <- lm(
  tip ~ poly(total_bill, 3) + day,
  data = tips,
)
ggcoef_model(mod_poly)

# or with different type of contrasts
# for sum contrasts, the value of the reference term is computed
if (requireNamespace("emmeans")) {
  mod2 <- lm(
    tip ~ day + time + sex,
    data = tips,
    contrasts = list(time = contr.sum, day = contr.treatment(4, base = 3))
  )
  ggcoef_model(mod2)
}

# Use ggcoef_compare() for comparing several models on the same plot
mod1 <- lm(Fertility ~ ., data = swiss)
mod2 <- step(mod1, trace = 0)
mod3 <- lm(Fertility ~ Agriculture + Education * Catholic, data = swiss)
models <- list(
  "Full model" = mod1,
  "Simplified model" = mod2,
  "With interaction" = mod3
)

ggcoef_compare(models)
ggcoef_compare(models, type = "faceted")

# you can reverse the vertical position of the point by using a negative
# value for dodged_width (but it will produce some warnings)
ggcoef_compare(models, dodged_width = -.9)

# specific function for nnet::multinom models
mod <- nnet::multinom(Species ~ ., data = iris)
ggcoef_multinom(mod, exponentiate = TRUE)
ggcoef_multinom(mod, type = "faceted")
ggcoef_multinom(
  mod,
  type = "faceted",
  y.level_label = c("versicolor" = "versicolor\n(ref: setosa)")
)

library(pscl)
data("bioChemists", package = "pscl")
mod <- zeroinfl(art ~ fem * mar | fem + mar, data = bioChemists)
ggcoef_multicomponents(mod)

ggcoef_multicomponents(mod, type = "f")

ggcoef_multicomponents(mod, type = "t")

ggcoef_multicomponents(
  mod,
  type = "t",
  component_label = c(conditional = "Count", zero_inflated = "Zero-inflated")
)

mod2 <- zeroinfl(art ~ fem + mar | 1, data = bioChemists)
ggcoef_multicomponents(mod2, type = "t")
Combines several factor variables using the same list of ordered levels (e.g. Likert-type scales) into a unique data frame and generates a centered bar plot.
gglikert(
  data,
  include = dplyr::everything(),
  weights = NULL,
  y = ".question",
  variable_labels = NULL,
  sort = c("none", "ascending", "descending"),
  sort_method = c("prop", "prop_lower", "mean", "median"),
  sort_prop_include_center = totals_include_center,
  factor_to_sort = ".question",
  exclude_fill_values = NULL,
  cutoff = NULL,
  data_fun = NULL,
  add_labels = TRUE,
  labels_size = 3.5,
  labels_color = "auto",
  labels_accuracy = 1,
  labels_hide_below = 0.05,
  add_totals = TRUE,
  totals_size = labels_size,
  totals_color = "black",
  totals_accuracy = labels_accuracy,
  totals_fontface = "bold",
  totals_include_center = FALSE,
  totals_hjust = 0.1,
  y_reverse = TRUE,
  y_label_wrap = 50,
  reverse_likert = FALSE,
  width = 0.9,
  facet_rows = NULL,
  facet_cols = NULL,
  facet_label_wrap = 50,
  symmetric = FALSE
)

gglikert_data(
  data,
  include = dplyr::everything(),
  weights = NULL,
  variable_labels = NULL,
  sort = c("none", "ascending", "descending"),
  sort_method = c("prop", "prop_lower", "mean", "median"),
  sort_prop_include_center = TRUE,
  factor_to_sort = ".question",
  exclude_fill_values = NULL,
  cutoff = NULL,
  data_fun = NULL
)

gglikert_stacked(
  data,
  include = dplyr::everything(),
  weights = NULL,
  y = ".question",
  variable_labels = NULL,
  sort = c("none", "ascending", "descending"),
  sort_method = c("prop", "prop_lower", "mean", "median"),
  sort_prop_include_center = FALSE,
  factor_to_sort = ".question",
  data_fun = NULL,
  add_labels = TRUE,
  labels_size = 3.5,
  labels_color = "auto",
  labels_accuracy = 1,
  labels_hide_below = 0.05,
  add_median_line = FALSE,
  y_reverse = TRUE,
  y_label_wrap = 50,
  reverse_fill = TRUE,
  width = 0.9
)
data |
a data frame |
include |
variables to include, accepts tidy-select syntax |
weights |
optional variable name of a weighting variable, accepts tidy-select syntax |
y |
name of the variable to be plotted on |
variable_labels |
a named list or a named vector of custom variable labels |
sort |
should the factor defined by |
sort_method |
method used to sort the variables: |
sort_prop_include_center |
when sorting with |
factor_to_sort |
name of the factor column to sort if |
exclude_fill_values |
Vector of values that should not be displayed
(but still taken into account for computing proportions),
see |
cutoff |
number of categories to be displayed negatively (i.e. on the
left of the x axis or the bottom of the y axis), could be a decimal value:
|
data_fun |
for advanced usage, custom function to be applied to the
generated dataset at the end of |
add_labels |
should percentage labels be added to the plot? |
labels_size |
size of the percentage labels |
labels_color |
color of the percentage labels ( |
labels_accuracy |
accuracy of the percentages, see
|
labels_hide_below |
if provided, values below will be masked, see
|
add_totals |
should the total proportions of negative and positive answers be added to the plot? This option is not compatible with facets! |
totals_size |
size of the total proportions |
totals_color |
color of the total proportions |
totals_accuracy |
accuracy of the total proportions, see
|
totals_fontface |
font face of the total proportions |
totals_include_center |
if the number of levels is uneven, should half of the center level be added to the total proportions? |
totals_hjust |
horizontal adjustment of totals labels on the x axis |
y_reverse |
should the y axis be reversed? |
y_label_wrap |
number of characters per line for y axis labels, see
|
reverse_likert |
if |
width |
bar width, see |
facet_rows , facet_cols
|
A set of variables or expressions quoted by
|
facet_label_wrap |
number of characters per line for facet labels, see
|
symmetric |
should the x-axis be symmetric? |
add_median_line |
add a vertical line at 50%? |
reverse_fill |
if |
You could use gglikert_data() to just produce the dataset to be plotted.
If variable labels have been defined (see labelled::var_label()), they will be considered. You can also pass custom variable labels with the variable_labels argument.
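As an illustration of the kind of long dataset described above, the following base R sketch stacks several factor columns sharing the same levels into one data frame and computes the proportion of each answer per question. The helper `likert_long()` and the column names `.question` and `.answer` are assumptions made for this example; it is not the code used by gglikert_data().

```r
# Hypothetical helper (illustration only): stack factor columns that share
# the same ordered levels into a long data frame
likert_long <- function(data) {
  lv <- levels(data[[1]])
  pieces <- lapply(names(data), function(q) {
    data.frame(
      .question = q,
      .answer = factor(as.character(data[[q]]), levels = lv)
    )
  })
  out <- do.call(rbind, pieces)
  # keep the questions in their original column order
  out$.question <- factor(out$.question, levels = names(data))
  out
}

lv <- c("Disagree", "Neutral", "Agree")
toy <- data.frame(
  q1 = factor(c("Agree", "Neutral", "Agree"), levels = lv),
  q2 = factor(c("Disagree", "Disagree", "Agree"), levels = lv)
)
long <- likert_long(toy)

# proportion of each answer within each question (each row sums to 1)
props <- prop.table(table(long$.question, long$.answer), margin = 1)
props
```

Working on a long format like this is what makes it possible to treat the question as a plotting variable and the answer as the fill variable.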
A ggplot2 plot or a tibble.
vignette("gglikert"), position_likert(), stat_prop()
library(ggplot2)
library(dplyr)

likert_levels <- c(
  "Strongly disagree",
  "Disagree",
  "Neither agree nor disagree",
  "Agree",
  "Strongly agree"
)
set.seed(42)
df <-
  tibble(
    q1 = sample(likert_levels, 150, replace = TRUE),
    q2 = sample(likert_levels, 150, replace = TRUE, prob = 5:1),
    q3 = sample(likert_levels, 150, replace = TRUE, prob = 1:5),
    q4 = sample(likert_levels, 150, replace = TRUE, prob = 1:5),
    q5 = sample(c(likert_levels, NA), 150, replace = TRUE),
    q6 = sample(likert_levels, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0))
  ) |>
  mutate(across(everything(), ~ factor(.x, levels = likert_levels)))

gglikert(df)

gglikert(df, include = q1:3) +
  scale_fill_likert(pal = scales::brewer_pal(palette = "PRGn"))

gglikert(df, sort = "ascending")

gglikert(df, sort = "ascending", sort_prop_include_center = TRUE)

gglikert(df, sort = "ascending", sort_method = "mean")

gglikert(df, reverse_likert = TRUE)

gglikert(df, add_totals = FALSE, add_labels = FALSE)

gglikert(
  df,
  totals_include_center = TRUE,
  totals_hjust = .25,
  totals_size = 4.5,
  totals_fontface = "italic",
  totals_accuracy = .01,
  labels_accuracy = 1,
  labels_size = 2.5,
  labels_hide_below = .25
)

gglikert(df, exclude_fill_values = "Neither agree nor disagree")

if (require("labelled")) {
  df |>
    set_variable_labels(
      q1 = "First question",
      q2 = "Second question"
    ) |>
    gglikert(
      variable_labels = c(
        q4 = "a custom label",
        q6 = "a very very very very very very very very very very long label"
      ),
      y_label_wrap = 25
    )
}

# Facets
df_group <- df
df_group$group <- sample(c("A", "B"), 150, replace = TRUE)

gglikert(df_group, q1:q6, facet_rows = vars(group))

gglikert(df_group, q1:q6, facet_cols = vars(group))

gglikert(df_group, q1:q6, y = "group", facet_rows = vars(.question))

# Custom function to be applied on data
f <- function(d) {
  d$.question <- forcats::fct_relevel(d$.question, "q5", "q2")
  d
}
gglikert(df, include = q1:q6, data_fun = f)

# Custom center
gglikert(df, cutoff = 2)

gglikert(df, cutoff = 1)

gglikert(df, cutoff = 1, symmetric = TRUE)

gglikert_stacked(df, q1:q6)

gglikert_stacked(df, q1:q6, add_median_line = TRUE, sort = "asc")

gglikert_stacked(df_group, q1:q6, y = "group", add_median_line = TRUE) +
  facet_grid(rows = vars(.question))
A function to facilitate ggplot2 graphs using a survey object. It will initiate a ggplot and map survey weights to the corresponding aesthetic.
ggsurvey(design = NULL, mapping = NULL, ...)
design |
A survey design object, usually created with
|
mapping |
Default list of aesthetic mappings to use for plot,
to be created with |
... |
Other arguments passed on to methods. Not currently used. |
Graphs will be correct as long as only weights are required to compute the graph. However, statistics or geometries requiring correct variance computation (like ggplot2::geom_smooth()) will be statistically incorrect.
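To make the "only weights are required" case concrete, the base R sketch below computes weighted counts and within-class proportions from the Titanic data, using Freq as the weight. It only illustrates the arithmetic behind a weighted bar chart with position = "fill"; it does not use a survey design object, and it says nothing about variances.

```r
d <- as.data.frame(Titanic)

# weighted counts: sum of the weights (Freq) in each Class x Survived cell
counts <- xtabs(Freq ~ Class + Survived, data = d)

# proportions within each class, i.e. the heights of a filled bar chart
props <- prop.table(counts, margin = 1)
round(props, 3)
```

Point estimates such as these depend on the weights alone, whereas standard errors or confidence bands would also need the strata and cluster information of the design.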
A ggplot2 plot.
data(api, package = "survey")
dstrat <- survey::svydesign(
  id = ~1, strata = ~stype,
  weights = ~pw, data = apistrat,
  fpc = ~fpc
)
ggsurvey(dstrat) +
  ggplot2::aes(x = cnum, y = dnum) +
  ggplot2::geom_count()

d <- as.data.frame(Titanic)
dw <- survey::svydesign(ids = ~1, weights = ~Freq, data = d)
ggsurvey(dw) +
  ggplot2::aes(x = Class, fill = Survived) +
  ggplot2::geom_bar(position = "fill")
You could use auto_contrast as a shortcut of aes(colour = after_scale(hex_bw(.data$fill))). You should use !!! to inject it within ggplot2::aes() (see examples).
hex_bw_threshold() is a variation of hex_bw(). For values below threshold, black ("#000000") will always be returned, regardless of hex_code.
hex_bw(hex_code)

hex_bw_threshold(hex_code, values, threshold)

auto_contrast
hex_code |
Background color in hex-format. |
values |
Values to be compared. |
threshold |
Threshold. |
An object of class uneval of length 1.
Either black or white, in hex-format
Adapted from saros for hex_bw() and from https://github.com/teunbrand/ggplot_tricks?tab=readme-ov-file#text-contrast for auto_contrast.
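The idea of picking black or white text from a background colour can be sketched in base R as follows. The helper `contrast_bw()`, the Rec. 601 luma weights and the 0.5 cut point are assumptions chosen for this illustration; the formula actually used by hex_bw() may differ.

```r
# Hypothetical helper (illustration only, not the package's implementation):
# black text on light backgrounds, white text on dark ones
contrast_bw <- function(hex_code) {
  m <- grDevices::col2rgb(hex_code) / 255
  # perceived brightness using Rec. 601 luma weights
  luma <- 0.299 * m["red", ] + 0.587 * m["green", ] + 0.114 * m["blue", ]
  unname(ifelse(luma > 0.5, "#000000", "#FFFFFF"))
}

# white, black, yellow and dark blue backgrounds
res <- contrast_bw(c("#FFFFFF", "#000000", "#FFFF00", "#00008B"))
res
```

Because the function is vectorised over the background colours, the same idea can be applied to a whole fill column at once, which is what an after_scale() mapping needs.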
hex_bw("#0dadfd")

library(ggplot2)
ggplot(diamonds) +
  aes(x = cut, fill = color, label = after_stat(count)) +
  geom_bar() +
  geom_text(
    mapping = aes(color = after_scale(hex_bw(.data$fill))),
    position = position_stack(.5),
    stat = "count",
    size = 2
  )

ggplot(diamonds) +
  aes(x = cut, fill = color, label = after_stat(count)) +
  geom_bar() +
  geom_text(
    mapping = auto_contrast,
    position = position_stack(.5),
    stat = "count",
    size = 2
  )

ggplot(diamonds) +
  aes(x = cut, fill = color, label = after_stat(count), !!!auto_contrast) +
  geom_bar() +
  geom_text(
    mapping = auto_contrast,
    position = position_stack(.5),
    stat = "count",
    size = 2
  )
Label absolute values
label_number_abs(..., hide_below = NULL)

label_percent_abs(..., hide_below = NULL)
... |
arguments passed to |
hide_below |
if provided, values below |
A "labelling" function, i.e. a function that takes a vector and returns a character vector of the same length giving a label for each input value.
scales::label_number(), scales::label_percent()
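The notion of a labelling function that works on absolute values can be sketched in base R as a function factory. The name `label_abs_sketch()`, its `digits` argument and the rule that `hide_below` is compared with the absolute value are assumptions for this illustration; the package's functions delegate formatting to the scales package instead.

```r
# Hypothetical factory (illustration only): returns a labelling function
# that formats |x| and blanks out values smaller than `hide_below`
label_abs_sketch <- function(digits = 2, hide_below = NULL) {
  function(x) {
    res <- format(round(abs(x), digits), nsmall = digits)
    if (!is.null(hide_below)) {
      res[abs(x) < hide_below] <- ""
    }
    res
  }
}

x <- c(-0.2, -.05, 0, .07, .25, .66)
lab_all <- label_abs_sketch()(x)
lab_hidden <- label_abs_sketch(hide_below = .1)(x)
lab_all
lab_hidden
```

Returning a function rather than a vector is what allows such a labeller to be passed to the label (or labels) argument of a scale.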
x <- c(-0.2, -.05, 0, .07, .25, .66)

scales::label_number()(x)
label_number_abs()(x)

scales::label_percent()(x)
label_percent_abs()(x)
label_percent_abs(hide_below = .1)(x)
If the palette returns fewer colours than requested, the list of colours will be expanded using scales::pal_gradient_n(). To be used with a sequential or diverging palette. Not relevant for qualitative palettes.
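The extension mechanism can be sketched in base R. The example below swaps in grDevices::colorRampPalette() for the interpolation step (the text above names scales::pal_gradient_n()), and both `extend_pal()` and `toy_pal()` are hypothetical names introduced only for this illustration.

```r
# Hypothetical wrapper (illustration only): if `pal(n)` supplies fewer than
# n usable colours, interpolate between the colours it does supply
extend_pal <- function(pal) {
  function(n) {
    cols <- pal(n)
    cols <- cols[!is.na(cols)]
    if (length(cols) < n) {
      cols <- grDevices::colorRampPalette(cols)(n)
    }
    cols
  }
}

# a toy diverging palette limited to 5 colours (NA beyond that)
base_cols <- c("#A6611A", "#DFC27D", "#F5F5F5", "#80CDC1", "#018571")
toy_pal <- function(n) base_cols[seq_len(n)]

ext <- extend_pal(toy_pal)
nine <- ext(9)
nine
```

Interpolation keeps the ordering of a sequential or diverging palette, which is why the approach makes little sense for qualitative palettes whose colours have no order.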
pal_extender(pal = scales::brewer_pal(palette = "BrBG"))

scale_fill_extended(
  name = waiver(),
  ...,
  pal = scales::brewer_pal(palette = "BrBG"),
  aesthetics = "fill"
)

scale_colour_extended(
  name = waiver(),
  ...,
  pal = scales::brewer_pal(palette = "BrBG"),
  aesthetics = "colour"
)
pal |
A palette function, such as returned by scales::brewer_pal, taking a number of colours as entry and returning a list of colours. |
name |
The name of the scale. Used as the axis or legend title.
If |
... |
Other arguments passed on to |
aesthetics |
Character string or vector of character strings listing
the name(s) of the aesthetic(s) that this scale works with. This can be
useful, for example, to apply colour settings to the colour and fill
aesthetics at the same time, via |
A palette function.
pal <- scales::pal_brewer(palette = "PiYG")
scales::show_col(pal(16))
scales::show_col(pal_extender(pal)(16))
position_diverging() stacks bars on top of each other and centers them around zero (the same number of categories are displayed on each side).
position_likert() uses proportions instead of counts. This type of presentation is commonly used to display Likert-type scales.
position_likert(
  vjust = 1,
  reverse = FALSE,
  exclude_fill_values = NULL,
  cutoff = NULL
)

position_diverging(
  vjust = 1,
  reverse = FALSE,
  exclude_fill_values = NULL,
  cutoff = NULL
)
vjust |
Vertical adjustment for geoms that have a position
(like points or lines), not a dimension (like bars or areas). Set to
|
reverse |
If |
exclude_fill_values |
Vector of values from the variable associated with
the |
cutoff |
number of categories to be displayed negatively (i.e. on the
left of the x axis or the bottom of the y axis), could be a decimal value:
|
It is recommended to use position_likert() with stat_prop() and its complete argument (see examples).
See ggplot2::position_stack() and ggplot2::position_fill()
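The centering around zero, and the role of cutoff, can be sketched for a single bar in base R. The helper `diverging_edges()` is hypothetical, and the reading of a decimal cutoff used here (the fractional part is the share of the next category placed on the negative side) is an assumption for illustration, not a statement about the package's internals.

```r
# Hypothetical helper (illustration only): left and right edge of each
# stacked segment once the bar is shifted so that `cutoff` categories
# lie on the negative side of zero
diverging_edges <- function(p, cutoff = length(p) / 2) {
  full <- floor(cutoff)   # categories entirely on the negative side
  frac <- cutoff - full   # share of the next category on that side
  offset <- sum(p[seq_len(full)])
  if (frac > 0) {
    offset <- offset + frac * p[full + 1]
  }
  xmax <- cumsum(p) - offset
  data.frame(xmin = xmax - p, xmax = xmax)
}

# proportions of five ordered answer categories for one question
p <- c(0.1, 0.2, 0.4, 0.2, 0.1)
edges_default <- diverging_edges(p)          # half of the levels negative
edges_c1 <- diverging_edges(p, cutoff = 1)   # only the first level negative
edges_default
edges_c1
```

With the default, the middle category straddles zero; with cutoff = 1, the first category ends exactly at zero and everything else is drawn on the positive side.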
library(ggplot2)

ggplot(diamonds) +
  aes(y = clarity, fill = cut) +
  geom_bar(position = "fill") +
  scale_x_continuous(label = scales::label_percent()) +
  xlab("proportion")

ggplot(diamonds) +
  aes(y = clarity, fill = cut) +
  geom_bar(position = "likert") +
  scale_x_continuous(label = label_percent_abs()) +
  scale_fill_likert() +
  xlab("proportion")

ggplot(diamonds) +
  aes(y = clarity, fill = cut) +
  geom_bar(position = "stack") +
  scale_fill_likert(pal = scales::brewer_pal(palette = "PiYG"))

ggplot(diamonds) +
  aes(y = clarity, fill = cut) +
  geom_bar(position = "diverging") +
  scale_x_continuous(label = label_number_abs()) +
  scale_fill_likert()

# Reverse order -------------------------------------------------------------

ggplot(diamonds) +
  aes(y = clarity, fill = cut) +
  geom_bar(position = position_likert(reverse = TRUE)) +
  scale_x_continuous(label = label_percent_abs()) +
  scale_fill_likert() +
  xlab("proportion")

# Custom center -------------------------------------------------------------

ggplot(diamonds) +
  aes(y = clarity, fill = cut) +
  geom_bar(position = position_likert(cutoff = 1)) +
  scale_x_continuous(label = label_percent_abs()) +
  scale_fill_likert(cutoff = 1) +
  xlab("proportion")

ggplot(diamonds) +
  aes(y = clarity, fill = cut) +
  geom_bar(position = position_likert(cutoff = 3.75)) +
  scale_x_continuous(label = label_percent_abs()) +
  scale_fill_likert(cutoff = 3.75) +
  xlab("proportion")

# Missing items -------------------------------------------------------------
# example with a level not being observed for a specific value of y
d <- diamonds
d <- d[!(d$cut == "Premium" & d$clarity == "I1"), ]
d <- d[!(d$cut %in% c("Fair", "Good") & d$clarity == "SI2"), ]

# by default, the two lowest bars are not properly centered
ggplot(d) +
  aes(y = clarity, fill = cut) +
  geom_bar(position = "likert") +
  scale_fill_likert()

# use stat_prop() with `complete = "fill"` to fix it
ggplot(d) +
  aes(y = clarity, fill = cut) +
  geom_bar(position = "likert", stat = "prop", complete = "fill") +
  scale_fill_likert()

# Add labels ----------------------------------------------------------------

custom_label <- function(x) {
  p <- scales::percent(x, accuracy = 1)
  p[x < .075] <- ""
  p
}

ggplot(diamonds) +
  aes(y = clarity, fill = cut) +
  geom_bar(position = "likert") +
  geom_text(
    aes(by = clarity, label = custom_label(after_stat(prop))),
    stat = "prop",
    position = position_likert(vjust = .5)
  ) +
  scale_x_continuous(label = label_percent_abs()) +
  scale_fill_likert() +
  xlab("proportion")

# Do not display specific fill values ---------------------------------------
# (but taken into account to compute proportions)

ggplot(diamonds) +
  aes(y = clarity, fill = cut) +
  geom_bar(position = position_likert(exclude_fill_values = "Very Good")) +
  scale_x_continuous(label = label_percent_abs()) +
  scale_fill_likert() +
  xlab("proportion")
Round to multiple of any number.
round_any(x, accuracy, f = round)
x |
numeric or date-time (POSIXct) vector to round |
accuracy |
number to round to; for POSIXct objects, a number of seconds |
f |
adapted from plyr
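Rounding to a multiple of an arbitrary number amounts to rescaling, applying the rounding function, and scaling back. The sketch below uses a hypothetical name, `round_any_sketch()`, to show that arithmetic for plain numeric input only; date-time input is not handled here.

```r
# Illustration only: divide by the accuracy, round with `f`, multiply back
round_any_sketch <- function(x, accuracy, f = round) {
  f(x / accuracy) * accuracy
}

nearest <- round_any_sketch(1.865, accuracy = .25)
upward <- round_any_sketch(1.865, accuracy = .25, f = ceiling)
downward <- round_any_sketch(137, accuracy = 50, f = floor)
c(nearest, upward, downward)
```

Passing floor or ceiling as `f` turns the same one-liner into "round down to" or "round up to" the nearest multiple.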
round_any(1.865, accuracy = .25)
This scale is similar to other diverging discrete colour scales, but allows changing the "center" of the scale using the cutoff argument, as used by position_likert().
scale_fill_likert(
  name = waiver(),
  ...,
  pal = scales::brewer_pal(palette = "BrBG"),
  cutoff = NULL,
  aesthetics = "fill"
)

likert_pal(pal = scales::brewer_pal(palette = "BrBG"), cutoff = NULL)
name |
The name of the scale. Used as the axis or legend title.
If |
... |
Other arguments passed on to |
pal |
A palette function taking a number of colours as entry and returning a list of colours (see examples), ideally a diverging palette |
cutoff |
Number of categories displayed negatively (see
|
aesthetics |
Character string or vector of character strings listing
the name(s) of the aesthetic(s) that this scale works with. This can be
useful, for example, to apply colour settings to the colour and fill
aesthetics at the same time, via |
library(ggplot2)
ggplot(diamonds) +
  aes(y = clarity, fill = cut) +
  geom_bar(position = "likert") +
  scale_x_continuous(label = label_percent_abs()) +
  xlab("proportion")

ggplot(diamonds) +
  aes(y = clarity, fill = cut) +
  geom_bar(position = "likert") +
  scale_x_continuous(label = label_percent_abs()) +
  xlab("proportion") +
  scale_fill_likert()

ggplot(diamonds) +
  aes(y = clarity, fill = cut) +
  geom_bar(position = position_likert(cutoff = 1)) +
  scale_x_continuous(label = label_percent_abs()) +
  xlab("proportion") +
  scale_fill_likert(cutoff = 1)
Calculate significance stars
signif_stars(x, three = 0.001, two = 0.01, one = 0.05, point = 0.1)
x |
numeric values that will be compared to the |
three |
threshold below which to display three stars |
two |
threshold below which to display two stars |
one |
threshold below which to display one star |
point |
threshold below which to display one point
( |
Character vector containing the appropriate number of stars for each x value.
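The threshold logic can be sketched in base R by assigning symbols from the loosest threshold to the strictest. The name `stars_sketch()` is hypothetical, and treating the thresholds as inclusive (`<=`) is an assumption of this sketch; the example values deliberately avoid the boundaries.

```r
# Illustration only: later (stricter) assignments overwrite earlier ones;
# a threshold set to NULL is skipped
stars_sketch <- function(x, three = 0.001, two = 0.01, one = 0.05, point = 0.1) {
  res <- rep("", length(x))
  if (!is.null(point)) res[x <= point] <- "."
  if (!is.null(one)) res[x <= one] <- "*"
  if (!is.null(two)) res[x <= two] <- "**"
  if (!is.null(three)) res[x <= three] <- "***"
  res
}

p_values <- c(0.5, 0.08, 0.03, 0.005, 0.0001)
default_stars <- stars_sketch(p_values)
custom_stars <- stars_sketch(p_values, one = .15, point = NULL)
default_stars
custom_stars
```

Ordering the assignments from loosest to strictest avoids nested conditions and makes it easy to switch a level off by passing NULL.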
Joseph Larmarange
x <- c(0.5, 0.1, 0.05, 0.01, 0.001)
signif_stars(x)
signif_stars(x, one = .15, point = NULL)
Computes statistics of a 2-dimensional matrix using broom::augment.htest.
stat_cross(
  mapping = NULL,
  data = NULL,
  geom = "point",
  position = "identity",
  ...,
  na.rm = TRUE,
  show.legend = NA,
  inherit.aes = TRUE,
  keep.zero.cells = FALSE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
Override the default connection with
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
keep.zero.cells |
If |
A ggplot2 plot with the added statistic.
stat_cross() requires the x and the y aesthetics.
number of observations in x,y
proportion of total
row proportion
column proportion
expected count under the null hypothesis
Pearson's residual
standardized residual
total number of observations within row
total number of observations within column
total number of observations within the table
phi coefficients, see augment_chisq_add_phi()
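Most of the quantities listed above can be reproduced for a two-way table with base R, which may help when checking a plot. The sketch below relies only on stats::chisq.test() and prop.table(); the object names are chosen for this illustration and are not the names of the computed variables.

```r
tab <- xtabs(Freq ~ Class + Survived, data = as.data.frame(Titanic))
test <- chisq.test(tab)

observed <- test$observed    # number of observations in each cell
expected <- test$expected    # expected count under the null hypothesis
pearson <- test$residuals    # Pearson's residuals
std_resid <- test$stdres     # standardized residuals

prop <- prop.table(tab)              # proportion of total
row_prop <- prop.table(tab, 1)       # proportions within rows of the table
col_prop <- prop.table(tab, 2)       # proportions within columns of the table

round(std_resid, 1)
```

Pearson's residuals are (observed - expected) / sqrt(expected), so cells with large absolute values are those that depart most from independence.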
vignette("stat_cross")
library(ggplot2)
d <- as.data.frame(Titanic)

# plot number of observations
ggplot(d) +
  aes(x = Class, y = Survived, weight = Freq, size = after_stat(observed)) +
  stat_cross() +
  scale_size_area(max_size = 20)

# custom shape and fill colour based on chi-squared residuals
ggplot(d) +
  aes(
    x = Class, y = Survived, weight = Freq,
    size = after_stat(observed), fill = after_stat(std.resid)
  ) +
  stat_cross(shape = 22) +
  scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) +
  scale_size_area(max_size = 20)

# custom shape and fill colour based on phi coefficients
ggplot(d) +
  aes(
    x = Class, y = Survived, weight = Freq,
    size = after_stat(observed), fill = after_stat(phi)
  ) +
  stat_cross(shape = 22) +
  scale_fill_steps2(show.limits = TRUE) +
  scale_size_area(max_size = 20)

# plotting the number of observations as a table
ggplot(d) +
  aes(
    x = Class, y = Survived, weight = Freq, label = after_stat(observed)
  ) +
  geom_text(stat = "cross")

# Row proportions with standardized residuals
ggplot(d) +
  aes(
    x = Class, y = Survived, weight = Freq,
    label = scales::percent(after_stat(row.prop)),
    size = NULL, fill = after_stat(std.resid)
  ) +
  stat_cross(shape = 22, size = 30) +
  geom_text(stat = "cross") +
  scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) +
  facet_grid(Sex ~ .) +
  labs(fill = "Standardized residuals") +
  theme_minimal()
stat_prop() is a variation of ggplot2::stat_count() that computes custom proportions, with the by aesthetic defining the denominator (i.e. all proportions for the same value of by will sum to 1). If the by aesthetic is not specified, denominators will be determined according to the default_by argument.
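The effect of the denominator can be worked through in base R. The sketch below computes weighted counts for the Titanic data and divides them either by class totals (the analogue of by = Class) or by the grand total; reading a "total" default as "divide by the grand total" is an assumption of this illustration.

```r
d <- as.data.frame(Titanic)

# weighted count for each Class x Survived cell
cells <- aggregate(Freq ~ Class + Survived, data = d, FUN = sum)
names(cells)[3] <- "count"

# analogue of `by = Class`: the denominator is the total of each class
cells$denominator <- ave(cells$count, cells$Class, FUN = sum)
cells$prop <- cells$count / cells$denominator

# assumed analogue of a "total" default: divide by the grand total instead
cells$prop_total <- cells$count / sum(cells$count)

cells
```

In the first case the proportions sum to 1 within each class; in the second they sum to 1 over the whole table, which is why the choice of by changes the labels drawn on the plot.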
stat_prop(
  mapping = NULL,
  data = NULL,
  geom = "bar",
  position = "fill",
  ...,
  width = NULL,
  na.rm = FALSE,
  orientation = NA,
  show.legend = NA,
  inherit.aes = TRUE,
  complete = NULL,
  default_by = "total"
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
Override the default connection with |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
width |
Bar width. By default, set to 90% of the |
na.rm |
If |
orientation |
The orientation of the layer. The default ( |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
complete |
Name (character) of an aesthetic for which statistics should be completed for unobserved values (see example). |
default_by |
If the by aesthetic is not available, name of another
aesthetic that will be used to determine the denominators (e.g. |
A ggplot2 plot with the added statistic.
stat_prop() understands the following aesthetics (required aesthetics are in bold):
x or y
by
weight
after_stat(count): number of points in bin
after_stat(denominator): denominator for the proportions
after_stat(prop): computed proportion, i.e. after_stat(count) / after_stat(denominator)
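The relationship between count, denominator and prop can be sketched in base R. This is only an illustration of the arithmetic described above, not the package's implementation; the data frame and its numbers are hypothetical, and the column names simply mirror the computed variables.

```r
# Hypothetical counts for two x categories and two fill levels
d <- data.frame(
  x = c("1st", "1st", "2nd", "2nd"),
  fill = c("No", "Yes", "No", "Yes"),
  count = c(120, 200, 170, 120)
)

# Grouping by x (as with by = x): the denominator is the total count
# within each value of x, so proportions sum to 1 within each x
d$denominator <- ave(d$count, d$x, FUN = sum)
d$prop <- d$count / d$denominator

# A single common denominator (assumed to correspond to a constant by):
# proportions of the grand total, summing to 1 overall
d$prop_total <- d$count / sum(d$count)

d
```

Changing the grouping variable only changes the denominator; the counts themselves stay the same.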
vignette("stat_prop"), ggplot2::stat_count(). For an alternative approach, see https://github.com/tidyverse/ggplot2/issues/5505#issuecomment-1791324008.
library(ggplot2)
library(ggstats)
d <- as.data.frame(Titanic)

# proportions within each class
p <- ggplot(d) +
  aes(x = Class, fill = Survived, weight = Freq, by = Class) +
  geom_bar(position = "fill") +
  geom_text(stat = "prop", position = position_fill(.5))
p
p + facet_grid(~Sex)

# proportions within each survival status
ggplot(d) +
  aes(x = Class, fill = Survived, weight = Freq) +
  geom_bar(position = "dodge") +
  geom_text(
    aes(by = Survived),
    stat = "prop",
    position = position_dodge(0.9),
    vjust = "bottom"
  )

if (requireNamespace("scales")) {
  ggplot(d) +
    aes(x = Class, fill = Survived, weight = Freq, by = 1) +
    geom_bar() +
    geom_text(
      aes(label = scales::percent(after_stat(prop), accuracy = 1)),
      stat = "prop",
      position = position_stack(.5)
    )
}

# displaying unobserved levels with complete
d <- diamonds |>
  dplyr::filter(!(cut == "Ideal" & clarity == "I1")) |>
  dplyr::filter(!(cut == "Very Good" & clarity == "VS2")) |>
  dplyr::filter(!(cut == "Premium" & clarity == "IF"))
p <- ggplot(d) +
  aes(x = clarity, fill = cut, by = clarity) +
  geom_bar(position = "fill")
p + geom_text(stat = "prop", position = position_fill(.5))
p + geom_text(stat = "prop", position = position_fill(.5), complete = "fill")
This statistic computes the mean of the y aesthetic for each unique value of x, taking into account the weight aesthetic if provided.
stat_weighted_mean(
  mapping = NULL,
  data = NULL,
  geom = "point",
  position = "identity",
  ...,
  na.rm = FALSE,
  orientation = NA,
  show.legend = NA,
  inherit.aes = TRUE
)
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
geom |
Override the default connection with |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
na.rm |
If |
orientation |
The orientation of the layer. The default ( |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
A ggplot2 plot with the added statistic.
weighted y (numerator / denominator)
numerator
denominator
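As a rough illustration of how the weighted value relates to the numerator and denominator listed above, the following base R sketch computes a weighted mean of y for each value of x. It assumes the numerator is the weighted sum of y and the denominator is the sum of weights; the vectors are hypothetical, and this is not the package's own code.

```r
# Hypothetical data: a binary outcome y, weights w, and a grouping variable x
y <- c(1, 0, 1, 1)
w <- c(10, 30, 20, 40)
x <- c("a", "a", "b", "b")

# Assumed definitions: weighted sum of y and sum of weights, per value of x
numerator <- tapply(y * w, x, sum)
denominator <- tapply(w, x, sum)

# Weighted mean of y for each x
weighted_y <- numerator / denominator
weighted_y
```

For a single group this agrees with stats::weighted.mean(), for example weighted.mean(c(1, 0), c(10, 30)).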
vignette("stat_weighted_mean")
library(ggplot2)
library(ggstats)
data(tips, package = "reshape")

ggplot(tips) +
  aes(x = day, y = total_bill) +
  geom_point()

ggplot(tips) +
  aes(x = day, y = total_bill) +
  stat_weighted_mean()

ggplot(tips) +
  aes(x = day, y = total_bill, group = 1) +
  stat_weighted_mean(geom = "line")

ggplot(tips) +
  aes(x = day, y = total_bill, colour = sex, group = sex) +
  stat_weighted_mean(geom = "line")

ggplot(tips) +
  aes(x = day, y = total_bill, fill = sex) +
  stat_weighted_mean(geom = "bar", position = "dodge")

# computing a proportion on the fly
if (requireNamespace("scales")) {
  ggplot(tips) +
    aes(x = day, y = as.integer(smoker == "Yes"), fill = sex) +
    stat_weighted_mean(geom = "bar", position = "dodge") +
    scale_y_continuous(labels = scales::percent)
}

# taking into account some weights
if (requireNamespace("scales")) {
  d <- as.data.frame(Titanic)
  ggplot(d) +
    aes(
      x = Class, y = as.integer(Survived == "Yes"),
      weight = Freq, fill = Sex
    ) +
    geom_bar(stat = "weighted_mean", position = "dodge") +
    scale_y_continuous(labels = scales::percent) +
    labs(y = "Survived")
}
Expand scale limits to make them symmetric around zero. Can be passed as the limits argument of continuous scales from packages {ggplot2} or {scales}. Can also be used to obtain an enclosing symmetric range for numeric vectors.
symmetric_limits(x)
x |
a vector of numeric values, possibly a range, from which to compute enclosing range |
A numeric vector of length two with the new limits, which are always such that the absolute value of upper and lower limits is the same.
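The returned value can be pictured with a small base R sketch. The helper name symmetric_limits_sketch is hypothetical and only mirrors the behaviour described above (an enclosing range whose bounds have the same absolute value); it is not the package's source.

```r
# Hypothetical helper: smallest range symmetric around zero that encloses x
symmetric_limits_sketch <- function(x) {
  m <- max(abs(range(x, na.rm = TRUE)))
  c(-m, m)
}

# The larger absolute bound determines both limits
symmetric_limits_sketch(c(-2, 5))
symmetric_limits_sketch(c(-7.5, 1, 3))
```

Because it takes a range and returns a range, a function of this shape can be supplied wherever a continuous scale accepts a function for its limits.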
Adapted from the function of the same name in {ggpmisc}.
library(ggplot2)
library(ggstats)

ggplot(iris) +
  aes(x = Sepal.Length - 5, y = Sepal.Width - 3, colour = Species) +
  geom_vline(xintercept = 0) +
  geom_hline(yintercept = 0) +
  geom_point()

# same plot with limits made symmetric around zero
last_plot() +
  scale_x_continuous(limits = symmetric_limits) +
  scale_y_continuous(limits = symmetric_limits)
Compute the median or quantiles of a set of numbers which have weights associated with them.
weighted.median(x, w, na.rm = TRUE, type = 2)
weighted.quantile(x, w, probs = seq(0, 1, 0.25), na.rm = TRUE, type = 4)
x |
a numeric vector of values |
w |
a numeric vector of weights |
na.rm |
a logical indicating whether to ignore |
type |
Integer specifying the rule for calculating the median or
quantile, corresponding to the rules available for |
probs |
probabilities for which the quantiles should be computed, a numeric vector of values between 0 and 1 |
The ith observation x[i] is treated as having a weight proportional to w[i].
The weighted median is a value m such that the total weight of data less than or equal to m is equal to half the total weight. More generally, the weighted quantile with probability p is a value q such that the total weight of data less than or equal to q is equal to p times the total weight.
If there is no such value, then
if type = 1, the next largest value is returned (this is the right-continuous inverse of the left-continuous cumulative distribution function);
if type = 2, the average of the two surrounding values is returned (the average of the right-continuous and left-continuous inverses);
if type = 4, linear interpolation is performed.
Note that the default rule for weighted.median() is type = 2, consistent with the traditional definition of the median, while the default for weighted.quantile() is type = 4.
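The definition above can be made concrete with a minimal base R sketch of the type = 1 rule: sort the values, accumulate the normalised weights, and return the first value whose cumulative weight reaches p. The function name weighted_quantile_type1 is hypothetical, and the sketch may differ from the package's implementation in edge cases (ties, missing values, zero weights).

```r
# Hypothetical helper illustrating the type = 1 rule for a single probability p
weighted_quantile_type1 <- function(x, w, p) {
  o <- order(x)                 # sort values, carrying weights along
  x <- x[o]
  w <- w[o]
  cum <- cumsum(w) / sum(w)     # cumulative share of the total weight
  x[which(cum >= p)[1]]         # first value reaching the share p
}

x <- c(1, 2, 3, 4)
w <- c(1, 1, 1, 5)

# The value 4 carries most of the weight, so it is the weighted median here,
# whereas the unweighted median of x would fall between 2 and 3
weighted_quantile_type1(x, w, 0.5)
```

Types 2 and 4 start from the same cumulative weights but average or interpolate between the surrounding values instead of taking the next largest one.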
A numeric vector.
These functions are adapted from the functions of the same name developed by Adrian Baddeley in the spatstat package.
library(ggstats)
x <- 1:20
w <- runif(20)
weighted.median(x, w)
weighted.quantile(x, w)
Weighted Sum
weighted.sum(x, w, na.rm = TRUE)
x |
a numeric vector of values |
w |
a numeric vector of weights |
na.rm |
a logical indicating whether to ignore |
A numeric vector.
library(ggstats)
x <- 1:20
w <- runif(20)
weighted.sum(x, w)