I have found many situations where it is useful to update both the labels and the variable values used in a ggplot plot with a function mapping character() -> character(). A typical use case is a variable named total_population whose axis label should read Total population. Some suggested solutions I have seen or used are:
Setting labels manually, which is very flexible, but needs to be repeated for every label in every plot, and can lead to errors if a variable is replaced in a plot but one inadvertently neglects updating the label.
Renaming variables prior to plotting. Although this is possible, names containing spaces must be quoted, and the variable name may need updating in many places in the plotting code when one simply wants to change the text displayed on the plot.
To address this issue, I have written a function that takes a ggplot object and an arbitrary function that maps one character vector to another. It then applies the function to text elements of the plot. The function also takes arguments to determine if the function should be applied to labels used in the plot, to variables (works on factors and strings), and allows choosing subsets of either labels or variables. An ellipsis argument allows additional parameters to be passed to the mapping function.
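As a minimal sketch of the kind of mapping function meant here, the base R helper below turns a snake_case name into a sentence-case label. The name `label_from_name` and its `sep` parameter are hypothetical and only serve as an illustration; `sep` stands in for the sort of extra argument that would be forwarded through the ellipsis.

```r
# Hypothetical helper: maps a character vector to a character vector of the
# same length, e.g. "total_population" -> "Total population".
label_from_name <- function(x, sep = "_") {
  # Replace the separator with spaces (fixed = TRUE: no regex interpretation)
  out <- gsub(sep, " ", x, fixed = TRUE)
  # Upper-case the first character and keep the rest unchanged
  out <- paste0(toupper(substring(out, 1, 1)), substring(out, 2))
  # paste0() would turn NA into "NANA", so restore missing values explicitly
  out[is.na(x)] <- NA_character_
  out
}

label_from_name(c("total_population", "birth_year", NA))
#> [1] "Total population" "Birth year"       NA
```

Any function with this shape (character vector in, character vector of the same length out) should satisfy the checks the proposed function performs on its results.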
A reprex demonstrating the use of this function to plot mass against height using the starwars data set is included below.
The feature request is to add this function or an adaptation of it to the ggplot2 package. I should note that due to limited familiarity with the ggplot object system, this function is not currently implemented as a standard ggplot function that is added to a plot using +. Instead, it has to be used separately, using a pipe or direct function call. It would probably be desirable to rewrite it so that it fits with regular usage.
I have searched widely for a solution to this problem. If there is already a good way to do this, my apologies for the noise :-)
library(tidyverse)
library(snakecase)
# The proposed function
gg_apply <- function(p, fun, ..., .labs = TRUE, .vars = TRUE) {
  # Calculate new label values, and test that fun returns a sane result
  labels_new <- lapply(p$labels, fun, ...)
  stopifnot(
    all(vapply(labels_new, is.character, logical(1))),
    length(labels_new) == length(p$labels)
  )
  # If .labs is TRUE, we replace all the labels in p
  if ( isTRUE(.labs) ) {
    .labs <- names(p$labels)
  }
  # If .labs is FALSE, NULL, or NA, we replace none of them
  if ( isFALSE(.labs) || is.null(.labs) || all(is.na(.labs)) ) {
    .labs <- character()
  }
  stopifnot(is.character(.labs))
  # .labs is now a character vector of labels to replace
  for ( lab_name in .labs ) {
    p$labels[[lab_name]] <- labels_new[[lab_name]]
  }
  # Process non-character values for .vars.
  # If neither of these conditions is true, .vars MUST be a
  # character vector, or we bail.
  if ( isTRUE(.vars) ) {
    .vars <- names(p$data)
  }
  if ( isFALSE(.vars) || is.null(.vars) || all(is.na(.vars)) ) {
    .vars <- character()
  }
  stopifnot(is.character(.vars))
  # .vars is now a character vector of variables to replace
  for ( var_name in .vars ) {
    # Process a character variable and do some sanity testing
    if ( is.character(p$data[[var_name]]) ) {
      var_new <- fun(p$data[[var_name]], ...)
      stopifnot(
        is.character(var_new),
        length(var_new) == length(p$data[[var_name]])
      )
      p$data[[var_name]] <- var_new
    }
    # Process a factor variable and do some sanity testing
    if ( is.factor(p$data[[var_name]]) ) {
      var_fct_old <- p$data[[var_name]]
      var_chr_new <- fun(as.character(var_fct_old), ...)
      levels_new <- fun(levels(var_fct_old), ...)
      stopifnot(
        is.character(var_chr_new),
        length(var_chr_new) == length(var_fct_old)
      )
      # unique() guards against fun mapping two levels to the same string,
      # which would otherwise make factor() fail on duplicated levels
      p$data[[var_name]] <- factor(var_chr_new, levels = unique(levels_new))
    }
  }
  p
}
# Example usage of the function
p <- starwars %>%
  filter(mass < 1000) %>%
  mutate(species = species %>% fct_infreq %>% fct_lump(5) %>% fct_explicit_na) %>%
  ggplot(aes(height, mass, color = species, size = birth_year)) +
  geom_point()
p %>% gg_apply(snakecase::to_sentence_case)
#> Warning: Removed 23 rows containing missing values (geom_point).
In the plot, note that all the labels are formatted using sentence case, as one would expect in a publication. Also note that any function can be used. For example, to create representations of a plot in multiple languages, one could use a lookup function that maps variable names to different language representations.
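To illustrate the multi-language idea, here is a minimal base R sketch of such a lookup function. The names `translate`, `dictionary`, and `spanish` are hypothetical, and the Spanish labels are example values only. Strings without an entry in the lookup table are returned unchanged, so data values such as species names pass through as they are.

```r
# Hypothetical lookup table: names are variable names, values are the labels
# to display (illustrative values only).
spanish <- c(height = "Altura", mass = "Masa", species = "Especie")

# Hypothetical lookup function: character vector in, character vector of the
# same length out. Entries missing from the dictionary are left unchanged.
translate <- function(x, dictionary) {
  hit <- x %in% names(dictionary)
  x[hit] <- unname(dictionary[x[hit]])
  x
}

translate(c("height", "mass", "Human", NA), spanish)
#> [1] "Altura" "Masa"   "Human"  NA
```

Because extra arguments are forwarded through the ellipsis, a call along the lines of `p %>% gg_apply(translate, dictionary = spanish)` would apply this lookup to the plot from the reprex.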
Created on 2022-02-07 by the reprex package (v2.0.1)