Skip to content

boxplot computation fails when setting scale limits #3548

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
cpsievert opened this issue Oct 1, 2019 · 7 comments · Fixed by #3555
Closed

boxplot computation fails when setting scale limits #3548

cpsievert opened this issue Oct 1, 2019 · 7 comments · Fixed by #3555
Milestone

Comments

@cpsievert
Copy link
Contributor

Before 10fa001, this code produced the following image:

ggplot(PlantGrowth, aes(y = weight)) +
  geom_boxplot() +
  scale_y_continuous(limits = c(5, 7.5))

image

After 10fa001, it now produces a blank plot with this warning:

Warning message:
Computation failed in `stat_boxplot()`:
missing values and NaN's not allowed if 'na.rm' is FALSE

cc @thomasp85

@rudeboybert
Copy link

I believe the issue is more fundamental than scale limits. Consider the following code:

library(ggplot2)
library(nycflights13)

ggplot(data = weather, mapping = aes(x = factor(month), y = temp)) +
  geom_boxplot()

Using ggplot2_3.2.1 yields

whereas using ggplot2_3.2.1.9000 yields


along with the same error message as above:

#> Warning: Computation failed in `stat_boxplot()`:
#> missing values and NaN's not allowed if 'na.rm' is FALSE

cc @ismayc

@clauswilke
Copy link
Member

I think I found the problem. It's in this line:

vars = "x",

It should read vars = "y".

Reprex:

library(ggplot2)
library(rlang)
library(nycflights13)

# doesn't work
ggplot(data = weather, mapping = aes(x = factor(month), y = temp)) +
  geom_boxplot()
#> Warning: Computation failed in `stat_boxplot()`:
#> missing values and NaN's not allowed if 'na.rm' is FALSE

# fix setup_data() function of StatBoxplot
StatBoxplot$setup_data <- function(data, params) {
  data <- ggplot2:::flip_data(data, params$flipped_aes)
  data$x <- data$x %||% 0
  data <- ggplot2:::remove_missing(
    data,
    na.rm = FALSE,
    vars = "y",
    name = "stat_boxplot"
  )
  flip_data(data, params$flipped_aes)
}

# now it works
ggplot(data = weather, mapping = aes(x = factor(month), y = temp)) +
  geom_boxplot()
#> Warning: Removed 1 rows containing missing values (stat_boxplot).

Created on 2019-10-06 by the reprex package (v0.3.0)

cc @thomasp85

@clauswilke
Copy link
Member

Thomas, while we're at it, is this line correct, or should it read na.rm = params$na.rm?

na.rm = FALSE,

It looks like with the current settings it's impossible to suppress the warning about missing values.

library(ggplot2)
library(rlang)
library(nycflights13)

# fix setup_data() function of StatBoxplot
StatBoxplot$setup_data <- function(data, params) {
  data <- ggplot2:::flip_data(data, params$flipped_aes)
  data$x <- data$x %||% 0
  data <- ggplot2:::remove_missing(
    data,
    na.rm = FALSE,
    vars = "y",
    name = "stat_boxplot"
  )
  flip_data(data, params$flipped_aes)
}

# warning doesn't get suppressed
ggplot(data = weather, mapping = aes(x = factor(month), y = temp)) +
  geom_boxplot(na.rm = TRUE)
#> Warning: Removed 1 rows containing missing values (stat_boxplot).

StatBoxplot$setup_data <- function(data, params) {
  data <- ggplot2:::flip_data(data, params$flipped_aes)
  data$x <- data$x %||% 0
  data <- ggplot2:::remove_missing(
    data,
    na.rm = params$na.rm,
    vars = "y",
    name = "stat_boxplot"
  )
  flip_data(data, params$flipped_aes)
}

# now the warning gets suppressed
ggplot(data = weather, mapping = aes(x = factor(month), y = temp)) +
  geom_boxplot(na.rm = TRUE)

Created on 2019-10-06 by the reprex package (v0.3.0)

@thomasp85
Copy link
Member

@clauswilke I don't know if it was intended that way (hardcoding the na.rm). @karawoo do you remember.

As for the cause of the original issue, it is actually much more complicated and is due to the new x|y notation in required_aes - will figure out a fix

@karawoo
Copy link
Member

karawoo commented Oct 7, 2019

I think hardcoding na.rm was just an oversight, I'll fix it

@rudeboybert
Copy link

Thank you for fixing this so promptly. I would like to give credit to my student Nouhaila @nouny for finding the error with the temp vs month boxplot.

@lock
Copy link

lock bot commented Apr 15, 2020

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Apr 15, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants