text is misplaced with position_dodge() #3022

slowkow · 2018-12-02T22:18:37Z

In the example below, I would expect all of the text labels to be positioned perfectly on top of the data points. Instead, some of the text labels are not positioned correctly.

I think the issue is due to position_dodge(). I'm not sure exactly where to look to find the relevant code.

In the last example, I use ggrepel to help illustrate the problem more clearly. You can see the blue labels 34 and 290 are not pointing to the correct positions. It seems like they're pointing to the "undodged" positions instead of the "dodged" positions.

This issue was originally reported by @raviselker in ggrepel issues: slowkow/ggrepel#122

library(tidyverse)
library(ggrepel)
# remotes::install_github("thomasp85/patchwork")
library(patchwork)

set.seed(1337)

df <- tibble(
  x = rnorm(500),
  g1 = factor(sample(c("A", "B"), 500, replace = TRUE)),
  g2 = factor(sample(c("A", "B"), 500, replace = TRUE)),
  rownames = 1:500
)

is_outlier <- function(x) {
  return(x < quantile(x, 0.25) - 1.5 * IQR(x) | x > quantile(x, 0.75) + 1.5 * IQR(x))
}

df_outliers <- df %>% group_by(g1, g2) %>% mutate(outlier = is_outlier(x))

p1 <- ggplot(df_outliers, aes(x = g1, y = x, fill = g2)) +
  geom_boxplot(width = 0.3, position = position_dodge(0.5))

p2 <- p1 +
  geom_text(
    data = . %>% filter(outlier),
    mapping = aes(label = rownames),
    position = position_dodge(0.5)
  )

p1 + p2

ggplot(df_outliers, aes(x = g1, y = x, fill = g2)) +
  geom_boxplot(width = 0.3, position = position_dodge(0.5)) +
  ggrepel::geom_label_repel(
    min.segment.length = 0,
    data = . %>% filter(outlier),
    mapping = aes(label = rownames),
    position = position_dodge(0.5)
  )

Created on 2018-12-02 by the reprex package (v0.2.1)


clauswilke · 2018-12-02T23:16:13Z

The underlying principle is that dodging doesn't work as one might expect when some data groupings don't exist.

library(ggplot2)
df <- data.frame(
  x = c("A", "A", "B"),
  type = c("a", "b", "a")
)

ggplot(df, aes(x, 1, color = type)) +
  geom_point(position = position_dodge(width = .5), size = 5)

Created on 2018-12-02 by the reprex package (v0.2.1)

I'm not sure this can be fixed with the current positioning approach, because the position adjustments never see the entire dataset. The question is whether we can come up with some delicate surgery that fixes this problem without completely changing how position adjustments work.

clauswilke · 2018-12-02T23:22:54Z

Maybe I spoke too soon. It appears that the various position functions do receive the entire dataset, at least the dataset per panel:

ggplot2/R/position-.r

Lines 16 to 34 in 5e4a6ef

    
#'   - `compute_layer(self, data, params, panel)` is called once
#'     per layer. `panel` is currently an internal data structure, so
#'     this method should not be overridden.
#'
#'   - `compute_panel(self, data, params, panel)` is called once per
#'     panel and should return a modified data frame.
#'
#'     `data` is a data frame containing the variables named according
#'     to the aesthetics that they're mapped to. `scales` is a list
#'     containing the `x` and `y` scales. There functions are called
#'     before the facets are trained, so they are global scales, not local
#'     to the individual panels. `params` contains the parameters returned by
#'     `setup_params()`.
#'   - `setup_params(data, params)`: called once for each layer.
#'      Used to setup defaults that need to complete dataset, and to inform
#'      the user of important choices. Should return list of parameters.
#'   - `setup_data(data, params)`: called once for each layer,
#'      after `setup_params()`. Should return modified `data`.
#'      Default checks that required aesthetics are present.

So this should be fixable. The relevant code is here:

ggplot2/R/position-dodge.r

Lines 117 to 156 in 23a23cd

    
  compute_panel = function(data, params, scales) {
    collide(
      data,
      params$width,
      name = "position_dodge",
      strategy = pos_dodge,
      n = params$n,
      check.width = FALSE
    )
  }
)

# Dodge overlapping interval.
# Assumes that each set has the same horizontal position.
pos_dodge <- function(df, width, n = NULL) {
  if (is.null(n)) {
    n <- length(unique(df$group))
  }

  if (n == 1)
    return(df)

  if (!all(c("xmin", "xmax") %in% names(df))) {
    df$xmin <- df$x
    df$xmax <- df$x
  }

  d_width <- max(df$xmax - df$xmin)

  # Have a new group index from 1 to number of groups.
  # This might be needed if the group numbers in this set don't include all of 1:n
  groupidx <- match(df$group, sort(unique(df$group)))

  # Find the center for each group, then use that to calculate xmin and xmax
  df$x <- df$x + width * ((groupidx - 0.5) / n - .5)
  df$xmin <- df$x - d_width / n / 2
  df$xmax <- df$x + d_width / n / 2

  df
}
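To make the arithmetic in `pos_dodge()` concrete, here is a minimal sketch that isolates only the x-offset formula from the code quoted above. The helper name `dodge_offset()` is hypothetical and not part of ggplot2, and the sketch assumes the strategy is applied separately to the rows at each x position, as the "each set has the same horizontal position" comment suggests.

```r
# Hypothetical stand-in for the x-offset computed inside pos_dodge() above.
# It returns only the shift that would be added to x, not a data frame.
dodge_offset <- function(group, width, n = NULL) {
  if (is.null(n)) {
    n <- length(unique(group))
  }
  # Mirrors the early return in pos_dodge(): a lone group is not moved.
  if (n == 1) {
    return(rep(0, length(group)))
  }
  # Groups are re-ranked among the groups present in this set.
  groupidx <- match(group, sort(unique(group)))
  width * ((groupidx - 0.5) / n - 0.5)
}

# Both groups present at this x position: shifted left and right.
both <- dodge_offset(c(1, 2), width = 0.5)

# Only group 2 present: n is 1, so the point stays at the undodged position.
alone <- dodge_offset(2, width = 0.5)

# Forcing n = 2 still re-ranks group 2 to index 1, so it lands on the left.
forced <- dodge_offset(2, width = 0.5, n = 2)

print(both)
print(alone)
print(forced)
```

Under these assumptions, a label layer that contains only one group at a given x position is either left undodged or placed in the slot of the first group, which matches the misplaced labels in the original report.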

yutannihilation · 2018-12-03T00:09:04Z

> It appears that the various position functions do receive the entire dataset, at least the dataset per panel

I'm afraid not. Position$compute_panel() is called from Position$compute_layer(), and Position$compute_layer() is called from Layer$compute_position(), which is called per layer with each layer's data. So, it doesn't know the other layer's data.

ggplot2/R/plot-build.r

Line 77 in 23a23cd

data <- by_layer(function(l, d) l$compute_position(d, layout))

BTW, I feel this description is not quite right. Maybe, "once per panel per layer"?

ggplot2/R/position-.r

Lines 20 to 21 in 5e4a6ef

    
#'   - `compute_panel(self, data, params, panel)` is called once per
#'     panel and should return a modified data frame.

clauswilke · 2018-12-03T00:47:56Z

But that should still be good enough to get the dodging right within each layer and panel. I think the other problem is that we're not using an explicit dodging aesthetic. position_dodge() simply finds all distinct groups at each x position and spreads them out. If we gave it an explicit aesthetic, e.g. aes(dodge = type), or maybe as an optional argument to position_dodge(), e.g. position_dodge(dodge_by = type), then the position adjustment could make smarter decisions about where to place which data points.
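A minimal sketch of that idea, in plain R rather than ggplot2 internals: if the offset is derived from a fixed set of levels of the dodging variable instead of from the groups that happen to be present, each value always gets the same slot. The helper `dodge_by_levels()` and its arguments are hypothetical and only illustrate the proposal; they are not an existing ggplot2 API.

```r
# Hypothetical helper: compute a dodge offset from a fixed set of levels,
# so that a value keeps its slot even when other levels are absent.
dodge_by_levels <- function(value, levels, width) {
  n <- length(levels)
  idx <- match(value, levels)
  width * ((idx - 0.5) / n - 0.5)
}

type_levels <- c("a", "b")

# Both types present: "a" goes left, "b" goes right.
print(dodge_by_levels(c("a", "b"), type_levels, width = 0.5))

# Only "b" present: it still goes right, because the levels are fixed.
print(dodge_by_levels("b", type_levels, width = 0.5))

# Only "a" present: it still goes left.
print(dodge_by_levels("a", type_levels, width = 0.5))
```

The design choice here is that the number of slots comes from the declared levels, not from the data seen at one x position, so two layers that share the same levels would agree on placement.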

slowkow · 2018-12-03T01:02:36Z

Here is another example, building on Claus' code.

It seems that color and fill are not treated the same way by ggplot2. I found this surprising and unexpected -- perhaps this is intended behavior?

library(ggplot2)
df <- data.frame(
  x = c("A", "A", "B"),
  type = c("a", "b", "a")
)

pos <- position_dodge(width = 0.5)

p <- ggplot(df) +
  geom_point(position = pos, shape = 21, size = 10, stroke = 1) +
  geom_text(aes(label = type), color = "black", position = pos)

p + aes(x, 1, color = type)

p + aes(x, 1, color = type, group = type)

p + aes(x, 1, fill = type)

Created on 2018-12-02 by the reprex package (v0.2.1)

Session info

devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.5.1 (2018-07-02)
#>  os       macOS High Sierra 10.13.6   
#>  system   x86_64, darwin15.6.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       America/New_York            
#>  date     2018-12-02                  
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package     * version    date       lib
#>  assertthat    0.2.0      2017-04-11 [1]
#>  backports     1.1.2      2017-12-13 [1]
#>  base64enc     0.1-3      2015-07-28 [1]
#>  bindr         0.1.1      2018-03-13 [1]
#>  bindrcpp      0.2.2      2018-03-29 [1]
#>  callr         3.0.0      2018-08-24 [1]
#>  cli           1.0.1      2018-09-25 [1]
#>  colorspace    1.3-2      2016-12-14 [1]
#>  crayon        1.3.4      2017-09-16 [1]
#>  curl          3.2        2018-03-28 [1]
#>  desc          1.2.0      2018-05-01 [1]
#>  devtools      2.0.1      2018-10-26 [1]
#>  digest        0.6.18     2018-10-10 [1]
#>  dplyr         0.7.8      2018-11-10 [1]
#>  evaluate      0.12       2018-10-09 [1]
#>  fs            1.2.6      2018-08-23 [1]
#>  ggplot2     * 3.1.0.9000 2018-12-02 [1]
#>  glue          1.3.0      2018-07-17 [1]
#>  gtable        0.2.0      2016-02-26 [1]
#>  htmltools     0.3.6      2017-04-28 [1]
#>  httr          1.3.1      2017-08-20 [1]
#>  knitr         1.20       2018-02-20 [1]
#>  labeling      0.3        2014-08-23 [1]
#>  lazyeval      0.2.1      2017-10-29 [1]
#>  magrittr      1.5        2014-11-22 [1]
#>  memoise       1.1.0      2017-04-21 [1]
#>  mime          0.6        2018-10-05 [1]
#>  munsell       0.5.0      2018-06-12 [1]
#>  pillar        1.3.0      2018-07-14 [1]
#>  pkgbuild      1.0.2      2018-10-16 [1]
#>  pkgconfig     2.0.2      2018-08-16 [1]
#>  pkgload       1.0.2      2018-10-29 [1]
#>  plyr          1.8.4      2016-06-08 [1]
#>  prettyunits   1.0.2      2015-07-13 [1]
#>  processx      3.2.0      2018-08-16 [1]
#>  ps            1.2.1      2018-11-06 [1]
#>  purrr         0.2.5      2018-05-29 [1]
#>  R6            2.3.0      2018-10-04 [1]
#>  Rcpp          1.0.0      2018-11-07 [1]
#>  remotes       2.0.2      2018-10-30 [1]
#>  rlang         0.3.0.1    2018-10-25 [1]
#>  rmarkdown     1.10       2018-06-11 [1]
#>  rprojroot     1.3-2      2018-01-03 [1]
#>  scales        1.0.0      2018-08-09 [1]
#>  sessioninfo   1.1.1      2018-11-05 [1]
#>  stringi       1.2.4      2018-07-20 [1]
#>  stringr       1.3.1      2018-05-10 [1]
#>  testthat      2.0.1      2018-10-13 [1]
#>  tibble        1.4.2      2018-01-22 [1]
#>  tidyselect    0.2.5      2018-10-11 [1]
#>  usethis       1.4.0      2018-08-14 [1]
#>  withr         2.1.2      2018-03-15 [1]
#>  xml2          1.2.0      2018-01-24 [1]
#>  yaml          2.2.0      2018-07-25 [1]
#>  source                            
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.1)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  Github (tidyverse/ggplot2@23a23cd)
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#>  CRAN (R 3.5.0)                    
#> 
#> [1] /Library/Frameworks/R.framework/Versions/3.5/Resources/library

clauswilke · 2018-12-03T01:09:27Z

@slowkow What you're seeing is color = "black" shadowing the color aesthetic in the text layer. Apparently the label aesthetic is not considered when groups are calculated.

library(ggplot2)
df <- data.frame(
  x = c("A", "A", "B"),
  type = c("a", "b", "a")
)

pos <- position_dodge(width = 0.5)

p <- ggplot(df) +
  geom_point(position = pos, shape = 21, size = 10, stroke = 1) +
  geom_text(aes(label = type), position = pos)

p + aes(x, 1, color = type)

Created on 2018-12-02 by the reprex package (v0.2.1)

clauswilke · 2018-12-03T01:12:24Z

Yes, labels are not considered when calculating grouping, and that is done by design. (Presumably because it's not uncommon for labels to be all different even within a group.)

ggplot2/R/grouping.r

Lines 7 to 10 in 1c09bae

    
# If the `group` variable is not present, then a new group
# variable is generated from the interaction of all discrete (factor or
# character) vectors, excluding `label`. The special value `NO_GROUP`
# is used for all observations if no discrete variables exist.
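The rule described in that comment can be sketched in base R. This is a simplified illustration and not ggplot2's internal code: the helper `default_group()` is hypothetical, the `-1L` placeholder stands in for `NO_GROUP`, and the actual group numbers ggplot2 assigns may differ.

```r
# Hypothetical helper: derive a group id from the interaction of all
# discrete (factor or character) columns, excluding `label`.
default_group <- function(df) {
  is_discrete <- vapply(
    df,
    function(col) is.factor(col) || is.character(col),
    logical(1)
  )
  vars <- setdiff(names(df)[is_discrete], "label")
  if (length(vars) == 0) {
    # Placeholder for the "no group" case.
    return(rep(-1L, nrow(df)))
  }
  as.integer(interaction(df[vars], drop = TRUE))
}

# A text layer where only `x` and `label` are discrete: the two rows at
# x = "A" end up in the same group, so there is nothing to dodge apart.
text_only <- data.frame(
  x = c("A", "A", "B"),
  label = c("a", "b", "a"),
  stringsAsFactors = FALSE
)
print(default_group(text_only))

# Adding a discrete colour column separates the two rows at x = "A".
with_colour <- cbind(text_only, colour = c("a", "b", "a"))
print(default_group(with_colour))
```

This mirrors the behaviour discussed above: mapping `label` alone does not create groups, whereas a discrete `colour`, `fill`, or explicit `group` mapping does.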

yutannihilation · 2018-12-03T01:15:23Z

> to get the dodging right within each layer and panel.

Sorry, I don't get the point yet... We are talking about the inconsistency of the positions between layers, not within each layer, right?

Letting positions have aesthetics sounds cool to me, which you've also indicated in #2977 (comment).

clauswilke · 2018-12-03T02:13:41Z

I am talking about within each layer. I think there should be an option that guarantees that dodging always looks the same across all x values. In the example here, we would want type = "a" to always be dodged to the left and type = "b" to always be dodged to the right, regardless of whether the other type is present at a given x or not. As a side effect, this would fix the original problem.

clauswilke · 2018-12-03T02:17:33Z

On a related note, see this closed PR that wasn't merged, and the issue of violins moving in the wrong spot under preserve = "single": #2813

It's the same problem. The dodging doesn't know about the variable that it is dodging by, and therefore it does strange things.

yutannihilation · 2018-12-03T08:11:07Z

Thanks, I got what you mean. It's still unclear to me how to map groups to dodged positions without training over all layers, but I think I'll find it later :)

In case this is still useful, here's another version of reprex which I believe is minimal for this issue:

library(ggplot2)

d <- data.frame(x = c("x", "x"), g = c("a", "b"), stringsAsFactors = FALSE)
pos <- position_dodge(width = .5)

ggplot(mapping = aes(x, 0, colour = g, label = g)) +
  geom_point(data = d, size = 5, position = pos) +
  geom_label(data = d[2, ], size = 5, position = pos)

Created on 2018-12-03 by the reprex package (v0.2.1)

karawoo · 2018-12-04T23:10:08Z

> I think there should be an option that guarantees that dodging always looks the same across all x values. In the example here, we would want type = "a" to always be dodged to the left and type = "b" to always be dodged to the right, regardless of whether the other type is present at a given x or not. As a side effect, this would fix the original problem.

This has been requested before in #2076 and I agree that it would be a nice feature to have, though if I remember correctly it would require some significant refactoring. We'd also have to think through how geoms with different widths across groups would get placed (e.g. box plots with varwidth = TRUE). For this reason I don't know that fixing this would solve the original problem unless the position calculation knew about other layers. One of the things that's tricky about dodging points and labels in particular is that they have no width in the data space, so the position calculations that calculate where things go based on width don't work right.

hadley · 2019-06-18T14:59:07Z

Is this the same issue as #2480?

karawoo · 2019-06-18T16:12:16Z

Yes, I think so.

teunbrand · 2024-12-05T09:57:48Z

I think this issue was fixed in #5928, where we can now use position_dodge(preserve = "single") for points. As such, I'll close this issue.

devtools::load_all("~/packages/ggplot2")
#> ℹ Loading ggplot2

set.seed(1337)

df <- data.frame(
  x = rnorm(500),
  g1 = factor(sample(c("A", "B"), 500, replace = TRUE)),
  g2 = factor(sample(c("A", "B"), 500, replace = TRUE)),
  rownames = 1:500
)

is_outlier <- function(x) {
  return(x < quantile(x, 0.25) - 1.5 * IQR(x) | x > quantile(x, 0.75) + 1.5 * IQR(x))
}

df_outliers <- df |> dplyr::group_by(g1, g2) |> dplyr::mutate(outlier = is_outlier(x))

ggplot(df_outliers, aes(x = g1, y = x, fill = g2)) +
  geom_boxplot(width = 0.3, position = position_dodge(0.5)) +
  ggrepel::geom_label_repel(
    min.segment.length = 0,
    data = ~ dplyr::filter(.x, outlier),
    mapping = aes(label = rownames),
    position = position_dodge(0.5, preserve = "single")
  )

Created on 2024-12-05 with reprex v2.1.1

teunbrand · 2024-12-05T10:48:52Z

Never mind, I found a flaw with that approach, so I'll reopen this. The flaw will be fixed by #6100 though.

paleolimbot added the "feature" (a feature request or enhancement) and "positions 🥇" labels May 23, 2019

hadley mentioned this issue Jun 18, 2019

position_dodge2() should handle both point and interval geoms #2480

Closed

karawoo mentioned this issue Aug 31, 2020

Aligning plots: Implementation of drop = FALSE for position_dodge #3988

Closed

karawoo mentioned this issue Apr 20, 2021

Aligning geoms with preserve = "single": appears not to work with several factors #3647

Closed

teunbrand mentioned this issue Jul 11, 2024

position_dodge not working well with preserve = "single" and geom_text (or geom_point) #5995

Closed

clauswilke mentioned this issue Sep 12, 2024

Aesthetics for position adjustments #6100

Merged

teunbrand closed this as completed Dec 5, 2024

teunbrand reopened this Dec 5, 2024

teunbrand closed this as completed in #6100 Jan 27, 2025
