
Proposal: Warning when using density2d + coord_map #2702

Closed
mwaldstein opened this issue Jun 15, 2018 · 6 comments

Comments

@mwaldstein

Problem

stat_density_2d does not account for non-Euclidean coordinates, such as latitude / longitude. As a result, the computed densities are incorrect near the poles and near 180° longitude.

Note: This may apply to any non-Cartesian coordinates (e.g. polar), but I have not tested it.
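
To make the problem concrete, here is a minimal base R sketch (not taken from ggplot2) that compares the distance a planar kernel sees in raw longitude/latitude degrees with the great-circle separation of the same points. The helper names planar_deg() and great_circle_deg() are made up for this illustration.

```r
# Distance as a planar (Cartesian) kernel sees it: lon/lat treated as x/y.
planar_deg <- function(lon1, lat1, lon2, lat2) {
  sqrt((lon2 - lon1)^2 + (lat2 - lat1)^2)
}

# Great-circle separation in degrees of arc, via the haversine formula.
great_circle_deg <- function(lon1, lat1, lon2, lat2) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * asin(pmin(1, sqrt(a))) / to_rad
}

# Two points straddling 180 longitude on the equator.
antimeridian_planar <- planar_deg(179, 0, -179, 0)
antimeridian_sphere <- great_circle_deg(179, 0, -179, 0)

# Two points on opposite sides of the north pole at latitude 89.
pole_planar <- planar_deg(0, 89, 180, 89)
pole_sphere <- great_circle_deg(0, 89, 180, 89)

print(c(antimeridian_planar, antimeridian_sphere, pole_planar, pole_sphere))
```

In both pairs the points are about 2 degrees of arc apart on the sphere, while the planar distance is in the hundreds of degrees, so a Cartesian kernel treats near neighbours as if they were far apart.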

Options

  • Swap in a spherical density function (e.g. from the Density package) when using map coordinates. This feels like a big lift.
  • Block use of density with map coordinates (similar to geom_raster) - There are enough cases (e.g. city-level data sets) where the grid approximation is OK that leaving the option available seems appropriate. A counterargument is that if you're using coord_map, you care about the projection, which is exactly where the densities will be off. The current behavior has been in place for long enough that I'd worry about breaking workflows.
  • Include a warning - My proposed option, to make sure people are aware they are using an approximation likely to cause issues at larger scales.

I'm happy to do the work to implement the warning, but wanted confirmation before creating the pull request.
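
As a rough idea of what the warning option could look like, here is a base R sketch. The helper warn_if_non_cartesian() is hypothetical and is not part of ggplot2; it assumes the check would be done on the class of the coord object, and it uses stand-in objects that only carry the class names CoordCartesian and CoordMap so that it runs without ggplot2 loaded. Where such a check would be hooked into the plot build is left open.

```r
# Hypothetical helper: warn when a 2d density is paired with a coord that is
# not Cartesian. It only inspects the class vector of `coord`.
warn_if_non_cartesian <- function(coord, stat_name = "stat_density_2d") {
  if (!inherits(coord, "CoordCartesian")) {
    warning(
      stat_name, "() computes densities in Cartesian x/y space; ",
      "results are approximate under ", class(coord)[1], ".",
      call. = FALSE
    )
    return(invisible(FALSE))
  }
  invisible(TRUE)
}

# Stand-in objects (assumption: only the class vector matters for the check).
mock_cartesian <- structure(list(), class = c("CoordCartesian", "Coord"))
mock_map <- structure(list(), class = c("CoordMap", "Coord"))

# Silent for the Cartesian stand-in.
ok <- warn_if_non_cartesian(mock_cartesian)

# Capture the warning text for the map stand-in instead of printing it.
msg <- tryCatch(
  warn_if_non_cartesian(mock_map),
  warning = function(w) conditionMessage(w)
)
print(msg)
```

Because the check uses inherits(), coords that extend CoordCartesian would pass silently, and anything else would trigger the warning rather than an error, which keeps existing city-level workflows running.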

@hadley
Member

hadley commented Jun 15, 2018

This problem happens for basically any stat combined with a non-Cartesian coordinate system, so I don’t think singling out one combination is the right approach.

@mwaldstein
Author

Generally agree.

Part of the difficulty is that there are a lot of examples in the wild of using these functions for map heatmaps, particularly at the city level. A novice (read: me) with low understanding of how the densities are calculated would readily assume that since the functions work together, you can apply them to global data.

Another option could be to just add an explicit message to the documentation highlighting that by "2d" it means "Cartesian" 2d.

I was looking at the minimum change (geom_density_2d + coord_map) but a more thorough choice would be to mimic geom_hex and geom_raster and stop on any non-Cartesian for geom_density_2d.

@mwaldstein
Author

One more data point in the discussion - blocking geom_density_2d on non-Cartesian coordinates would break many ggmap examples, which like to show off density plots.

@paleolimbot
Member

I think that this is a StatContour problem (used by StatDensity2d), which assumes evenly-spaced x and y values, but doesn't give any warnings if this is not the case. The warning in GeomRaster$setup_data() could probably be generalized:

ggplot2/R/geom-raster.r

Lines 51 to 80 in e2bdf85

  setup_data = function(data, params) {
    precision <- sqrt(.Machine$double.eps)
    hjust <- params$hjust %||% 0.5
    vjust <- params$vjust %||% 0.5

    x_diff <- diff(sort(unique(as.numeric(data$x))))
    if (length(x_diff) == 0) {
      w <- 1
    } else if (any(abs(diff(x_diff)) > precision)) {
      warning("Raster pixels are placed at uneven horizontal intervals and will be shifted. Consider using geom_tile() instead.")
      w <- min(x_diff)
    } else {
      w <- x_diff[1]
    }

    y_diff <- diff(sort(unique(as.numeric(data$y))))
    if (length(y_diff) == 0) {
      h <- 1
    } else if (any(abs(diff(y_diff)) > precision)) {
      warning("Raster pixels are placed at uneven vertical intervals and will be shifted. Consider using geom_tile() instead.")
      h <- min(y_diff)
    } else {
      h <- y_diff[1]
    }

    data$xmin <- data$x - w * (1 - hjust)
    data$xmax <- data$x + w * hjust
    data$ymin <- data$y - h * (1 - vjust)
    data$ymax <- data$y + h * vjust
    data
  },
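
One way the spacing check above might be generalized is sketched below in base R. The helper is_evenly_spaced() is hypothetical and not part of ggplot2; it reuses the same tolerance as the excerpt. The Mercator formula is only there to show how an evenly spaced latitude grid stops being evenly spaced once a map projection is applied.

```r
# Hypothetical generalization of the check in GeomRaster$setup_data():
# TRUE when the unique sorted values of x are equally spaced within a tolerance.
is_evenly_spaced <- function(x, precision = sqrt(.Machine$double.eps)) {
  steps <- diff(sort(unique(as.numeric(x))))
  # Zero or one step means there is nothing to compare, so treat as even.
  length(steps) == 0 || all(abs(diff(steps)) <= precision)
}

# An evenly spaced latitude grid, in degrees.
lat <- c(0, 30, 60)

# Standard Mercator y coordinate for those latitudes (unit sphere).
mercator_y <- log(tan(pi / 4 + (lat * pi / 180) / 2))

even_before <- is_evenly_spaced(lat)
even_after <- is_evenly_spaced(mercator_y)
print(c(even_before, even_after))
```

A stat could call a helper like this on its x and y values and warn when either returns FALSE, instead of repeating the horizontal and vertical branches.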

It would be possible for an extension package to calculate density at draw time (i.e., using coordinate-transformed x- and y- values) rather than build time. In most spatial contexts, this is probably what you want anyway, but it's not very ggplot-like and so I think it belongs in another package (like ggspatial or ggmap).
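
A small base R sketch of what computing the density on coordinate-transformed values changes, under stated assumptions: project_north_pole() is a made-up azimuthal equidistant projection centred on the north pole (not what coord_map does internally), and kde_at() is a hand-rolled Gaussian product kernel rather than the estimator ggplot2 uses.

```r
# Hypothetical projection: azimuthal equidistant, centred on the north pole.
# Output units are degrees of arc from the pole.
project_north_pole <- function(lon, lat) {
  r <- 90 - lat
  theta <- lon * pi / 180
  list(x = r * sin(theta), y = -r * cos(theta))
}

# Hand-rolled Gaussian kernel density estimate at a single point (px, py).
kde_at <- function(px, py, x, y, h = 1) {
  mean(dnorm((px - x) / h) * dnorm((py - y) / h)) / h^2
}

# A ring of points one degree from the pole: a tight cluster on the globe.
lon <- seq(-180, 175, by = 5)
lat <- rep(89, length(lon))

# Density on raw lon/lat, evaluated at one of the observed points.
raw_density <- kde_at(0, 89, lon, lat)

# Density after projecting, evaluated at the pole, the centre of the cluster.
proj <- project_north_pole(lon, lat)
projected_density <- kde_at(0, 0, proj$x, proj$y)

print(c(raw = raw_density, projected = projected_density))
```

On raw longitude/latitude the cluster is smeared along a strip 360 degrees wide, so the estimate stays low even at an observed point; after projecting, the same points form a ring of radius 1 and the estimate at its centre is much higher.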

@paleolimbot
Member

This probably should be done as part of #3044.

@teunbrand
Collaborator

Per the comment here, contours do not require a regularly spaced grid, so the raster pixel solution would be out of place. I also agree with Hadley that the issue is not specific to this stat/coord combination. In addition, I agree with Dewey that a proper spatial approach should be left to specialist extensions. As such, I think we can close the issue here in ggplot2.

@teunbrand closed this as not planned on Oct 25, 2024