Discrete scale broken with negative limits #3918
Comments
I can confirm this issue. Minimal reprex:
library(ggplot2)
df <- data.frame(x = 1:14, y = -2:-15)
ggplot(df, aes(x, y)) +
  scale_y_discrete(limits = c(-1:-16), labels = LETTERS[1:16])
Created on 2020-03-31 by the reprex package (v0.3.0)
@paleolimbot Any idea what might be going on here? The problem seems to be related to negative limits in |
This is a bug in the new axis implementation (I think). It will get fixed in ggplot2 |
I'll take a look today! |
Dewey, positive limits look like this:
library(ggplot2)
df <- data.frame(x = 1:14, y = 2:15)
ggplot(df, aes(x, y)) +
  scale_y_discrete(limits = c(1:16), labels = LETTERS[1:16])
Created on 2020-03-31 by the reprex package (v0.3.0)
I would expect negative limits to go the other way, as they have the opposite sort order. |
Hi, all. Yes, with the negative limits the sort order is the opposite. The "good example" that I shared in my earlier post uses exactly the same script (with the negative limits), but on ggplot2 version 3.2.1. That version produced the good example, which looks fine and is not compressed like my "bad example." I'm not sure whether this is truly a version problem, but it is the only thing I can think of. Could the compiler installation or the Java version conflict with ggplot2? |
One addition: I tried the reprex on my malfunctioning R after (1) uninstalling ggplot2 3.3.0 and (2) re-installing ggplot2 3.2.1 (downloaded from the archive), and tried the negative limits, but still got the same result. I then repeated the whole process on a new Windows computer (to see whether a change of operating system makes any difference), where I freshly installed R/RStudio and then installed the ggplot2 package. I first tried version 3.3.0, which gave the same error; then I uninstalled 3.3.0 and installed 3.2.1 from the archive, but still got the same error. Interestingly, the MacBook that works well has R version 3.6.2 and ggplot2 3.2.1, and with this combination the script works. I'm not sure whether upgrading ggplot2 on that MacBook to 3.3.0 would produce the error, but I think there may be a compatibility issue between the ggplot2 package and the new R version (3.6.3). |
Yes, it's the new expansion code that's in 3.3.0. I think it requires special handling of integer limits...hang tight. |
I have a PR open for this, but out of curiosity, isn't the following more realistic usage?
library(ggplot2)
df <- data.frame(x = 1:14, y = -2:-15)
ggplot(df, aes(x, y)) +
  scale_y_continuous(breaks = -1:-16, labels = LETTERS[1:16])
Created on 2020-03-31 by the reprex package (v0.3.0) |
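As a variation on the continuous-scale suggestion above, the following is a minimal sketch that avoids negative values altogether by keeping positive row indices and reversing the axis. It assumes the goal is only to have "A" at the top and "P" at the bottom; the `geom_point()` layer and the positive `y` values are illustrative additions, not part of the original script.

```r
library(ggplot2)

# Illustrative data: positive row indices instead of negative ones
df <- data.frame(x = 1:14, y = 2:15)

p <- ggplot(df, aes(x, y)) +
  geom_point() +
  # scale_y_reverse() flips the axis, so row 1 ("A") ends up at the top
  scale_y_reverse(breaks = 1:16, labels = LETTERS[1:16], limits = c(16, 1))

print(p)
```

Whether this is preferable to negative breaks is a matter of taste; it mainly keeps the data values and the labels in the same order.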
Thinking more about this, I think it would be more correct to error if somebody attempts to pass continuous limits to a discrete scale: |
Thanks, I tried "continuous" instead of "discrete" and that seems to work fine. I think I can edit the code in the way you suggested. The original developer of that script (my research collaborator) probably chose a discrete scale because she wanted to create a 384-well "grid" on which to lay out our heatmap, and probably used negative values to get the proper labeling (A at the top and P at the bottom). The code that follows the "Plot" is:
So this creates a heatmap with individual rectangles colored in varying shades of gray depending on "Number_of_Objects." With the continuous scale the heatmap also seems to be generated fine, so I think I can go that route. I will share any problems I notice while using "scale_y_continuous". |
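For readers who want a genuinely categorical row axis for a plate layout like this, here is a hedged sketch of one way to do it with letter-valued rows rather than negative numbers. The data are synthetic, the `row_label` column and the use of `geom_tile()` are assumptions made for illustration, and only the column names `row`, `column` and `Number_of_Objects` come from the thread.

```r
library(ggplot2)

# Synthetic stand-in for the plate data: 16 rows (A-P) x 24 columns = 384 wells
set.seed(1)
plate <- expand.grid(row = 1:16, column = 1:24)
plate$Number_of_Objects <- rpois(nrow(plate), lambda = 50)

# Hypothetical helper column: map the numeric row index to its letter
plate$row_label <- factor(LETTERS[plate$row], levels = LETTERS[1:16])

p <- ggplot(plate, aes(x = column, y = row_label, fill = Number_of_Objects)) +
  geom_tile(colour = "white") +
  # Reversed character limits put A at the top and P at the bottom
  scale_y_discrete(limits = rev(LETTERS[1:16])) +
  # Darker gray for higher counts
  scale_fill_gradient(low = "white", high = "black")

print(p)
```

Because the limits are character values here, the question of how numeric limits on a discrete scale should be interpreted does not arise.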
I'm always of the opinion that if there's a reasonable interpretation of the input possible then we should go that route rather than erroring out. Do you really want to distinguish between A different way to think about it: Shouldn't a discrete scale just take any limit values given and treat them as the distinct levels of a factor? |
Sure, but what would you expect
library(ggplot2)
df <- data.frame(x = 1:10, y = 1:10)
ggplot(df, aes(x, y)) +
  scale_y_discrete(limits = c(1, 2, 10), labels = c("A", "B", "C"))
Created on 2020-04-01 by the reprex package (v0.3.0) ( |
I would expect them to be equally spaced, yes. In fact, I think the scale names are a bit confusing. The defining characteristic is not discrete/continuous, it's categorical/numerical or qualitative/quantitative, I believe. Having said that, I don't know whether treating the limits of |
I stuck |
I don't remember for sure, but the complexity comes with something |
Yes, categorical axes definitely need to be able to handle numerical values, e.g. 3.5 as half-way between 3 and 4. I think the question we're discussing here is different. Should a categorical scale ever take the limit values into account when determining the spacing between categories? My sense would be that it should not. But maybe these two things are not clearly separated in the code and that creates the issues? |
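The "treat the limits as distinct levels" interpretation discussed above can be illustrated in base R, independently of how ggplot2 implements its scales internally. This is only a sketch of the idea: the variable names are made up for the example.

```r
# Numeric limits as they might be passed to a discrete scale
limits <- c(1, 2, 10)

# Numeric interpretation: the gaps between values differ
gaps <- diff(limits)                      # 1 8

# Categorical interpretation: each distinct value becomes one level,
# and levels sit at consecutive integer positions
limit_levels <- factor(limits)
positions <- as.integer(limit_levels)     # 1 2 3
position_gaps <- diff(positions)          # 1 1

print(gaps)
print(position_gaps)
```

Under the categorical reading, 10 sits directly next to 2, which is the equal spacing argued for in this comment.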
The behaviour of the continuous/discrete ranges is pretty well tested. I think we just never considered what would happen if somebody typed |
Dewey, I think we mostly agree. The main difference is our interpretation of what was meant to happen when somebody types |
We're 100% in agreement! I'll work something up this eve. |
I recently reinstalled my R/RStudio due to an issue with one of the packages, and during this process I also (1) downloaded and installed the R development tools and libraries (clang and gfortran) from https://cran.r-project.org/bin/macosx/tools/ (2) downloaded and installed the gfortran compiler for R from https://github.com/fxcoudert/gfortran-for-macOS/releases/tag/6.3 (3) uninstalled Java/JDK 14 and installed Java/JDK 11; and (4) ran the "R CMD javareconf" command in the terminal. All of these actions were taken to resolve an error associated with one of the packages that requires a Java connection.
However, during these resolution steps something else seems to have been messed up, because an existing script started behaving incorrectly (it worked before I tried the resolution steps above). So the problem is (and this problem is SEPARATE from the package problem that I was trying to solve with the resolution steps mentioned above): I run this script, which uses several packages including ggplot2, tibble, dplyr, etc., to generate my graphs. Just several days ago, when I generated these graphs, the script worked fine (see the picture below, on the left, labeled as a "good example").
But after all the resolution steps that I took as discussed above, the exact same script no longer works properly (see above picture, on the right side labeled as a "bad example" that I'm getting now). I used the exact same script, same packages to generate these graphs but now the graphs are being generated strangely…It seems like the tickmarks on the y axis are now not getting evenly spaced out throughout the whole y-axis and the problem starts at the stage where I create "Plot."
Does anyone have any idea what might have gone wrong during the resolution that could have damaged R (or its associated packages) to cause this problem? I want to go back to the ggplot2 that used to create that "good example"
The data that I used are basically a table of cell numbers in each well ("Number_of_Objects") of a 384-well plate, and I tried to depict the data as a heatmap, as shown in the figure. The figure shows a collection of small rectangles, one per well of the 384-well plate, colored on a gray gradient according to cell density (the darker, the greater the density). The code that I used to plot this is:
I used exactly the same script both times, and both times the script ran without any error but generated completely different graphs. How can I fix the issue so that I can go back to the "good example" graph state?
During install/reinstall, some of those packages might have been updated as well. I checked the updates on all the packages that were installed and except for "tibble," which is a package that gets called along with those libraries when I load "tidyverse," everything was up-to-date. Here is the exact console output that I get when I call those libraries:
I tried to create a reprex and the output is below. The example data is similar to the structure of my actual dataset - I created it using "datapasta" package's paste as tribble function. One thing to note is that on columns "row" "column" and "Number_of_Objects," there is a letter "L" after every number but it is not supposed to be there. The original sample data that I copied from Excel only contain numbers - and I don't know why "L" got added to every single number in the data. Anyhow...the thing that was strange with the reprex was that... in the rendered reprex, the image saving output was 7x5 whereas, when I actually ran the code on my R, the image was saved as 7x7 (see the last line, that says "saving 7x5 in image" - in my console, when the code was actually run, it says "Saving 7x7 in image")
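Two small points from the paragraph above can be checked directly in R. First, the trailing "L" is R's suffix for an integer literal, so `5L` and `5` compare equal as values but have different types. Second, `ggsave()` uses the size of the current graphics device when `width` and `height` are omitted, which would be consistent with the reprex and the console reporting different sizes; passing both explicitly removes that dependence. The plot and file name below are placeholders for illustration only.

```r
library(ggplot2)

# "L" marks an integer literal; without it the number is a double
int_value <- 5L
dbl_value <- 5
print(class(int_value))               # "integer"
print(class(dbl_value))               # "numeric"
print(int_value == dbl_value)         # TRUE: same value
print(identical(int_value, dbl_value))  # FALSE: different type

# Placeholder plot, saved with an explicit size so the result does not
# depend on the size of the active graphics device
p <- ggplot(data.frame(x = 1:3, y = 1:3), aes(x, y)) + geom_point()
out_file <- tempfile(fileext = ".png")
ggsave(out_file, plot = p, width = 7, height = 7, units = "in")
```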
Also, after tracking it down, I realized that creating "Plot" was indeed the step that was going wrong. I ran the script below on its own, saved "Plot" as a .png, and compared it between the working R and my malfunctioning R.
Here are the images that I got. On the left is the good example, and on the right is the one that gets generated off my R.
Not sure if these details provide more clues to what might have happened.
If this is related to the package update, with breaking changes... the good example was generated from Macbook with the following versions of the packages:
ggplot2 3.2.1
tibble 2.1.3
tidyr 1.0.2
readr 1.3.1
tidyverse 1.3.0
purrr 0.3.3
dplyr 0.8.4
stringr 1.4.0
forcats 0.5.0
R version 3.6.2
R Studio up-to-date
Not sure if this information will be helpful but wanted to at least provide the metrics so that you know where the "good example" is coming from. Again, my malfunctioning machine also used to perform well exactly as this Macbook which generated the "good example."
I understand that the problem can be anything here. If it is indeed the error due to installing different compilers, do I have to uninstall those as well (how can I figure this out)? I thought that the compilers will also be uninstalled when I uninstall the R but is that not the case? Do I have to go through uninstall/reinstall of Java as well? Can "R CMD javareconf" mess up the defaults on ggplot2? What are some commands that I can check on terminal to find a potential bug that I'm having with ggplot2? If this is related to the package update error, what could solve the problem? I already tried uninstalling/reinstalling all the packages used in this script and that didn’t solve the problem. If the issue is not particular to the version of the package, and from the error on my computer... due to incomplete update of packages, etc. how should I solve the problem?
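On the question of which commands can help narrow this down, a minimal sketch for comparing the working and the malfunctioning machine is to print the R version and the installed package versions on both and compare the output. The package list below is an assumption based on the packages named in this post.

```r
# R version of the current session
print(R.version.string)

# Packages named in this post; adjust the list as needed
pkgs <- c("ggplot2", "tibble", "dplyr", "tidyr")

# Keep only the packages that are actually installed on this machine
installed <- pkgs[vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]

# Installed version of each package, as a named character vector
versions <- vapply(
  installed,
  function(pkg) as.character(packageVersion(pkg)),
  character(1)
)
print(versions)

# Full session details (platform, locale, loaded packages)
print(sessionInfo())

# To try a specific archived release, one option (requires the 'remotes'
# package; run it in a fresh session) would be:
# remotes::install_version("ggplot2", version = "3.2.1")
```

Running this in a freshly restarted session on each machine makes it easier to see whether the two setups really use the same ggplot2 version.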
I wanted to bring this up as an issue in case this is a bug inherent to ggplot2. Thank you!