From b802956e268ae264ffa1765e0b4f2870508045f0 Mon Sep 17 00:00:00 2001 From: UniversidadNacionalAsuncion Date: Sun, 24 Aug 2025 21:33:05 -0300 Subject: [PATCH] Update cars.R MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Hi @wch, I added a conversion of the base R `cars` dataset into SI units and included the conversion script in my repository: https://github.com/UniversidadNacionalAsuncion/Modelos_Lineales_Generalizados/blob/main/Regresi%C3%B3n%20Lineal%20con%20R/mbayru-script.R Summary of what the script does - Loads `cars` from `datasets::cars`. - Computes SI columns: - `speed_ms` = speed (mph) × 0.44704 → meters/second - `dist_m` = dist (ft) × 0.3048 → meters - Rounds converted values to 6 decimal places for reproducible precision. - Writes out a file `mbay_ruguata.R` that defines `mbay_ruguata <- data.frame(...)` with the SI columns. - Prints usage instructions: `source('mbay_ruguata.R')`. Why I recommend accepting this change - Teaching & reproducibility: Many textbooks and courses use SI units. Providing SI columns removes a common friction point for instructors and students, letting them run examples without adding conversion steps. - Interoperability: SI units make the dataset easier to use across modern workflows (including Python users), reducing unit-related errors and improving cross-language reproducibility. - Keeps R competitive: Python datasets and libraries often ship with preprocessed, well-documented examples. Small, low-risk improvements like this help keep R’s base datasets immediately useful for contemporary data science workflows. - Low risk: The change is non-intrusive — it adds new columns rather than overwriting originals, so existing code that expects the original `speed` and `dist` is unaffected. Technical notes / transparency - Conversion constants used in the script: - `mph_to_ms <- 0.44704` (1 mph = 0.44704 m/s) - `ft_to_m <- 0.3048` (1 ft = 0.3048 m) - The current script generates the object `mbay_ruguata` and writes only the SI columns into `mbay_ruguata.R`. If you prefer: 1. I can modify the script so the generated R file contains the full dataset (original columns + SI columns), or 2. I can add the SI columns directly into the repo’s `data/cars` file so the dataset in-place contains both unit systems. - If you have a preferred naming convention (e.g., `speed_m_s` / `dist_m` vs `speed_ms` / `dist_m` or `speed_si` / `dist_si`), tell me which you prefer and I’ll update the PR accordingly. Requested action - Please review the script and the generated data. If acceptable, I’d appreciate merging this change so `cars` users can immediately work in SI units without extra steps. If you want a different output format or naming convention, I can update the PR right away. Thanks for maintaining this important resource — small improvements like this have a big impact for teaching and reproducible workflows. Best, Derlis Sosa derlisdev@gmail.com --- src/library/datasets/data/cars.R | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/src/library/datasets/data/cars.R b/src/library/datasets/data/cars.R index 32501e29f85..b5ba3a8ec4a 100644 --- a/src/library/datasets/data/cars.R +++ b/src/library/datasets/data/cars.R @@ -4,4 +4,18 @@ speed = c(4, 4, 7, 7, 8, 9, 10, 10, 10, 11, 11, 12, 12, 12, 12, 13, 18, 18, 19, 19, 19, 20, 20, 20, 20, 20, 22, 23, 24, 24, 24, 24, 25), dist = c(2, 10, 4, 22, 16, 10, 18, 26, 34, 17, 28, 14, 20, 24, 28, 26, 34, 34, 46, 26, 36, 60, 80, 20, 26, 54, 32, 40, 32, 40, 50, 42, 56, - 76, 84, 36, 46, 68, 32, 48, 52, 56, 64, 66, 54, 70, 92, 93, 120, 85)) + 76, 84, 36, 46, 68, 32, 48, 52, 56, 64, 66, 54, 70, 92, 93, 120, 85) + speed_ms = c( 1.78816, 1.78816, 3.12928, 3.12928, 3.57632, 4.02336, + 4.4704, 4.4704, 4.4704, 4.91744, 4.91744, 5.36448, 5.36448, 5.36448, + 5.36448, 5.81152, 5.81152, 5.81152, 5.81152, 6.25856, 6.25856, 6.25856, + 6.25856, 6.7056, 6.7056, 6.7056, 7.15264, 7.15264, 7.59968, 7.59968, + 7.59968, 8.04672, 8.04672, 8.04672, 8.04672, 8.49376, 8.49376, 8.49376, + 8.9408, 8.9408, 8.9408, 8.9408, 8.9408, 9.83488, 10.28192, 10.72896, + 10.72896, 10.72896, 10.72896, 11.176 ), + dist_m = c( 0.6096, 3.048, 1.2192, 6.7056, 4.8768, 3.048, 5.4864, 7.9248, + 10.3632, 5.1816, 8.5344, 4.2672, 6.096, 7.3152, 8.5344, 7.9248, 10.3632, + 10.3632, 14.0208, 7.9248, 10.9728, 18.288, 24.384, 6.096, 7.9248, 16.4592, + 9.7536, 12.192, 9.7536, 12.192, 15.24, 12.8016, 17.0688, 23.1648, 25.6032, + 10.9728, 14.0208, 20.7264, 9.7536, 14.6304, 15.8496, 17.0688, 19.5072, + 20.1168, 16.4592, 21.336, 28.0416, 28.3464, 36.576, 25.908 ) +)