Skip to content

Improve epix_slide() error catching&messaging with datetime versions #300

Open
@brookslogan

Description

@brookslogan

Operating off of commit 953717f, from lcb/make_epix_slide_more_like_reframe.

# Fake some datetime-versioned archive:
ea = archive_cases_dv_subset$clone()
ea$DT <- data.table::copy(ea$DT)
ea$DT$version <-
  ea$DT$version %>%
  as.POSIXct(tz = "US/Eastern") %>%
  `+`(as.difftime(runif(length(.)), units="days"))
# XXX scared of behavior on DST switchover days...
ea$clobberable_versions_start <- NA
ea$versions_end <- max(ea$DT$version)

ea %>%
  epix_slide(function(x,g) {
    tibble::tibble(n = nrow(x))
  }, before = 5)
#> ! `a` and/or `b` is either `NA` or exactly zero, or one is so much
#> smaller than the other that it looks like it's supposed to be zero; see
#> `pqlim` setting.

Extracting from rlang::last_trace():

 11. │     └─epiprocess:::guess_period(versions_with_updates)
 12. │       └─epiprocess:::gcd_num(decayed_skips) at epiprocess/R/utils.R:439:2

guess_period should probably wrap the error from gcd_num (using rlang::abort(....., parent = .....)), to say that its arguments look aperiodic. And maybe epix_slide should do the same, to say that the unique versions don't look like they come from a periodic reporting mechanism, so the user will need to manually specify ref_time_values.

More seriously, we probably should error in this case:

ea %>%
  epix_slide(function(x,g) {
    tibble::tibble(n = nrow(x))
  }, before = 5, ref_time_values = as.POSIXct("2021-01-01 08:00", tz="US/Eastern") + as.difftime(1, units="weeks")*(0:3))
#>  # A tibble: 4 × 2
#>   time_value          slide_value_n
#>   <dttm>                      <int>
#> 1 2021-01-01 08:00:00             0
#> 2 2021-01-08 08:00:00             0
#> 3 2021-01-15 08:00:00             0
#> 4 2021-01-22 08:00:00             0

and suggest to set time_step. E.g., this looks reasonable:

ea %>%
  epix_slide(function(x,g) {
    tibble::tibble(n = nrow(x))
  }, before = 5, time_step = lubridate::days,
  ref_time_values = as.POSIXct("2021-01-01 08:00", tz="US/Eastern") + as.difftime(1, units="weeks")*(0:3))

although maybe we want to relax the (unchecked) restriction that time_step should be a lubridate::period and allow the following, if it's reasonable:

ea %>%
  epix_slide(function(x,g) {
    tibble::tibble(n = nrow(x))
  }, before = 5, time_step = function(dt) as.difftime(dt, units="days"),
  ref_time_values = as.POSIXct("2021-01-01 08:00", tz="US/Eastern") + as.difftime(1, units="weeks")*(0:3))

Metadata

Metadata

Assignees

No one assigned

    Labels

    P1medium priorityREPLImproved print, errors, etc.

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions