Skip to content

Make endpoint arg order consistent: geo, time #26

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 4, 2023
Merged

Conversation

dshemetov
Copy link
Contributor

Fixes #24

@dshemetov dshemetov requested a review from krivard March 31, 2022 22:27
Copy link
Contributor

@krivard krivard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good idea but I suspect the argument ordering in this package was designed to match the old/existing delphi_epidata R client interface. Has this change been discussed with Tooling?

@dshemetov
Copy link
Contributor Author

Ah, I will check in with them.

@brookslogan
Copy link
Contributor

brookslogan commented Jan 24, 2023

I forget where we landed on this. IF we go with this change, think there are some changes still required: e.g., covid_hosp_facility to be time then hospital_pks, and the covidcast_epidata interface would need to change as well.

But I'm not sure we want this order. I think time then geo is going to be a bit awkward paired with epiprocess, tsibble, (gtsibble if it materialized), which words things geo/key then time/index. And I think that most groups of endpoints have also been geo then time, which suggests that it might be the more natural order.

Regarding backwards compatibility: I'm mostly worried about people using flusight helper snippets like this one, which unfortunately do not call by name and use the two different orderings (covidcast and covidcast_epidata). I doubt but would like to know if people are using it already for other things. Any new users after this I wouldn't worry too much about: those that transition from covidcast pkg already have to reformat the time ranges; reordering isn't that much more hassle; and those that transition from delphi_epidata can probably deal with this small change (and may have to deal with larger hassles reformatting epiweeks/dates for epiranges).

Some options:

  • No change --> awkward mix within epidatr, with other packages
  • time, geo, don't change epiprocess --> awkward mix with other packages; breaks some helper snippets
  • time, geo + reformat epiprocess? --> might be awkward if paired with tsibble (not common); breaks some helper snippets
  • geo, time --> consistent with other packages; breaks some helper snippets

We should discuss at a tooling meeting soon.

@dshemetov
Copy link
Contributor Author

I don't think I ever got much feedback on this. I'm not tied to a particular order, I just picked one because it seemed like a pretty arbitrary choice. I'm fine with (source_signal, geo, time) though, it makes sense.

I think it's worth enforcing consistency, even if it breaks some helper snippets. That's a one-time cost, whereas user frustration from argument inconsistency within and across packages would be a continual friction.

@brookslogan
Copy link
Contributor

brookslogan commented Feb 1, 2023

Informal poll here also suggests that geo before time is more natural. So unfortunately, I think things will need to be inverted here to follow a geo before time convention instead.

Consistency is definitely a major pro. I am a little nervous about breaking the only code snippets for this package we've really thrown out there to the modeling community to try using, especially with these types of error messages possible (this is the current "wrong" order):

epidatr::covidcast("hhs","confirmed_admissions_covid_1d","state","day","tx","20220101") %>% fetch_tbl()
#> Error in fetch_csv(epidata_call, fields) : Bad Request (HTTP 400).
epidatr::covidcast("hhs","confirmed_admissions_covid_1d","state","day","tx","20220101") %>% fetch_classic()
#> $epidata
#> data frame with 0 columns and 0 rows
#> 
#> $message
#> [1] "not a valid date: tx"
#> 
#> $result
#> [1] -1

although there's also this:

epidatr::covidcast("hhs","confirmed_admissions_covid_1d","state","day","tx",20220101)
#> Error in `check_string_param()`:
#> ! argument geo_values is not a string
#> Run `rlang::last_error()` to see where the error occurred.

which is maybe a little easier to debug. But I imagine we can do better. Perhaps a few early, ad-hoc checks to try to detect someone using the old order (e.g., using "state" for the time type or an epirange for the geo_values) and trying to set them straight + mentioning what version the ordering was swapped around in. And the snippet repo should be updated to require this version for the relevant code snippets [and fix its own breakage]. [Plus fetch_tbl should not be so cryptic; I don't know if we still get the $message and $result if it's a 400 or not though, but I imagine this is more of a separate issue. I'm hoping these other checks I've described would be something limited to a very small subset of endpoints / only covidcast.]

@krivard
Copy link
Contributor

krivard commented Feb 1, 2023

Perhaps a few early, ad-hoc checks to try to detect someone using the old order (e.g., using "state" for the time type or an epirange for the geo_values) and trying to set them straight + mentioning what version the ordering was swapped around in.

+1 on this in particular

@dshemetov
Copy link
Contributor Author

Thanks for doing that poll Logan. Agree with having backwards compatibility for a few versions + a warning message.

If anyone wants to take over this PR, I would be happy with that. I'm not sure when I'll work on these clients again.

@dshemetov dshemetov changed the base branch from dev to ds/check April 29, 2023 05:50
@dshemetov dshemetov changed the title Make endpoint arg order consistent: dates, then geos Make endpoint arg order consistent: geo, time May 1, 2023
Copy link
Contributor

@brookslogan brookslogan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, all my questions are non-blocking.

question: should we also reorder the field info / columns for covidcast?

@brookslogan
Copy link
Contributor

(but please sanity-check my minor changes)

@brookslogan
Copy link
Contributor

Oh, wait...

issue: we don't have smart detection of people trying to use the old order. Should we add?

Also, Dmitry, could you please update flusight-helper-snippets to work with this update if it doesn't already?

@dshemetov
Copy link
Contributor Author

I think we don't need to do smart detection here: the API server error codes should be enough

@dshemetov
Copy link
Contributor Author

I also already fixed create_epidata_call in the other PR

@dshemetov
Copy link
Contributor Author

I don't know what you mean about reodering column order / field info for covidcast. Could you link to code?

Base automatically changed from ds/check to dev May 3, 2023 23:39
dshemetov and others added 2 commits May 3, 2023 17:20
* update tests
* reorder `create_epidata_call` arg order for `covidcast`
* add test-covidcast.R
* correct epidatr vignette param order
* build vignettes, update gitignore
* document testthat >= 3.1.5 dev dependency in suggests

Co-authored-by: Logan C. Brooks <[email protected]>
@dshemetov dshemetov merged commit 0bdcdc2 into dev May 4, 2023
@dshemetov dshemetov deleted the ds/param-order branch May 4, 2023 00:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make endpoints arguments consistent
3 participants