specially handle as_of==$TODAY #1161

Open
melange396 opened this issue May 4, 2023 · 3 comments
Labels
  • api change: affect the API and its responses
  • blocked: Work cannot proceed until conditions are met
  • enhancement

Comments

@melange396
Collaborator

We have seen many entries in our logs of people requesting covidcast data "as_of" the current date. Presumably they think that this ensures that they get the most recent information available -- which it does, but it comes at a cost because as_of queries are more complicated (and thus are more computationally expensive and take longer to return). Such users could get the exact same results by omitting the as_of altogether.

To save the user some waiting time and to save us some CPU cycles, we can catch the case when the as_of argument is set to "today", and treat it as though the argument was not included at all.
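A minimal sketch of the proposed short-circuit, for illustration only. The helper name `normalize_as_of` is hypothetical and not taken from the codebase. It assumes `as_of` arrives as a YYYYMMDD integer and that "today" means the server's local date. Both points would need checking against the real request parser.

```python
from datetime import date
from typing import Optional


def normalize_as_of(as_of: Optional[int], today: Optional[date] = None) -> Optional[int]:
    """Return None when as_of equals today's date, otherwise return as_of unchanged.

    Hypothetical helper: assumes as_of is a YYYYMMDD integer and that the
    server's local date is the right notion of "today".
    """
    if as_of is None:
        return None
    if today is None:
        today = date.today()
    # Encode today's date in the same YYYYMMDD integer form as as_of.
    today_int = today.year * 10000 + today.month * 100 + today.day
    # as_of == today is treated as if the argument had been omitted,
    # so the caller can fall through to the cheaper "latest" query path.
    return None if as_of == today_int else as_of
```

A caller would run the existing as_of query only when the helper returns a non-None value. The sketch matches only an exact "today" value, so a future-dated `as_of` is passed through unchanged.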

@melange396 melange396 added enhancement api change affect the API and its responses labels May 4, 2023
@krivard krivard added the blocked Work cannot proceed until conditions are met label May 4, 2023
@krivard
Contributor

krivard commented May 4, 2023

We told people to do this as a data quality workaround when latest was out of date with full. Before we add special handling for it, we need to make sure that

  • definitely: the data quality workaround is no longer needed
  • ideally: we have a data validation system checking for recurrences of data quality problems that require the workaround

@brookslogan
Contributor

Some alternative user reasoning: When I have used as_of today, it is probably from trying to use the same code for pseudoprospective forecasting and true prospective forecasting; I think that's sort of valid. Though now I'll probably try to get everything in an epi_archive, especially when we eventually implement functionality to cache+update epi_archives with new issue data.

Were there also some data sources that can actually have issue data beyond the current date somehow? @krivard I thought you mentioned this was possible at some point; I'm not sure if it's what you described above.

@krivard
Contributor

krivard commented May 5, 2023

It is technically possible to insert future-dated data through the patching mechanism (a batch issue upload with issue>today) but we'd expect that to fail earlier automated and manual checks first. (cc @neul3)

The problem from above is when new data is added to full, but latest remains on an older issue. When that happens, latest queries return stale data, and the only way to get up-to-date data is to use as-of-today.
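The staleness condition described here could be checked along the following lines. This is a sketch under stated assumptions, not the project's validation system: the function name, the `(source, signal, time_value, geo_value)` key layout, and the in-memory row format are all hypothetical stand-ins for whatever the real `full` and `latest` tables contain.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical row key: (source, signal, time_value, geo_value).
Key = Tuple[str, str, int, str]
Row = Tuple[Key, int]  # (key, issue as a YYYYMMDD integer)


def find_stale_latest(full_rows: Iterable[Row], latest_rows: Iterable[Row]) -> List[Key]:
    """Return keys whose issue in latest is older than the newest issue in full."""
    newest: Dict[Key, int] = {}
    for key, issue in full_rows:
        # Track the most recent issue seen in full for each key.
        if issue > newest.get(key, -1):
            newest[key] = issue
    latest: Dict[Key, int] = dict(latest_rows)
    # A key is stale when latest is missing it or holds an older issue.
    return sorted(key for key, issue in newest.items() if latest.get(key, -1) < issue)
```

An empty result would suggest `latest` is in sync with `full` for the rows checked, which is the precondition for treating as-of-today like a plain latest query.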
