Add retrospective gapfilling to CPR hospital admissions signals #1539

Closed
@krivard

Description

Because the CPR only includes one (sometimes two) reference dates for each signal, and because the CPR is not published on weekends, the resulting COVIDcast signals are only available 5 days a week. See the green line in this timeseries chart:

[Image: timeseries chart]

We should use simple interpolation to fill these gaps retrospectively (once the next file becomes available). Extrapolation on days where no new files are posted is left as a future research project.

We'll need to do something like this:

  • When computing the list of CPR files to process, extend the series backwards by one file ("additional CPR file")
  • Process the CPR files into a df as usual
  • Expand the df to cover a contiguous date sequence, NA-filling
  • Impute the missing values
  • Drop the rows for the per-signal reference dates contributed by the additional CPR file, so that only the imputed values and the new file's values remain
  • Aggregate to nation, export, etc. as usual
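
The expand, impute, and drop steps above can be sketched as follows. This is a rough illustration only, not the indicator's actual code: the column names (`geo_id`, `signal`, `timestamp`, `val`), the `from_additional` flag, and the `gapfill` helper are all hypothetical, and it uses pandas' built-in linear interpolation rather than anything in delphi_utils.

```python
import pandas as pd

def gapfill(df: pd.DataFrame) -> pd.DataFrame:
    """Fill interior date gaps per (geo_id, signal), then drop additional-file rows.

    Assumes hypothetical columns geo_id, signal, timestamp, val, and a boolean
    from_additional marking rows that came from the extra (older) CPR file.
    Assumes at most one row per (geo_id, signal, timestamp).
    """
    pieces = []
    for (geo_id, signal), group in df.groupby(["geo_id", "signal"]):
        series = group.set_index("timestamp")["val"].sort_index()
        # Expand to a contiguous daily sequence; missing days become NaN.
        full_index = pd.date_range(series.index.min(), series.index.max(), freq="D")
        series = series.reindex(full_index)
        # Fill interior gaps only; limit_area="inside" avoids any extrapolation.
        series = series.interpolate(method="linear", limit_area="inside")
        out = series.rename("val").rename_axis("timestamp").reset_index()
        out["geo_id"] = geo_id
        out["signal"] = signal
        # Drop this signal's reference dates that came from the additional file.
        drop_dates = group.loc[group["from_additional"], "timestamp"]
        pieces.append(out[~out["timestamp"].isin(drop_dates)])
    columns = ["geo_id", "signal", "timestamp", "val"]
    return pd.concat(pieces, ignore_index=True)[columns]

# Friday value from the additional file, Monday value from today's file.
example = pd.DataFrame({
    "geo_id": ["01001", "01001"],
    "signal": ["admissions", "admissions"],  # hypothetical signal name
    "timestamp": pd.to_datetime(["2021-10-01", "2021-10-04"]),
    "val": [1.0, 4.0],
    "from_additional": [True, False],
})
result = gapfill(example)
```

Here `result` holds the two imputed weekend rows plus Monday's value; the Friday row is dropped because it came from the additional file.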

Complication

The existing imputation utility in delphi_utils.smooth assumes we only ever impute a value for date X using data from dates Y<X. We will probably need to add support for some kind of symmetrical mode. Expected signal data for this change looks like [x1, NA, NA, x4] -- literally, since these gaps mostly occur at weekends. x1 is from the additional CPR file, x4 is from today's CPR file, and we want to publish the two imputed values and x4. This probably means a linear or other low-degree fit that doesn't need a lot of context to work.
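
To make the "symmetrical" idea concrete, here is a minimal standalone sketch of a two-sided linear fill on the [x1, NA, NA, x4] pattern. The function name `fill_interior_gaps` is hypothetical and this is not the delphi_utils.smooth API; it only illustrates imputing date X from observations on both sides of X.

```python
import math

def fill_interior_gaps(values):
    """Return a copy of `values` with interior NaN runs filled linearly.

    Each gap is filled along the straight line between the nearest observed
    value before it and the nearest observed value after it. Leading and
    trailing NaNs are left alone, since filling them would be extrapolation.
    """
    out = list(values)
    known = [i for i, v in enumerate(out) if not math.isnan(v)]
    for left, right in zip(known, known[1:]):
        step = (out[right] - out[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = out[left] + step * (i - left)
    return out

# x1 comes from the additional file, x4 from today's file.
filled = fill_interior_gaps([10.0, math.nan, math.nan, 16.0])
# Publish only the two imputed values and x4.
published = filled[1:]  # [12.0, 14.0, 16.0]
```

A linear fit like this needs only one observation on each side of the gap, which matches the limited context available from two CPR files.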

This revision must be completed before we can begin showing county-level hospital admissions in the visualizations on the website.

Labels

Priority-P0 (Must-do; lab will self-destruct without it), math