Add retrospective gapfilling to CPR hospital admissions signals #1539
Comments
Feels like I'm missing the context for this. Is this from new code somewhere? What is CPR?
Apologies for the name confusion -- CPR is the DSEW Community Profile Report, one of the new sets of signals developed as part of the omicron effort. Pipeline code is here
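So my initial thought is that getting this into the smooth util would require some involved math. A quick solution could be to use `DataFrame.interpolate` from pandas. Here is what that looks like:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic daily signal with two missing days per week
# (2022-01-01 is a Saturday, so the NaNs fall on weekends).
df = pd.DataFrame({
    "time_value": pd.date_range("2022-01-01", "2022-04-01"),
    "value": 10 * np.sin([i / 3 if i % 7 >= 2 else np.nan for i in range(91)]),
}).set_index("time_value")

# Raw signal, with gaps.
plt.figure()
plt.plot(df)

# Same signal with the gaps filled by time-weighted linear interpolation.
plt.figure()
plt.plot(df.interpolate(method="time"))
plt.show()
```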
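Since the issue leaves extrapolation out of scope, one thing to watch is that pandas' `interpolate` can, by default, carry the last observed value forward into trailing gaps. A minimal sketch of restricting the fill to gaps that have data on both sides, using the `limit_area="inside"` option; the dates and values here are made up for illustration and are not CPR data:

```python
import numpy as np
import pandas as pd

# Hypothetical week: Thu and Fri observed, weekend missing, Mon observed,
# Tue and Wed not yet published.
dates = pd.date_range("2022-01-06", "2022-01-12")
signal = pd.Series([5.0, 8.0, np.nan, np.nan, 14.0, np.nan, np.nan], index=dates)

# limit_area="inside" fills only NaNs surrounded by valid values,
# so the trailing (not yet published) days stay missing.
filled = signal.interpolate(method="time", limit_area="inside")
print(filled)
```

With this option the weekend values are interpolated between Friday and Monday, while the two trailing days remain `NaN` until a later file supplies a right-hand endpoint.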
Notes from pair session with Dmitry: Solving this will require reading in additional, already-processed files, which seems complicated. We've discarded "read in extra files" before as a solution for the "volume and positivity are technically for different reference dates" problem, and we should consider avoiding it here as well. Alternatives:
Mitigating factors:
Next steps:
Because the CPR only includes one (sometimes two) reference dates for each signal, and because the CPR is not published on weekends, the resulting COVIDcast signals are only available 5 days a week. See the green line in this timeseries chart:
We should use simple interpolation to fill these gaps retrospectively (once the next file becomes available). Extrapolation on days where no new files are posted is left as a future research project.
We'll need to do something like this:
Complication

The existing imputation utility in `delphi_utils.smooth` assumes we only ever impute a value for date X using data from dates Y<X. We will probably need to add support for some kind of symmetrical mode. Expected signal data for this change looks like `[x1, NA, NA, x4]` -- literally, since these gaps mostly occur at weekends. `x1` is from the additional CPR file, `x4` is from today's CPR file, and we want to publish the two imputed values and `x4`. This probably means a linear or other low-degree fit that doesn't need a lot of context to work.

This revision must be completed before we can begin showing county-level hospital admissions in the visualizations on the website.
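For concreteness, here is a rough sketch of what a symmetrical linear fill over `[x1, NA, NA, x4]` could look like in plain NumPy. The function name `fill_interior_gaps` is hypothetical and is not part of `delphi_utils.smooth`; it only illustrates the idea of imputing from observations on both sides of a gap.

```python
import numpy as np


def fill_interior_gaps(values):
    """Linearly interpolate NaNs that have an observed value on both sides.

    Hypothetical helper for illustration; leading and trailing NaNs are left
    untouched, so nothing is extrapolated.
    """
    arr = np.asarray(values, dtype=float)
    known = np.flatnonzero(~np.isnan(arr))
    if known.size < 2:
        # Fewer than two observations: no gap is bracketed, nothing to fill.
        return arr.copy()
    idx = np.arange(arr.size)
    interior = (idx > known[0]) & (idx < known[-1]) & np.isnan(arr)
    filled = arr.copy()
    filled[interior] = np.interp(idx[interior], known, arr[known])
    return filled


# x1 = 10 from the earlier file, x4 = 16 from today's file.
example = fill_interior_gaps([10.0, np.nan, np.nan, 16.0])
print(example)  # [10. 12. 14. 16.]
```

A linear fit like this needs only the two bracketing observations, which matches the "doesn't need a lot of context" requirement; a higher-degree fit would need more surrounding dates.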