-
Notifications
You must be signed in to change notification settings - Fork 67
V4 schema revisions candidate #903
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
V4 schema revisions candidate #903
Conversation
Failing test is the delete-batch from dev, which was left out of the initial testfix push because the v4 dev branch wasn't kept up-to-date during development. Rough timeline was:
You can view detailed aheads and behinds using a branches search, which will also link to all/any open PRs for details on what commits were added. There is also a compare view showing details on what was added to |
a few notes: composite key definitions could be made part of Database class. i used the variable name "ell" because the single lowercase letter "l" looks ambiguous. could do away with update_latest column and just use "WHERE d.delete_latest_id<>NULL" instead of "WHERE d.update_latest=1". could do away with delete_latest_id column if we assume signal_data_id's are synced properly between latest+history tables.
I haven't kept up with this work and don't have bandwidth to cover every line, so only looked at this superficially. Main thing that stands out to me is the high volume of TODOs that should be tracked somewhere (or resolved immediately if possible, like |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bones are good, needs some paint
Also found a few items we should discuss together to figure out what to do about them
(VSCode took my inline comments fine, but posted my overall comment to a different repo 🙄) overall looks good to me, mostly adding comments about minor knits, possible refactors, and a few cautions (biggest one is making sure the load table is empty before deleting). seconding Andrew's comment about a lot of TODOs factoring to issues. also I made a PR into this branch that removes some other wip stuff i saw #917 |
Co-authored-by: Andrew Chin <[email protected]>
Co-authored-by: Katie Mazaitis <[email protected]>
Co-authored-by: Katie Mazaitis <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
addresses #805
summary:
wip
signalsinsert_or_update_batch
in csv loading / acquisition code can be summarized as:is_latest_issue
processing during load to accommodate new schemata.dbjobs
stuff to run before newly ingested data is available for public consumptionchange details
src/ddl/v4_schema.sql
<--- new schema definitionssrc/acquisition/covidcast/database.py
<--- differentiations on when/where to use the history or the latest tablesrc/server/endpoints/covidcast.py
<--- changes to insert into new schema and SQL (also some wip removals)src/server/_query.py
<--- small addition of QueryBuilder.retable() methodsrc/acquisition/covidcast/dbjobs_runner.py
<--- just some boilerplate for making it runnablesrc/ddl/v4_schema_aliases.sql
<--- just some cross-schema aliasingsrc/ddl/covidcast.sql
<--- removedsrc/acquisition/covidcast/data_dir_readme.md
<--- just removal of wip mention in documentationsrc/acquisition/covidcast/csv_to_database.py
<--- just wip removalsThis PR includes the sum of changes from https://github.com/cmu-delphi/delphi-epidata/compare/7fab1fb..e0fb58b
(I didn't want to lose the commit history from the branch
joe/v4-schema-revisions
, but also wanted to split the full changeset into a few smaller pieces for easier review.still TODOs:
integrations/acquisition/covidcast/test_delete_batch.py
does not currently pass, but I will fix it as soon as I can have a quick discussion with Katie