
Make schema migrations idempotent #149


Merged — 5 commits merged into master on Dec 6, 2018
Conversation

@gsoltis gsoltis commented Dec 3, 2018

Adds guards for schema migrations so that they are safe to run multiple times.

@gsoltis gsoltis self-assigned this Dec 3, 2018
@googlebot googlebot added the cla: yes label Dec 3, 2018
@gsoltis gsoltis force-pushed the gsoltis/idempotent_schema branch from d9621c7 to 802d7c5 Compare December 3, 2018 19:49
@gsoltis gsoltis requested a review from mikelehen December 3, 2018 21:34
@gsoltis gsoltis assigned mikelehen and unassigned gsoltis Dec 3, 2018
gsoltis commented Dec 3, 2018

I think this is what would be required to be reasonably future-proof for our current set of migrations. What do you think about requiring this going forward?

@mikelehen mikelehen left a comment

So this looks generally reasonable, and I can review this code for correctness, but first let me ask this: What's your confidence level that this is 100% correct (the SDK will now gracefully handle any combination of downgrading and upgrading without any issue)? If it's not super high, do you have any suggestions for how to gain confidence in both this change and all subsequent code changes to this file (a one-off round of testing probably isn't the answer)?

@mikelehen mikelehen assigned gsoltis and unassigned mikelehen Dec 3, 2018
gsoltis commented Dec 4, 2018

It definitely will not handle any arbitrary future migration. For instance, deleting columns referenced by previous data migrations will definitely fail. In some cases I am checking for the existence of a single table before creating several tables. A later migration that deletes one of the unchecked tables would also cause a problem. There are likely other cases as well.

I'm reasonably confident that the following method is sound: Only make additive schema changes, guard schema changes against running twice, and always rerun data migrations. Our existing migrations conform to that, and I think I have implemented the guards correctly, but there is not a baked-in guarantee that everyone else will in the future.

The test running all of the migrations twice should catch any schema changes that are unguarded. If we are worried that we might accidentally delete a table or column, we could add tests that iterate through the migrations and assert that nothing is deleted after each one. Data migrations I think are much harder to guarantee, and I don't have a good solution for those.
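The run-all-migrations-twice idea can be sketched outside the Android codebase. Below is a minimal Python/sqlite3 analogue (the actual SDK code is Java; the migration bodies and helper names here are hypothetical stand-ins, not the SDK's real schema):

```python
import sqlite3

# Hypothetical miniature migrations: each entry upgrades the schema by one
# version, and each is guarded so it is safe to re-run.
MIGRATIONS = [
    lambda db: db.execute(
        "CREATE TABLE IF NOT EXISTS mutation_queues (uid TEXT PRIMARY KEY)"),
    lambda db: db.execute(
        "CREATE TABLE IF NOT EXISTS targets (target_id INTEGER PRIMARY KEY)"),
]

def run_migrations(db, from_version, to_version):
    for migration in MIGRATIONS[from_version:to_version]:
        migration(db)

def schema_snapshot(db):
    """Capture every table and its columns so two runs can be compared."""
    tables = [r[0] for r in db.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    return {t: [c[1] for c in db.execute(f"PRAGMA table_info({t})")]
            for t in tables}

def test_migrations_are_idempotent():
    db = sqlite3.connect(":memory:")
    run_migrations(db, 0, len(MIGRATIONS))
    once = schema_snapshot(db)
    # Simulate a downgrade-then-upgrade: rerun from every earlier version
    # and assert the resulting schema is unchanged.
    for from_version in range(len(MIGRATIONS)):
        run_migrations(db, from_version, len(MIGRATIONS))
        assert schema_snapshot(db) == once, "migrations are not idempotent"
```

An unguarded `CREATE TABLE` (without `IF NOT EXISTS` or an existence check) would make the second run throw, which is exactly what the test is meant to catch.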

@gsoltis gsoltis assigned mikelehen and unassigned gsoltis Dec 4, 2018
gsoltis commented Dec 4, 2018

I've added two tests in the latest commit:

  • Test that we can migrate all the way forward, and then forward again from each previous version, simulating a downgrade and upgrade
  • Test that each individual migration does not delete tables or columns
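The second test — that no migration deletes tables or columns — amounts to checking that the set of schema objects grows monotonically. A Python/sqlite3 sketch with made-up miniature migrations (the real implementation is Java on Android; this only illustrates the superset check):

```python
import sqlite3

def schema_objects(db):
    """Return every (table, column) pair currently in the database."""
    tables = [r[0] for r in db.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    return {(t, col[1]) for t in tables
            for col in db.execute(f"PRAGMA table_info({t})")}

def assert_nothing_deleted(db, migrations):
    """Apply each migration in turn; fail if any table or column disappears."""
    before = schema_objects(db)
    for migration in migrations:
        migration(db)
        after = schema_objects(db)
        assert before <= after, f"migration removed: {before - after}"
        before = after
```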

cc @wilhuff

@mikelehen mikelehen left a comment

I have a few suggestions to try to make the code more future-proof and to document the conversations we've had about this.

My remaining big concern is the other platforms. This won't exactly port cleanly, since web and iOS don't have fixed columns... though maybe that will make them easier? Not sure. And web also may be more complicated since it had extra schema turmoil due to multi-tab. But we could just disallow downgrading below our current version (only adopt the new guarantees going forward).

So I think this is probably good, but we may want to tackle the other platforms before merging this to make sure we don't end up with intolerable inconsistencies. WDYT?

+ "uid TEXT PRIMARY KEY, "
+ "last_acknowledged_batch_id INTEGER, "
+ "last_stream_token BLOB)");
if (!tableExists("mutation_queues")) {
mikelehen (Contributor):

I wonder if it's worth adding comments / assertions to make it clear that the code is intentionally assuming that all 3 tables were added at once, especially so somebody doesn't ever try to add something additional here. And maybe the method name should be createV1MutationQueue() or something?

if (!tableExists("mutation_queues")) {
  // mutation_queues, mutations, and document_mutations were all added in schema version X, so
  // they must all exist or all not exist.
  assert(!tableExists("mutations"), "mutations table should not exist if mutation_queues doesn't.");
  assert(!tableExists("document_mutations"), "document_mutations table should not exist if mutation_queues doesn't.");
  ...
} else {
  assert(tableExists("mutations"), "mutations table should exist if mutation_queues does.");
  assert(tableExists("document_mutations"), "document_mutations table should exist if mutation_queues does.");
}

Or maybe we could wrap that up in a helper function somehow. Or feel free to push back if you think this is overkill.
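One possible shape for such a helper, sketched in Python against sqlite3 rather than the SDK's actual Java (the names `tables_all_or_none` and `create_v1_mutation_queue` are invented for illustration; only the table names come from the diff above):

```python
import sqlite3

def table_exists(db, name):
    return db.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,)).fetchone() is not None

def tables_all_or_none(db, tables):
    # Either every table in the group exists or none of them do;
    # anything in between means the schema is in an unexpected state.
    existing = [t for t in tables if table_exists(db, t)]
    assert len(existing) in (0, len(tables)), \
        f"only {existing} of {tables} exist"
    return len(existing) == 0  # True when the group still needs creating

def create_v1_mutation_queue(db):
    if tables_all_or_none(
            db, ["mutation_queues", "mutations", "document_mutations"]):
        # Illustrative column lists, not the SDK's real schema.
        db.execute("CREATE TABLE mutation_queues (uid TEXT PRIMARY KEY)")
        db.execute("CREATE TABLE mutations (uid TEXT, batch_id INTEGER)")
        db.execute("CREATE TABLE document_mutations (uid TEXT, path TEXT)")
```

The helper makes the all-added-at-once assumption explicit, so a later edit that adds a fourth table to the group trips the assertion instead of silently half-creating the schema.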

mikelehen (Contributor):

Maybe we should also add some logging if mutation_queues does exist, like: "mutation_queues table already exists. Schema must have previously been downgraded."

This will also help readers of the code understand why these if-guards exist.

gsoltis (Contributor Author):

Method rename definitely sounds good to me. I guess a helper method doesn't sound bad. I'm ok with a little bit of overkill in this case.

+ "highest_listen_sequence_number INTEGER, "
+ "last_remote_snapshot_version_seconds INTEGER, "
+ "last_remote_snapshot_version_nanos INTEGER)");
if (!tableExists("targets")) {
mikelehen (Contributor):

Same feedback here as for mutations, except I think this one would be named createV3QueryCache(), right, since target_globals didn't exist until V3?

gsoltis (Contributor Author):

Eh, I named it V1 since it's the same in both cases. target_globals existed but wasn't necessarily populated by the migrations. The query cache lazily created it. However, since it was lazily created, all of that code can handle it existing or not existing.

db.execSQL("DROP TABLE target_globals");
db.execSQL("DROP TABLE target_documents");
// This might be overkill, but if any future migration drops these, it's possible we could try
// dropping tables that don't exist.
mikelehen (Contributor):

Alternatively, we could rename this to dropV2QueryCache() and drop the guards, with the expectation that this method would not be used as-is if we need to drop a more-recent version of the QueryCache. I don't think I care strongly though.

gsoltis (Contributor Author):

If we had to put a version on it, it would probably be V1, since that's the only query cache we create. I think I'm in favor of that, since if we have a V2 query cache, it could be completely separate tables, and the distinction would be useful.
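For the drop guards specifically, SQLite's built-in `DROP TABLE IF EXISTS` already expresses "drop only if present". A Python sketch using the table names from this diff (the function name is hypothetical):

```python
import sqlite3

def drop_v1_query_cache(db):
    # IF EXISTS makes the drop safe even if a future migration (or a
    # downgrade) already removed one of these tables.
    for table in ("targets", "target_globals", "target_documents"):
        db.execute(f"DROP TABLE IF EXISTS {table}")
```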

@@ -241,15 +260,20 @@ private void ensureTargetGlobal() {
}

private void addTargetCount() {
mikelehen (Contributor):

Perhaps this should be renamed ensureTargetCount() or something?

gsoltis (Contributor Author):

I think there's a distinction: ensure is for something that might exist already, even assuming no downgrades. For example, sequence numbers have been written by the SDK for a while, but not every document will necessarily have one. So, the migration that fixes that is ensuring the sequence numbers exist. Same with the target_global row.

On the other hand, add and create are for things that aren't expected to exist, assuming no downgrades. They are new in this version of the schema.
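That naming convention pairs naturally with the "always rerun data migrations" rule above. A hypothetical ensure-style migration in Python/sqlite3 (the table and column names are illustrative, and the sketch assumes the table already exists from an earlier migration):

```python
import sqlite3

def column_exists(db, table, column):
    return any(row[1] == column
               for row in db.execute(f"PRAGMA table_info({table})"))

def ensure_sequence_numbers(db):
    # "ensure" style: the column and its values may or may not already exist.
    if not column_exists(db, "target_documents", "sequence_number"):
        db.execute("ALTER TABLE target_documents "
                   "ADD COLUMN sequence_number INTEGER")
    # Always rerun the data part: an older SDK may have written rows
    # without maintaining this invariant, leaving NULLs behind.
    db.execute("UPDATE target_documents SET sequence_number = 0 "
               "WHERE sequence_number IS NULL")
```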

long count = DatabaseUtils.queryNumEntries(db, "targets");
db.execSQL("ALTER TABLE target_globals ADD COLUMN target_count INTEGER");
ContentValues cv = new ContentValues();
cv.put("target_count", count);
db.update("target_globals", cv, null, null);
}

private void addSequenceNumber() {
mikelehen (Contributor):

add => ensure again?

Cursor c = null;
List<String> columns = new ArrayList<>();
try {
c = db.rawQuery("PRAGMA table_info(" + table + ")", null);
mikelehen (Contributor):

Note that SQLite's PRAGMA docs have a couple scary caveats:

  1. Specific pragma statements may be removed and others added in future releases of SQLite. There is no guarantee of backwards compatibility.
  2. No error messages are generated if an unknown pragma is issued. Unknown pragmas are simply ignored. This means if there is a typo in a pragma statement the library does not inform the user of the fact.

But hopefully they wouldn't actually break this...

gsoltis (Contributor Author):

Eek. I guess we could probe for support, if we think it's going to be an issue? Query a known table and ensure we get some results?
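One cheap probe along those lines: run the pragma against a table that is guaranteed to exist (`sqlite_master`) and check that it returns rows, since SQLite silently ignores unknown pragmas rather than raising an error. Sketched in Python (the helper name is made up):

```python
import sqlite3

def pragma_table_info_supported(db):
    # sqlite_master always exists, so if this pragma is still supported
    # it must return at least one column row; a silently ignored pragma
    # would return nothing.
    rows = db.execute("PRAGMA table_info(sqlite_master)").fetchall()
    return len(rows) > 0
```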

@@ -111,29 +114,31 @@ void runMigrations(int fromVersion, int toVersion) {
}
mikelehen (Contributor):

Can you update the comments on runMigrations to 1) not refer to a switch statement that no longer exists, 2) explain how fromVersion may be a version we've already upgraded past, and 3) communicate the guarantees we're trying to maintain with respect to downgradability? I would put a comment at the END of the method (right where a dev will be looking when they add a new schema migration) that enumerates our backwards-compatibility requirements (don't remove tables or columns, make everything idempotent, etc.).

gsoltis (Contributor Author):

Added a comment. Let me know what you think.

@mikelehen mikelehen assigned gsoltis and unassigned mikelehen Dec 5, 2018
@gsoltis gsoltis assigned mikelehen and unassigned gsoltis Dec 5, 2018
gsoltis commented Dec 5, 2018

Re: other platforms:

  • The web client currently doesn't support downgrading, because IndexedDB throws an exception. We would have to implement our own versioning to work around this.
  • iOS so far has idempotent migrations, but it is not saving the version number on downgrade. firebase-ios-sdk#2153 ("Save schema version on downgrade, add test to verify") starts saving the version number, but we don't have the same level of testing as we do here, as the guard doesn't make as much sense to implement.

@mikelehen mikelehen left a comment

Thanks! LGTM. I left one tiny nit, but you can ignore it if you already got CI to pass and don't want to have to rerun it. :-)

new String[] {"collection_index"},
() -> {
// A per-user, per-collection index for cached documents indexed by a single field's name
// and
mikelehen (Contributor):

rewrap?

* * Guard schema additions. Check if tables or columns exist before adding them.
* * Data migrations should *probably* always run. Older versions of the SDK will not have
* maintained invariants from later versions, so migrations that update values cannot assume
* that existing values have been properly maintained. Calculate them again, if applicable.
mikelehen (Contributor):

Great comment, thanks.

@gsoltis gsoltis merged commit 5176fb5 into master Dec 6, 2018
@gsoltis gsoltis deleted the gsoltis/idempotent_schema branch December 6, 2018 18:10
@firebase firebase locked and limited conversation to collaborators Oct 13, 2019
Labels: cla: yes, size/L