
Make schema migrations idempotent #149


Merged — 5 commits merged into master on Dec 6, 2018
Conversation

@gsoltis gsoltis commented Dec 3, 2018

Adds guards for schema migrations so that they are safe to run multiple times.

@gsoltis gsoltis self-assigned this Dec 3, 2018
@googlebot googlebot added the cla: yes label Dec 3, 2018
@gsoltis gsoltis force-pushed the gsoltis/idempotent_schema branch from d9621c7 to 802d7c5 Compare December 3, 2018 19:49
@gsoltis gsoltis requested a review from mikelehen December 3, 2018 21:34
@gsoltis gsoltis assigned mikelehen and unassigned gsoltis Dec 3, 2018
gsoltis commented Dec 3, 2018

I think this is what would be required to be reasonably future-proof for our current set of migrations. What do you think about requiring this going forward?

@mikelehen mikelehen left a comment

So this looks generally reasonable, and I can review this code for correctness, but first let me ask this: What's your confidence level that this is 100% correct (the SDK will now gracefully handle any combination of downgrading and upgrading without any issue)? If it's not super high, do you have any suggestions for how to gain confidence in both this change and all subsequent code changes to this file (a one-off round of testing probably isn't the answer)?

@mikelehen mikelehen assigned gsoltis and unassigned mikelehen Dec 3, 2018
gsoltis commented Dec 4, 2018

It definitely will not handle any arbitrary future migration. For instance, deleting columns referenced by previous data migrations will definitely fail. In some cases I am checking for the existence of a single table before creating several tables. A later migration that deletes one of the unchecked tables would also cause a problem. There are likely other cases as well.

I'm reasonably confident that the following method is sound: Only make additive schema changes, guard schema changes against running twice, and always rerun data migrations. Our existing migrations conform to that, and I think I have implemented the guards correctly, but there is not a baked-in guarantee that everyone else will in the future.

The test running all of the migrations twice should catch any schema changes that are unguarded. If we are worried that we might accidentally delete a table or column, we could add tests that iterate through the migrations and assert that nothing is deleted after each one. Data migrations I think are much harder to guarantee, and I don't have a good solution for those.
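The run-all-migrations-twice idea can be sketched outside the Android codebase. Below is a minimal Python/sqlite3 analogue (the actual SDK code is Java; the migration bodies and helper names here are hypothetical stand-ins, not the SDK's real schema):

```python
import sqlite3

# Hypothetical miniature migrations: each entry upgrades the schema by one
# version, and each is guarded so it is safe to re-run.
MIGRATIONS = [
    lambda db: db.execute(
        "CREATE TABLE IF NOT EXISTS mutation_queues (uid TEXT PRIMARY KEY)"),
    lambda db: db.execute(
        "CREATE TABLE IF NOT EXISTS targets (target_id INTEGER PRIMARY KEY)"),
]

def run_migrations(db, from_version, to_version):
    for migration in MIGRATIONS[from_version:to_version]:
        migration(db)

def schema_snapshot(db):
    """Capture every table and its columns so two runs can be compared."""
    tables = [r[0] for r in db.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    return {t: [c[1] for c in db.execute(f"PRAGMA table_info({t})")]
            for t in tables}

def test_migrations_are_idempotent():
    db = sqlite3.connect(":memory:")
    run_migrations(db, 0, len(MIGRATIONS))
    once = schema_snapshot(db)
    # Simulate a downgrade-then-upgrade: rerun from every earlier version
    # and assert the resulting schema is unchanged.
    for from_version in range(len(MIGRATIONS)):
        run_migrations(db, from_version, len(MIGRATIONS))
        assert schema_snapshot(db) == once, "migrations are not idempotent"
```

An unguarded `CREATE TABLE` (without `IF NOT EXISTS` or an existence check) would make the second run throw, which is exactly what the test is meant to catch.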

@gsoltis gsoltis assigned mikelehen and unassigned gsoltis Dec 4, 2018
gsoltis commented Dec 4, 2018

I've added two tests in the latest commit:

  • Test that we can migrate all the way forward, and then forward again from each previous version, simulating a downgrade and upgrade
  • Test that each individual migration does not delete tables or columns
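The second test — that no migration deletes tables or columns — amounts to checking that the set of schema objects grows monotonically. A Python/sqlite3 sketch with made-up miniature migrations (the real implementation is Java on Android; this only illustrates the superset check):

```python
import sqlite3

def schema_objects(db):
    """Return every (table, column) pair currently in the database."""
    tables = [r[0] for r in db.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    return {(t, col[1]) for t in tables
            for col in db.execute(f"PRAGMA table_info({t})")}

def assert_nothing_deleted(db, migrations):
    """Apply each migration in turn; fail if any table or column disappears."""
    before = schema_objects(db)
    for migration in migrations:
        migration(db)
        after = schema_objects(db)
        assert before <= after, f"migration removed: {before - after}"
        before = after
```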

cc @wilhuff

@mikelehen mikelehen left a comment

I have a few suggestions to try to make the code more future-proof and to document the conversations we've had about this.

My remaining big concern is the other platforms. This won't exactly port cleanly, since web and iOS don't have fixed columns... though maybe that will make them easier? Not sure. And web also may be more complicated since it had extra schema turmoil due to multi-tab. But we could just disallow downgrading below our current version (only adopt the new guarantees going forward).

So I think this is probably good, but we may want to tackle the other platforms before merging this to make sure we don't end up with intolerable inconsistencies. WDYT?

+ "uid TEXT PRIMARY KEY, "
+ "last_acknowledged_batch_id INTEGER, "
+ "last_stream_token BLOB)");
if (!tableExists("mutation_queues")) {
mikelehen (Contributor):

I wonder if it's worth adding comments / assertions to make it clear that the code is intentionally assuming that all 3 tables were added at once, especially so somebody doesn't ever try to add something additional here. And maybe the method name should be createV1MutationQueue() or something?

if (!tableExists("mutation_queues")) {
  // mutation_queues, mutations, and document_mutations were all added in schema version X, so
  // they must all exist or all not exist.
  assert(!tableExists("mutations"), "mutations table should not exist if mutation_queues doesn't.");
  assert(!tableExists("document_mutations"), "document_mutations table should not exist if mutation_queues doesn't.");
  ...
} else {
  assert(tableExists("mutations"), "mutations table should exist if mutation_queues does.");
  assert(tableExists("document_mutations"), "document_mutations table should exist if mutation_queues does.");
}

Or maybe we could wrap that up in a helper function somehow. Or feel free to push back if you think this is overkill.
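One possible shape for such a helper, sketched in Python against sqlite3 rather than the SDK's actual Java (the names `tables_all_or_none` and `create_v1_mutation_queue` are invented for illustration; only the table names come from the diff above):

```python
import sqlite3

def table_exists(db, name):
    return db.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,)).fetchone() is not None

def tables_all_or_none(db, tables):
    # Either every table in the group exists or none of them do;
    # anything in between means the schema is in an unexpected state.
    existing = [t for t in tables if table_exists(db, t)]
    assert len(existing) in (0, len(tables)), \
        f"only {existing} of {tables} exist"
    return len(existing) == 0  # True when the group still needs creating

def create_v1_mutation_queue(db):
    if tables_all_or_none(
            db, ["mutation_queues", "mutations", "document_mutations"]):
        # Illustrative column lists, not the SDK's real schema.
        db.execute("CREATE TABLE mutation_queues (uid TEXT PRIMARY KEY)")
        db.execute("CREATE TABLE mutations (uid TEXT, batch_id INTEGER)")
        db.execute("CREATE TABLE document_mutations (uid TEXT, path TEXT)")
```

The helper makes the all-added-at-once assumption explicit, so a later edit that adds a fourth table to the group trips the assertion instead of silently half-creating the schema.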

mikelehen (Contributor):

Maybe we should also add some logging if mutation_queues does exist, like: "mutation_queues table already exists. Schema must have previously been downgraded."

This will also help readers of the code understand why these if-guards exist.

gsoltis (Contributor Author):

Method rename definitely sounds good to me. I guess a helper method doesn't sound bad. I'm ok with a little bit of overkill in this case.

+ "highest_listen_sequence_number INTEGER, "
+ "last_remote_snapshot_version_seconds INTEGER, "
+ "last_remote_snapshot_version_nanos INTEGER)");
if (!tableExists("targets")) {
mikelehen (Contributor):

Same feedback here as for mutations, except I think this one would be named createV3QueryCache(), right, since target_globals didn't exist until V3?

gsoltis (Contributor Author):

Eh, I named it V1 since it's the same in both cases. target_globals existed but wasn't necessarily populated by the migrations. The query cache lazily created it. However, since it was lazily created, all of that code can handle it existing or not existing.

db.execSQL("DROP TABLE target_globals");
db.execSQL("DROP TABLE target_documents");
// This might be overkill, but if any future migration drops these, it's possible we could try
// dropping tables that don't exist.
mikelehen (Contributor):

Alternatively, we could rename this to dropV2QueryCache() and drop the guards, with the expectation that this method would not be used as-is if we need to drop a more-recent version of the QueryCache. I don't think I care strongly though.

gsoltis (Contributor Author):

If we had to put a version on it, it would probably be V1, since that's the only query cache we create. I think I'm in favor of that, since if we have a V2 query cache, it could be completely separate tables, and the distinction would be useful.
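For the drop guards specifically, SQLite's built-in `DROP TABLE IF EXISTS` already expresses "drop only if present". A Python sketch using the table names from this diff (the function name is hypothetical):

```python
import sqlite3

def drop_v1_query_cache(db):
    # IF EXISTS makes the drop safe even if a future migration (or a
    # downgrade) already removed one of these tables.
    for table in ("targets", "target_globals", "target_documents"):
        db.execute(f"DROP TABLE IF EXISTS {table}")
```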

@@ -241,15 +260,20 @@ private void ensureTargetGlobal() {
}

private void addTargetCount() {
mikelehen (Contributor):

Perhaps this should be renamed ensureTargetCount() or something?

gsoltis (Contributor Author):

I think there's a distinction: ensure is for something that might exist already, even assuming no downgrades. For example, sequence numbers have been written by the SDK for a while, but not every document will necessarily have one. So, the migration that fixes that is ensuring the sequence numbers exist. Same with the target_global row.

On the other hand, add and create are for things that aren't expected to exist, assuming no downgrades. They are new in this version of the schema.
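That naming convention pairs naturally with the "always rerun data migrations" rule above. A hypothetical ensure-style migration in Python/sqlite3 (the table and column names are illustrative, and the sketch assumes the table already exists from an earlier migration):

```python
import sqlite3

def column_exists(db, table, column):
    return any(row[1] == column
               for row in db.execute(f"PRAGMA table_info({table})"))

def ensure_sequence_numbers(db):
    # "ensure" style: the column and its values may or may not already exist.
    if not column_exists(db, "target_documents", "sequence_number"):
        db.execute("ALTER TABLE target_documents "
                   "ADD COLUMN sequence_number INTEGER")
    # Always rerun the data part: an older SDK may have written rows
    # without maintaining this invariant, leaving NULLs behind.
    db.execute("UPDATE target_documents SET sequence_number = 0 "
               "WHERE sequence_number IS NULL")
```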

long count = DatabaseUtils.queryNumEntries(db, "targets");
db.execSQL("ALTER TABLE target_globals ADD COLUMN target_count INTEGER");
ContentValues cv = new ContentValues();
cv.put("target_count", count);
db.update("target_globals", cv, null, null);
}

private void addSequenceNumber() {
mikelehen (Contributor):

add => ensure again?

Cursor c = null;
List<String> columns = new ArrayList<>();
try {
c = db.rawQuery("PRAGMA table_info(" + table + ")", null);
mikelehen (Contributor):

Note that SQLite's PRAGMA docs have a couple scary caveats:

  1. Specific pragma statements may be removed and others added in future releases of SQLite. There is no guarantee of backwards compatibility.
  2. No error messages are generated if an unknown pragma is issued. Unknown pragmas are simply ignored. This means if there is a typo in a pragma statement the library does not inform the user of the fact.

But hopefully they wouldn't actually break this...

gsoltis (Contributor Author):

Eek. I guess we could probe for support, if we think it's going to be an issue? Query a known table and ensure we get some results?
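One cheap probe along those lines: run the pragma against a table that is guaranteed to exist (`sqlite_master`) and check that it returns rows, since SQLite silently ignores unknown pragmas rather than raising an error. Sketched in Python (the helper name is made up):

```python
import sqlite3

def pragma_table_info_supported(db):
    # sqlite_master always exists, so if this pragma is still supported
    # it must return at least one column row; a silently ignored pragma
    # would return nothing.
    rows = db.execute("PRAGMA table_info(sqlite_master)").fetchall()
    return len(rows) > 0
```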

@@ -111,29 +114,31 @@ void runMigrations(int fromVersion, int toVersion) {
}
mikelehen (Contributor):

Can you update the comments on runMigrations to 1) not refer to a switch statement that no longer exists, 2) explain how fromVersion may be a version we've already upgraded past, and 3) communicate the guarantees we're trying to maintain with respect to downgradability? I would put a comment at the END of the method (right where a dev will be looking when they add a new schema migration) that enumerates our backwards-compatibility requirements (don't remove tables or columns, make everything idempotent, etc.).

gsoltis (Contributor Author):

Added a comment. Let me know what you think.

@mikelehen mikelehen assigned gsoltis and unassigned mikelehen Dec 5, 2018
@gsoltis gsoltis assigned mikelehen and unassigned gsoltis Dec 5, 2018
gsoltis commented Dec 5, 2018

Re: other platforms:

  • The web client currently doesn't support downgrading, because IndexedDB throws an exception. We would have to implement our own versioning to work around this.
  • iOS so far has idempotent migrations, but it is not saving the version number on downgrade. firebase-ios-sdk#2153 ("Save schema version on downgrade, add test to verify") starts saving the version number, but we don't have the same level of testing as we do here, as the guard doesn't make as much sense to implement.

@mikelehen mikelehen left a comment

Thanks! LGTM. I left one tiny nit, but you can ignore it if you already got CI to pass and don't want to have to rerun it. :-)

new String[] {"collection_index"},
() -> {
// A per-user, per-collection index for cached documents indexed by a single field's name
// and
mikelehen (Contributor):

rewrap?

* * Guard schema additions. Check if tables or columns exist before adding them.
* * Data migrations should *probably* always run. Older versions of the SDK will not have
* maintained invariants from later versions, so migrations that update values cannot assume
* that existing values have been properly maintained. Calculate them again, if applicable.
mikelehen (Contributor):

Great comment, thanks.

@gsoltis gsoltis merged commit 5176fb5 into master Dec 6, 2018
@gsoltis gsoltis deleted the gsoltis/idempotent_schema branch December 6, 2018 18:10
@firebase firebase locked and limited conversation to collaborators Oct 13, 2019
Labels: cla: yes, size/L