Firestore Crash - Mutation batchIDs must be acknowledged in order #772

SL62 · 2018-02-08T13:00:20Z

[REQUIRED] Step 2: Describe your environment

Xcode version: 9.2
Firebase SDK version: 4.8.1
Firebase Component: Firestore
Component version: 0.10.0

[REQUIRED] Step 3: Describe the problem

Assertion failure in -[FSTLevelDBMutationQueue acknowledgeBatch:streamToken:group:], third_party/firebase/ios/Source/Firestore/Source/Local/FSTLevelDBMutationQueue.mm:235

Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: 'FIRESTORE INTERNAL ASSERTION FAILED: Mutation batchIDs must be acknowledged in order'

Steps to reproduce:

Using the new disableNetwork() and enableNetwork() methods under certain scenarios causes the app to crash.

Yet to determine the exact scenarios, but it seems to occur when calling disableNetwork() when the app is in the middle of reading data from the web or writing data.
Moving the disableNetwork() method to an earlier part of the code appears to eliminate / reduce the frequency of crashes.

Logs:

* thread #29, queue = 'com.google.firebase.firestore', stop reason = signal SIGABRT
  * frame #0: 0x0000000182ee82e8 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x0000000182ffd2f8 libsystem_pthread.dylib`pthread_kill$VARIANT$mp + 396
    frame #2: 0x0000000182e56fbc libsystem_c.dylib`abort + 140
    frame #3: 0x00000001825f4068 libc++abi.dylib`abort_message + 132
    frame #4: 0x00000001825f4210 libc++abi.dylib`default_terminate_handler() + 304
    frame #5: 0x000000018261c810 libobjc.A.dylib`_objc_terminate() + 124
    frame #6: 0x000000018260c54c libc++abi.dylib`std::__terminate(void (*)()) + 16
    frame #7: 0x000000018260bea8 libc++abi.dylib`__cxa_throw + 136
    frame #8: 0x000000018261c65c libobjc.A.dylib`objc_exception_throw + 364
    frame #9: 0x00000001833d3038 CoreFoundation`+[NSException raise:format:arguments:] + 104
    frame #10: 0x0000000183d6d7f4 Foundation`-[NSAssertionHandler handleFailureInMethod:object:file:lineNumber:description:] + 112
    frame #11: 0x0000000102a48040 <AppName>`-[FSTLevelDBMutationQueue acknowledgeBatch:streamToken:group:] + 480
    frame #12: 0x0000000102a52d2c <AppName>`-[FSTLocalStore acknowledgeBatchWithResult:] + 252
    frame #13: 0x0000000102a3bbb0 <AppName>`-[FSTSyncEngine applySuccessfulWriteWithResult:] + 184
    frame #14: 0x0000000102a73374 <AppName>`-[FSTRemoteStore writeStreamDidReceiveResponseWithVersion:mutationResults:] + 320
    frame #15: 0x0000000102a7ff58 <AppName>`-[FSTWriteStream handleStreamMessage:] + 692
    frame #16: 0x0000000102a7e6c0 <AppName>`__24-[FSTStream writeValue:]_block_invoke + 468
    frame #17: 0x0000000105d712cc libdispatch.dylib`_dispatch_call_block_and_release + 24
    frame #18: 0x0000000105d7128c libdispatch.dylib`_dispatch_client_callout + 16
    frame #19: 0x0000000105d7ff80 libdispatch.dylib`_dispatch_queue_serial_drain + 696
    frame #20: 0x0000000105d747ec libdispatch.dylib`_dispatch_queue_invoke + 332
    frame #21: 0x0000000105d80f6c libdispatch.dylib`_dispatch_root_queue_drain_deferred_wlh + 428
    frame #22: 0x0000000105d88020 libdispatch.dylib`_dispatch_workloop_worker_thread + 652
    frame #23: 0x0000000182ffaf1c libsystem_pthread.dylib`_pthread_wqthread + 932
    frame #24: 0x0000000182ffab6c libsystem_pthread.dylib`start_wqthread + 4

The text was updated successfully, but these errors were encountered:

wilhuff · 2018-02-13T20:22:09Z

We're tracking this as b/73293239 internally and are looking into this.

ryanwilson · 2018-03-07T13:40:15Z

This fix is now published in Firebase 4.10.0.

SL62 · 2018-03-07T16:05:27Z

I just updated to the latest Firebase SDK 4.10.0 and unfortunately I'm still getting the crash.
The stack trace (below) is a bit different now.

Could this issue be reopened please?

* thread #32, queue = 'com.google.firebase.firestore', stop reason = signal SIGABRT
    frame #0: 0x0000000184ad02e8 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x0000000184be52f8 libsystem_pthread.dylib`pthread_kill$VARIANT$mp + 396
    frame #2: 0x0000000184a3efbc libsystem_c.dylib`abort + 140
    frame #3: 0x00000001841dc068 libc++abi.dylib`abort_message + 132
    frame #4: 0x00000001841dc210 libc++abi.dylib`default_terminate_handler() + 304
    frame #5: 0x0000000184204810 libobjc.A.dylib`_objc_terminate() + 124
    frame #6: 0x00000001841f454c libc++abi.dylib`std::__terminate(void (*)()) + 16
    frame #7: 0x00000001841f3ea8 libc++abi.dylib`__cxa_throw + 136
    frame #8: 0x000000018420465c libobjc.A.dylib`objc_exception_throw + 364
    frame #9: 0x0000000184fbb038 CoreFoundation`+[NSException raise:format:arguments:] + 104
    frame #10: 0x00000001859557f4 Foundation`-[NSAssertionHandler handleFailureInMethod:object:file:lineNumber:description:] + 112
    frame #11: 0x0000000104b0be14 AppName`-[FSTLevelDBMutationQueue acknowledgeBatch:streamToken:group:] + 480
    frame #12: 0x0000000104b17d30 AppName`-[FSTLocalStore acknowledgeBatchWithResult:] + 252
    frame #13: 0x0000000104afdc44 AppName`-[FSTSyncEngine applySuccessfulWriteWithResult:] + 172
    frame #14: 0x0000000104b3eccc AppName`-[FSTRemoteStore writeStreamDidReceiveResponseWithVersion:mutationResults:] + 296
    frame #15: 0x0000000104b4e95c AppName`-[FSTWriteStream handleStreamMessage:] + 696
    frame #16: 0x0000000104b4cbf8 AppName`-[FSTStream writeValue:] + 452
    frame #17: 0x0000000104b4a774 AppName`-[FSTCallbackFilter writeValue:] + 88
  * frame #18: 0x0000000106cbcc08 RxLibrary`__57-[GRXConcurrentWriteable enqueueValue:completionHandler:]_block_invoke((null)=<unavailable>) at GRXConcurrentWriteable.m:60
    frame #19: 0x0000000107fd52cc libdispatch.dylib`_dispatch_call_block_and_release + 24
    frame #20: 0x0000000107fd528c libdispatch.dylib`_dispatch_client_callout + 16
    frame #21: 0x0000000107fe3f80 libdispatch.dylib`_dispatch_queue_serial_drain + 696
    frame #22: 0x0000000107fd87ec libdispatch.dylib`_dispatch_queue_invoke + 332
    frame #23: 0x0000000107fe4f6c libdispatch.dylib`_dispatch_root_queue_drain_deferred_wlh + 428
    frame #24: 0x0000000107fec020 libdispatch.dylib`_dispatch_workloop_worker_thread + 652
    frame #25: 0x0000000184be2f1c libsystem_pthread.dylib`_pthread_wqthread + 932
    frame #26: 0x0000000184be2b6c libsystem_pthread.dylib`start_wqthread + 4

SL62 · 2018-03-07T16:09:19Z

Just to add a bit more info, the crash happens here:

@interface GRXConcurrentWriteable ()
// This is atomic so that cancellation can nillify it from any thread.
@property(atomic, strong) id<GRXWriteable> writeable;
@end

@implementation GRXConcurrentWriteable {
  dispatch_queue_t _writeableQueue;
  // This ensures that writesFinishedWithError: is only sent once to the writeable.
  BOOL _alreadyFinished;
}

- (instancetype)init {
  return [self initWithWriteable:nil];
}

// Designated initializer
- (instancetype)initWithWriteable:(id<GRXWriteable>)writeable
                    dispatchQueue:(dispatch_queue_t)queue {
  if (self = [super init]) {
    _writeableQueue = queue;
    _writeable = writeable;
  }
  return self;
}

- (instancetype)initWithWriteable:(id<GRXWriteable>)writeable {
  return [self initWithWriteable:writeable
                   dispatchQueue:dispatch_get_main_queue()];
}

- (void)enqueueValue:(id)value completionHandler:(void (^)(void))handler {
  dispatch_async(_writeableQueue, ^{
    // We're racing a possible cancellation performed by another thread. To turn all already-
    // enqueued messages into noops, cancellation nillifies the writeable property. If we get it
    // before it's nil, we won the race.
    id<GRXWriteable> writeable = self.writeable;
    if (writeable) {
      [writeable writeValue:value]; //<----------- Crashes at this line
      handler();
    }
  });
}

mikelehen · 2018-03-07T16:43:23Z

Reopening issue. Would it be possible to capture logs of the issue by adding FirebaseConfiguration.shared.setLoggerLevel(.debug) to your app?

SL62 · 2018-03-07T16:58:10Z

Thanks for reopening.
I've added the logger as requested and I'm getting the following in the debug console:

2018-03-07 16:50:56.860098+0000 AppName[8052:2612896] *** Assertion failure in -[FSTLevelDBMutationQueue acknowledgeBatch:streamToken:group:], third_party/firebase/ios/Source/Firestore/Source/Local/FSTLevelDBMutationQueue.mm:235
2018-03-07 16:50:56.860098+0000 AppName[8052:2612857] 4.10.0 - [Firebase/Firestore][I-FST000001] FSTWriteStream 0x1c0314130 response: <GCFSWriteResponse 0x1c04c0620>: {
    stream_token: "\020\001\031\020hB\002\201\364\265\265"
    write_results {
      update_time {
        seconds: 1518214069
        nanos: 122079000
      }
    }
    commit_time {
      seconds: 1520441456
      nanos: 42195000
    }
}
2018-03-07 16:50:56.864805+0000 AppName[8052:2612896] *** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: 'FIRESTORE INTERNAL ASSERTION FAILED: Mutation batchIDs must be acknowledged in order'
*** First throw call stack:
(0x184fbb164 0x184204528 0x184fbb038 0x1859557f4 0x104cf3e14 0x104cffd30 0x104ce5c44 0x104d26ccc 0x104d3695c 0x104d34bf8 0x104d32774 0x106e98c08 0x1081b12cc 0x1081b128c 0x1081bff80 0x1081b47ec 0x1081c0f6c 0x1081c8020 0x184be2f1c 0x184be2b6c)
libc++abi.dylib: terminating with uncaught exception of type NSException

Does this help or do you need more of the logs?

gsoltis · 2018-03-07T17:20:15Z

More of the logs would be helpful, especially around where disableNetwork is called, if you have them.

SL62 · 2018-03-07T17:44:14Z

No problem, I've added all logs between calling disableNetwork up until just after the completion handler returns for enableNetwork:

2018-03-07 17:30:16.195321+0000 AppName[8116:2633289] 4.10.0 - [Firebase/Firestore][I-FST000001] FSTWatchStream 0x1c04eca00 stop
2018-03-07 17:30:16.203204+0000 AppName[8116:2633289] 4.10.0 - [Firebase/Firestore][I-FST000001] FSTWatchStream 0x1c04eca00 Performing stream teardown
2018-03-07 17:30:16.212876+0000 AppName[8116:2633289] 4.10.0 - [Firebase/Firestore][I-FST000001] FSTWatchStream 0x1c04eca00 Closing stream client-side
2018-03-07 17:30:16.213454+0000 AppName[8116:2633289] 4.10.0 - [Firebase/Firestore][I-FST000001] FSTWriteStream 0x1c0310e00 stop
2018-03-07 17:30:16.213946+0000 AppName[8116:2633289] 4.10.0 - [Firebase/Firestore][I-FST000001] FSTWriteStream 0x1c0310e00 Performing stream teardown
2018-03-07 17:30:16.214401+0000 AppName[8116:2633289] 4.10.0 - [Firebase/Firestore][I-FST000001] FSTWriteStream 0x1c0310e00 mutation request: <GCFSWriteRequest 0x1c0afaf80>: {
    stream_token: "\020\007\031\020hB\002\201\364\265\265"
}
2018-03-07 17:30:16.215678+0000 AppName[8116:2633289] 4.10.0 - [Firebase/Firestore][I-FST000001] FSTWriteStream 0x1c0310e00 Closing stream client-side
2018-03-07 17:30:16.216123+0000 AppName[8116:2633289] 4.10.0 - [Firebase/Firestore][I-FST000001] Stopping write stream with 0 pending writes
2018-03-07 17:30:16.235158+0000 AppName[8116:2633389] 4.10.0 - [Firebase/Analytics][I-ACS031006] View controller already tracked. Class, ID: InboxVC, -4887308447020283543
2018-03-07 17:30:16.273351+0000 AppName[8116:2633289] 4.10.0 - [Firebase/Firestore][I-FST000001] Committing <FSTWriteGroup for Release query: 0 changes (0 bytes):>
2018-03-07 17:30:16.278813+0000 AppName[8116:2633289] 4.10.0 - [Firebase/Firestore][I-FST000001] Committing <FSTWriteGroup for Release query: 0 changes (0 bytes):>
2018-03-07 17:30:16.279138+0000 AppName[8116:2633289] 4.10.0 - [Firebase/Firestore][I-FST000001] Committing <FSTWriteGroup for Release query: 0 changes (0 bytes):>
2018-03-07 17:30:16.301166+0000 AppName[8116:2633289] 4.10.0 - [Firebase/Firestore][I-FST000001] Committing <FSTWriteGroup for Release query: 0 changes (0 bytes):>
2018-03-07 17:30:16.301731+0000 AppName[8116:2633289] 4.10.0 - [Firebase/Firestore][I-FST000001] Committing <FSTWriteGroup for Release query: 0 changes (0 bytes):>
2018-03-07 17:30:16.301981+0000 AppName[8116:2633289] 4.10.0 - [Firebase/Firestore][I-FST000001] Committing <FSTWriteGroup for Release query: 0 changes (0 bytes):>
2018-03-07 17:30:16.302154+0000 AppName[8116:2633289] 4.10.0 - [Firebase/Firestore][I-FST000001] FSTWatchStream 0x1c08ff900 start
2018-03-07 17:30:16.302321+0000 AppName[8116:2633289] 4.10.0 - [Firebase/Firestore][I-FST000001] FSTWriteStream 0x1c03085e0 start

Between the above and the actual crash, there's about 2000 lines of logs (mostly due to all the active listeners going back online and a few writes being executed).

mikelehen · 2018-03-07T18:27:35Z

Thanks @SL62. I think we still need more of the logs. Everything in that last snippet looks as-expected. To ensure we understand what's going on, I'd really like /all/ of the Firestore logs leading up to the crash (both before and after the snippet you pasted). I suspect that we're somehow re-sending a write we shouldn't be or else skipping a write. To verify this I'll need to see the preceding writes and their results.

If the logs are too long you could put them in a gist. Or if they contain sensitive data but you're willing to share privately, you can email them to me at [email protected]. Without logs, we'll probably need a concrete way for us to reproduce the error so that we can investigate what's going on.

Thanks and sorry for the inconvenience!

SL62 · 2018-03-07T23:03:52Z

No problem, I've sent you an email with the full log as it does contain sensitive data.
I've also discovered another bug with sdk 4.10.0 as mentioned in the email.
Please let me know if you need any more info.

getNextMutationBatchAfterBatchId() was not respecting highestAcknowledgedBatchId and therefore we were resending writes after the network was disabled / re-enabled as reported here: firebase/firebase-ios-sdk#772

wilhuff · 2018-03-12T15:29:54Z

Hi @SL62 we released Firebase 4.10.1 which addresses the memory corruption issue which was the root cause of the bug you were seeing. Let us know how it goes. Thanks for your patience!

wilhuff · 2018-03-12T16:51:26Z

Whoops. 4.10.1 fixes the secondary issue you emailed about, but we need an additional change to actually address the root cause of this issue.

Port of: firebase/firebase-js-sdk#559 Fixes #772 getNextMutationBatchAfterBatchId() was not respecting highestAcknowledgedBatchId and therefore we were resending writes after the network was disabled / re-enabled.

…559) getNextMutationBatchAfterBatchId() was not respecting highestAcknowledgedBatchId and therefore we were resending writes if they had been acknowledged but not removed (aka the held write acks case). This showed up when a user disabled / enabled the network as reported here and I've included a spec test to emulate that case: firebase/firebase-ios-sdk#772

…909) Port of: firebase/firebase-js-sdk#559 Should address #772 once released. getNextMutationBatchAfterBatchId() was not respecting highestAcknowledgedBatchId and therefore we were resending writes after the network was disabled / re-enabled.

paulb777 · 2018-03-21T03:17:33Z

Fixed in Firebase 4.11.0 FirebaseFirestore 0.10.4

…irebase#909) Port of: firebase/firebase-js-sdk#559 Should address firebase#772 once released. getNextMutationBatchAfterBatchId() was not respecting highestAcknowledgedBatchId and therefore we were resending writes after the network was disabled / re-enabled.

paulb777 added the api: firestore label Feb 8, 2018

morganchen12 assigned wilhuff Feb 8, 2018

wilhuff assigned gsoltis and unassigned wilhuff Feb 13, 2018

gsoltis mentioned this issue Feb 13, 2018

Ignore events on inactive streams #790

Merged

gsoltis closed this as completed in #790 Feb 14, 2018

morganchen12 reopened this Feb 14, 2018

morganchen12 added the future minor release label Feb 14, 2018

ryanwilson closed this as completed Mar 7, 2018

mikelehen reopened this Mar 7, 2018

paulb777 removed the future minor release label Mar 10, 2018

mikelehen mentioned this issue Mar 12, 2018

Fix MutationQueue issue resulting in re-sending acknowledged writes. firebase/firebase-js-sdk#559

Merged

wilhuff closed this as completed Mar 12, 2018

wilhuff reopened this Mar 12, 2018

wilhuff added the future minor release label Mar 12, 2018

mikelehen mentioned this issue Mar 12, 2018

Fix MutationQueue issue resulting in re-sending acknowledged writes. #909

Merged

mikelehen closed this as completed in #909 Mar 12, 2018

morganchen12 reopened this Mar 12, 2018

paulb777 closed this as completed Mar 21, 2018

morganchen12 added released and removed future minor release labels Apr 9, 2018

firebase locked and limited conversation to collaborators Nov 7, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Firestore Crash - Mutation batchIDs must be acknowledged in order #772

Firestore Crash - Mutation batchIDs must be acknowledged in order #772

SL62 commented Feb 8, 2018

wilhuff commented Feb 13, 2018

ryanwilson commented Mar 7, 2018

SL62 commented Mar 7, 2018

SL62 commented Mar 7, 2018

mikelehen commented Mar 7, 2018

SL62 commented Mar 7, 2018

gsoltis commented Mar 7, 2018

SL62 commented Mar 7, 2018

mikelehen commented Mar 7, 2018

SL62 commented Mar 7, 2018

wilhuff commented Mar 12, 2018

wilhuff commented Mar 12, 2018

paulb777 commented Mar 21, 2018

Firestore Crash - Mutation batchIDs must be acknowledged in order #772

Firestore Crash - Mutation batchIDs must be acknowledged in order #772

Comments

SL62 commented Feb 8, 2018

[REQUIRED] Step 2: Describe your environment

[REQUIRED] Step 3: Describe the problem

Steps to reproduce:

Logs:

wilhuff commented Feb 13, 2018

ryanwilson commented Mar 7, 2018

SL62 commented Mar 7, 2018

SL62 commented Mar 7, 2018

mikelehen commented Mar 7, 2018

SL62 commented Mar 7, 2018

gsoltis commented Mar 7, 2018

SL62 commented Mar 7, 2018

mikelehen commented Mar 7, 2018

SL62 commented Mar 7, 2018

wilhuff commented Mar 12, 2018

wilhuff commented Mar 12, 2018

paulb777 commented Mar 21, 2018