Skip to content

Ensure dc_backup_provider_new behaves well when cancelled #4192

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Tracked by #4180
flub opened this issue Mar 21, 2023 · 11 comments
Closed
Tracked by #4180

Ensure dc_backup_provider_new behaves well when cancelled #4192

flub opened this issue Mar 21, 2023 · 11 comments
Assignees

Comments

@flub
Copy link
Contributor

flub commented Mar 21, 2023

For the UI it is fairly normal that the dc_backup_provider_new() call is cancelled: the user might hit the back button which happens fairly commonly.

There are reports that sometimes it then takes a while for things to be cleared up and immediately calling dc_backup_provider_new() after cancelling makes it fail. Figure this out and make it work nicely.

@flub flub mentioned this issue Mar 21, 2023
25 tasks
@flub flub self-assigned this Mar 24, 2023
@flub
Copy link
Contributor Author

flub commented Mar 24, 2023

I've been poking at this and not finding anything. The only thing I can manage to get is that the ongoing processs has not been freed yet when calling the second prepare: but about 16-17 miliseconds after requesting the stop it is freed and a new process can start. This simply seems to be the time it takes to do those things, it involves acquiring some locks.

@r10s do you have any more hints about what problems exactly you see? How to reproduce?

@link2xt
Copy link
Collaborator

link2xt commented Mar 25, 2023

I have a desktop using the latest core 04daff0 and using desktop PR deltachat/deltachat-desktop#3150 for the UI (specifically commit deltachat/deltachat-desktop@7860cf4). On my Android phone I have an almost up-to-date self-built APK using core 83af248.

I have started offering a backup from Desktop and scanned the QR code using the phone. Backup transfer started, but looked slow to me, so I decided to try another Wi-Fi network. I cancelled backup transfer (likely from the phone, but not sure) and switched both devices to another Wi-Fi network.

Then I tried to start the backup transfer again, showed new QR code from the desktop and scanned it from the phone. The phone was forever trying to connect. I opened the log and looked at the IP address the phone was trying to use: it was correct, I verified by running ip addr on my laptop. Then I tried to cancel connection attempt and generate a new QR code, but failed to transfer again. However, after restarting the Desktop app and starting from scratch, I successfully transferred the backup.

I did not check whether desktop was really listening on the port (should have checked ss -lun or something like this), but it looks like desktop indeed got stuck in an incorrect state where it did not listen or did not respond to connection requests anymore until I restarted the whole app.

r10s added a commit to deltachat/deltachat-android that referenced this issue Mar 26, 2023
there might still be issues in core,
see chatmail/core#4192
r10s added a commit to deltachat/deltachat-android that referenced this issue Mar 27, 2023
there might still be issues in core,
see chatmail/core#4192
@link2xt
Copy link
Collaborator

link2xt commented Mar 30, 2023

Is it fixed by #4249?

@flub
Copy link
Contributor Author

flub commented Mar 31, 2023

Is it fixed by #4249?

I think it's part of it. But I'm not yet confident enough it fixes the entire issue and want to try do more testing.

@r10s
Copy link
Contributor

r10s commented Apr 1, 2023

for larger account, i can still reproduce the following on core 112.4, tested on ios:

  • tap "Settings / Add Second Device"
  • during "prepare", hit "cancel" and confirm abort
  • immediately hit "Add Second Device" again

-> Error: "there is already another ongoing process"

if you wait a bit after the error (no need to quit the app), you succeed in getting the qr code again (iirc, this was not possible on first tries, so things already got better :)

i could not reproduce that on smaller accounts. the following video is on a multi-gb account:

RPReplay_Final1680382983.mov

@flub
Copy link
Contributor Author

flub commented Apr 2, 2023

So that certainly sounds like it is an improvement. There is always going to be some delay. The ongoing mechanism in core allows for some time before you release the ongoing process, as long as that's not more than a few seconds that should be fine. At least that's how it is currently designed.

I guess UX wise that complicates things though, there's no API to wait until the ongoing process is released. With how the ongoing process is designed now (in core) the UI should probably show something like "cancelling, please stand by..." after stopping the ongoing process and once it is freed you can then start doing things again. This also helps with other things, as e.g. the paused IO needs to be resumed again etc. This can't be instantaneous.

@flub
Copy link
Contributor Author

flub commented Apr 2, 2023

So from my core-centric point of view what is missing is an API to wait until core has freed the ongoing process after cancelling/stopping it. Maybe this can be done by making stop_ongoing a blocking call?

@flub
Copy link
Contributor Author

flub commented Apr 2, 2023

#4289 adds logging to see how long it really takes to between stop ongoing and free ongoing.

@r10s
Copy link
Contributor

r10s commented Nov 11, 2024

closing stale issue. things are not that bad, it is just that an error message may appear when trying to start backup too fast after each other, in my tests, nothing crashes or so. a moment later, add-second-device works again as expected.

also, this is mitigated a little bit by (a) pareparation got faster iirc and (b) by a dialog and device-owner-verification beforehand

@r10s r10s closed this as not planned Won't fix, can't repro, duplicate, stale Nov 11, 2024
@link2xt
Copy link
Collaborator

link2xt commented Nov 15, 2024

Probably was fixed in #6027

@r10s
Copy link
Contributor

r10s commented Nov 15, 2024

cool! thanks for fishing that PR out and referencing it here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants