Skip to content

make_numbered_dir() multi-process-safe #143

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jul 25, 2017

Conversation

aurzenligl
Copy link
Contributor

Solves: #30

Patch uses ".lock" file as means to gain exclusive access to directory to use or to remove. Both user and remover race to create exclusively ".lock" files in directories starting with "prefix". If user wins, directory shall not be removed during usage. If remover wins, directory shall be "removed atomically", i.e. renamed, so that following race condition doesn't happen: 1) two removers pass condition for removing directory, 2) first is preempted, 3) second removes it, 4) user creates new directory with the same name, 5) first remover removes user's directory.

Patch introduces new dependencies to code: uuid, and to tests: multiprocessing. Both can be found in stdlib as early as 2.6.

Tests which passed at time of forking pass still, one which failed fails still:
testing/path/test_svnauth.py::TestSvnWCAuthFunctional::()::test_switch

Krzysztof Laskowski added 9 commits July 19, 2017 02:13
  - cpu_count processes are used
  - more iterations are executed
means of .lock file utilization.

There are two kinds of actors: user(single) and remover(many).
User, having created .lock in directory can be sure that he's the owner
of directory. Remover, having created .lock in directory, is sure that
he's the only owner of directory. Only safe operation for remover to do
is to rename to-be-removed, exlusively-owned directory, pattern of which
is used by remover actors alone. This is needed to avoid situation when
two removers schedule removal of certain directory, one is preempted
and incorrectly removes newly created directory of some working user.
KeyboardInterrupt comes during multiprocess.Pool task execution.
Python 3 solves this problem out of the box, but Python 2 demands this
manual os.exit
…tory

with id outside of "keep" range has to be removed conditionlessly
@RonnyPfannschmidt RonnyPfannschmidt requested a review from hpk42 July 19, 2017 16:45
@RonnyPfannschmidt RonnyPfannschmidt merged commit 30f9978 into pytest-dev:master Jul 25, 2017
@RonnyPfannschmidt
Copy link
Member

@aurzenligl thanks for this great work, it took a while to find the space of mind to go trough and understand 👍

@aurzenligl
Copy link
Contributor Author

It's a pleasure to contribute to project one thinks highly of.

@RonnyPfannschmidt
Copy link
Member

@nicoddemus i think this is the pr we will have to undo if we cnat sort out the windows issue with the x flag to open

@aurzenligl please take a look at #157 since the improved lockfile handling turns out to trigger issues on windows that i missed

@aurzenligl
Copy link
Contributor Author

Current code, in order to use .lock-based mutually exclusive access to directory, uses:

lockfile = path.join('.lock')
if hasattr(lockfile, 'mksymlinkto'):
    lockfile.mksymlinkto(str(mypid))
else:
    lockfile.write(str(mypid), 'wx')

Above condition is never true for windows, hence - ironically - the only platform which uses 'wx'-mode open is the one which doesn't support it:

iswin32 = sys.platform == "win32" or (getattr(os, '_name', False) == 'nt')
# ...
FSBase = not iswin32 and PosixPath or common.PathBase

Fortunately, we have low-level file API accessible in both pythons and both win and linux:
https://docs.python.org/2/library/os.html#os.open
https://docs.python.org/3/library/os.html#os.open

This, instead of lockfile.write should do the trick. It should rethrow py.error.EEXIST to conform to LocalPath APIs though.

try:
    fd = os.open(filename, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0644)
except IOError:
    # ...
with os.fdopen(fd, 'w') as lockfile:
    # ...

@nicoddemus test which checks multi-process safety of make_numbered_dir is:
path/test_local.py::test_make_numbered_dir_multiprocess_safe
There is no guarantee that it will catch race error, as it depends on filesystem, number of cores, number of loops. It can be said that without 'x' flag code would be neither more nor less safe for multi-process pytest then before pr.

@nicoddemus
Copy link
Member

Thanks a lot @aurzenligl, I will give it a spin tomorrow on my other PR!

@nicoddemus
Copy link
Member

Thanks @aurzenligl the solution seemed to work (sorry for the delay). Just pushed my changes to #157.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants