Add randomtemp to build scripts #448

Merged 1 commit on May 29, 2020

14 changes: 14 additions & 0 deletions conda/pytorch-nightly/bld.bat
@@ -65,6 +65,20 @@ if "%CUDA_VERSION%" == "80" (
set "CUDAHOSTCXX=%VS140COMNTOOLS%\..\..\VC\bin\amd64\cl.exe"
)

:: randomtemp is used to resolve the intermittent build error related to CUDA.
:: code: https://github.com/peterjc123/randomtemp
:: issue: https://github.com/pytorch/pytorch/issues/25393
::
:: Previously, CMake uses CUDA_NVCC_EXECUTABLE for finding nvcc and then
:: the calls are redirected to sccache. sccache looks for the actual nvcc
:: in PATH, and then pass the arguments to it.
:: Currently, randomtemp is placed before sccache (%TMP_DIR_WIN%\bin\nvcc)
:: so we are actually pretending sccache instead of nvcc itself.
curl -kL https://github.com/peterjc123/randomtemp/releases/download/v0.3/randomtemp.exe --output %SRC_DIR%\tmp_bin\randomtemp.exe
set RANDOMTEMP_EXECUTABLE=%SRC_DIR%\tmp_bin\nvcc.exe
set CUDA_NVCC_EXECUTABLE=%SRC_DIR%\tmp_bin\randomtemp.exe
Comment on lines +78 to +79
Contributor

Shouldn't this be the other way around?

Contributor Author

Could you please give an example?

Contributor Author

Currently, we can only work around this issue by retrying several times.

Contributor

Why does RANDOMTEMP_EXECUTABLE point to nvcc while CUDA_NVCC_EXECUTABLE points to randomtemp.exe?

Contributor Author

Because randomtemp.exe acts as a shim executable: it just passes the arguments to the actual program, which is specified by the variable RANDOMTEMP_EXECUTABLE.
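
The shim pattern described here can be sketched in a few lines of Python. This is only an illustration, not randomtemp's actual implementation: the function name `run_shim` is hypothetical, and the per-call scratch directory under `RANDOMTEMP_BASEDIR` (exported to the child as `TMP`/`TEMP`) is an assumption inferred from the tool's name and the variables set in this PR.

```python
import os
import subprocess
import sys
import tempfile


def run_shim(args, environ=None):
    """Forward `args` to the program named by RANDOMTEMP_EXECUTABLE.

    Hypothetical sketch: the fresh temp directory per call is an assumption
    about what randomtemp does, not something confirmed in this thread.
    """
    env = dict(os.environ if environ is None else environ)
    target = env["RANDOMTEMP_EXECUTABLE"]            # the real program to run
    basedir = env.get("RANDOMTEMP_BASEDIR") or None  # None -> system default

    # Give this invocation its own scratch directory, removed afterwards.
    with tempfile.TemporaryDirectory(dir=basedir) as scratch:
        env["TMP"] = env["TEMP"] = scratch
        completed = subprocess.run([target, *args], env=env)

    # Propagate the wrapped program's exit code to the caller.
    return completed.returncode
```

With the variables from this PR, CMake invokes `randomtemp.exe` through `CUDA_NVCC_EXECUTABLE`, and the shim in turn runs the `nvcc.exe` named by `RANDOMTEMP_EXECUTABLE`, which is why the two assignments look reversed at first glance.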

Member

Yeah, I agree with @malfet; this does seem a bit confusing when you look at it.

Maybe it'd be better to add a comment here explaining why they point where they point to.

Contributor

I think I understand how it works. The naming of the variables is a bit confusing, but it does match what is on master.

set RANDOMTEMP_BASEDIR=%SRC_DIR%\tmp_bin

:cuda_end

set CMAKE_GENERATOR=Ninja
14 changes: 14 additions & 0 deletions windows/build_pytorch.bat
@@ -101,6 +101,20 @@ if "%USE_SCCACHE%" == "1" (
set CUDA_NVCC_EXECUTABLE=%CD%\tmp_bin\nvcc
set ADDITIONAL_PATH=%CD%\tmp_bin
set SCCACHE_IDLE_TIMEOUT=1500

:: randomtemp is used to resolve the intermittent build error related to CUDA.
:: code: https://github.com/peterjc123/randomtemp
:: issue: https://github.com/pytorch/pytorch/issues/25393
::
:: Previously, CMake uses CUDA_NVCC_EXECUTABLE for finding nvcc and then
:: the calls are redirected to sccache. sccache looks for the actual nvcc
:: in PATH, and then pass the arguments to it.
:: Currently, randomtemp is placed before sccache (%TMP_DIR_WIN%\bin\nvcc)
:: so we are actually pretending sccache instead of nvcc itself.
curl -kL https://github.com/peterjc123/randomtemp/releases/download/v0.3/randomtemp.exe --output %CD%\tmp_bin\randomtemp.exe
set RANDOMTEMP_EXECUTABLE=%CD%\tmp_bin\nvcc.exe
set CUDA_NVCC_EXECUTABLE=%CD%\tmp_bin\randomtemp.exe
set RANDOMTEMP_BASEDIR=%CD%\tmp_bin
)
)