
Merge from upstream #43

Merged Jul 20, 2018 · 64 commits

Commits
52cc073
Implement reshape_as (#9452)
vishwakftw Jul 17, 2018
7d2a178
test_cuda: ensure tests use float and adjust HalfTensor tolerances (#…
hartb Jul 17, 2018
050a258
change stft to have consistent signature with librosa (#9497)
ssnl Jul 17, 2018
e567879
Reenable multiprocessing preserve sharing tests on ASAN. (#9498)
ezyang Jul 17, 2018
d2d4382
Delete flag from THTensor. (#9494)
ezyang Jul 17, 2018
30f849c
Correct model name in caffe2 onnx backend tests
bddppq Jul 17, 2018
8be4657
Add ideep copy for TensorCPU<long> in IDEEPFallbackOp (#9480)
viswanathgs Jul 17, 2018
890037e
Fix (non-reduction) ops over a dimension for n-dimensional empty tens…
gchanan Jul 17, 2018
2249751
Add OptimizerBase::add_parameters (#9472)
goldsborough Jul 17, 2018
9b0c53a
Deduplicate THTensor and THCTensor. (#9495)
ezyang Jul 17, 2018
0fe980c
Memory usage measurement -- Caffe2 (#9017)
lilinyy09 Jul 17, 2018
6116954
oss heatmap_max_keypoint_op
wat3rBro Jul 17, 2018
5c695e3
Implement 2D and 3D alpha_dropout (#9073)
tippisum Jul 18, 2018
1c3580b
Added hash for device (#9246)
vishwakftw Jul 18, 2018
13e0c92
Add Support for count_include_pad in AveragePool in Caffe2 ONNX Backe…
hl475 Jul 18, 2018
c33d2c0
Thread-safe dispatcher table (#9126)
Jul 18, 2018
004d924
Give THTensor a constructor, use new/free. (#9496)
ezyang Jul 18, 2018
aa73348
added reminder of args naming rules to readme (#9504)
weiyangfb Jul 18, 2018
543d4af
Be strict prototypes clean. (#9516)
ezyang Jul 18, 2018
6de0382
Add random data filler to predictor bench to support production nets …
highker Jul 18, 2018
89db578
Fixed a typo
lewha0 Jul 18, 2018
73225e4
add docs for using `python setup.py clean` in developing mode (#9524)
fehiepsi Jul 18, 2018
3eb3f03
ROCm contributions week 28 (#9432)
iotamudelta Jul 18, 2018
5760821
Make squeeze doc consistent with it's behaviour (#9529)
albanD Jul 18, 2018
5eaed75
Implementing torch.isfinite (#9487)
bhushan23 Jul 18, 2018
f277645
Support N-dimensional empty tensors in CPU BLAS and (a selection of) …
gchanan Jul 18, 2018
8fe2622
Fix gatherTopK template (#9231)
malfet Jul 18, 2018
28954b9
Fix RoIAlignOp GPU implementation for RoIs without batch index (#9230)
Jul 18, 2018
07fb072
Merge remote-tracking branch 'upstream/master'
iotamudelta Jul 18, 2018
d6e124e
Dummy CircleCI config. (#9537)
ezyang Jul 18, 2018
35f7925
fix small literals being flushed to 0 by std::to_string
Jul 18, 2018
27455e9
Use _six for inf and nan (#9500)
ssnl Jul 18, 2018
b6b6e1b
Fix core.Plan.create_from_proto (#9438)
volkhin Jul 18, 2018
8c741b7
Add transformation from caffe2::resizeop to onnx::upsample
Jokeren Jul 18, 2018
ca3b36a
Add implementation for batch_moments_op (#9510)
xiaomengy Jul 18, 2018
c506ff9
Disable py2-clang3.8-rocmnightly-ubuntu16.04-test in disabled-configs…
ezyang Jul 18, 2018
8769fec
Move clamp into ATen (#9506)
cpuhrsch Jul 18, 2018
3b88650
Add CUDAGuard to ATen (#9277)
goldsborough Jul 18, 2018
4c615b1
Introduce libtorch to setup.py build (#8792)
anderspapitto Jul 18, 2018
c1ee883
Constructors and member functions for THStorage (#9357)
cpuhrsch Jul 18, 2018
04b33b7
Add byte_weight_dequant_op
Jokeren Jul 18, 2018
b3e141e
Add predictor config into Predictor (#9434)
Jul 18, 2018
604f7e9
Expose CAFFE2_USE_OPENCV preprocessor flag (#9509)
viswanathgs Jul 19, 2018
45f0d05
Adapt OnnxifiOp to removed suffix handling in ONNXIFI loader (#9571)
Jul 19, 2018
54db14e
HIP Operators Generator--> HipOpG (#9322)
petrex Jul 19, 2018
e0446fc
Pass dtype to tensor contructor in test_neg (#9558)
Jul 19, 2018
aee9e90
Fix TestAutograd.test_as_strided (#9538)
ssnl Jul 19, 2018
d4fa0e6
Merge remote-tracking branch 'rocm_upstream/master'
iotamudelta Jul 19, 2018
9af5625
Merge remote-tracking branch 'upstream/master'
iotamudelta Jul 19, 2018
f180373
Support n-dimensional empty tensors in CUDA BLAS and fix a btrifact b…
gchanan Jul 19, 2018
f33cd36
Use int64_t for im2col and col2im (#9590)
ssnl Jul 19, 2018
85b2816
quick patch for PackPadded removal to propagate the correct size. (#9…
anderspapitto Jul 19, 2018
6557856
Fix l2 normalization when handling zero vector (#9594)
bairdzhang Jul 19, 2018
f521823
Do not always set broadcast argument when exporting new onnx add and …
bddppq Jul 19, 2018
a08119a
Eliminate direct access to size/strides of THTensor; replace them wit…
ezyang Jul 19, 2018
bcf0bf4
Extend DispatchStub to support CUDA dispatch (#9579)
colesbury Jul 19, 2018
7e78e80
Make error message for empty module friendlier (#9565)
goldsborough Jul 19, 2018
b770156
Functional DataParallel (#9234)
goldsborough Jul 19, 2018
5651b27
Add CAFFE_STATIC_EVENT to Stats (#9501)
Jul 19, 2018
aa7af94
Make JIT tracing a thread-local property (#9414)
apaszke Jul 20, 2018
4028ff6
Revert "quick patch for PackPadded removal to propagate the correct s…
anderspapitto Jul 20, 2018
bfe2aa0
docs fixes (#9607)
ssnl Jul 20, 2018
2a0018f
Add scatter_add_ doc (#9630)
ssnl Jul 20, 2018
5de7af0
Merge remote-tracking branch 'upstream/master'
iotamudelta Jul 20, 2018
Files changed
7 changes: 7 additions & 0 deletions .circleci/config.yml
@@ -0,0 +1,7 @@
version: 2
jobs:
  build:
    docker:
      - image: circleci/python:3.7-node-browsers
    steps:
      - run: echo "hello world"
4 changes: 4 additions & 0 deletions .jenkins/caffe2/build.sh
@@ -155,6 +155,9 @@ if [[ $BUILD_ENVIRONMENT == *rocm* ]]; then
   export LANG=C.UTF-8
   export LC_ALL=C.UTF-8
   export HCC_AMDGPU_TARGET=gfx900
+
+  ########## HIPIFY Caffe2 operators
+  ${PYTHON} "${ROOT_DIR}/tools/amd_build/build_caffe2_amd.py"
 fi

 # Try to include Redis support for Linux builds
@@ -195,6 +198,7 @@ else
 fi


+
 ###############################################################################
 # Configure and make
 ###############################################################################
2 changes: 1 addition & 1 deletion .jenkins/pytorch/build.sh
@@ -104,5 +104,5 @@ if [[ "$BUILD_TEST_LIBTORCH" == "1" ]]; then
   echo "Building libtorch"
   # NB: Install outside of source directory (at the same level as the root
   # pytorch folder) so that it doesn't get cleaned away prior to docker push.
-  WERROR=1 VERBOSE=1 tools/cpp_build/build_all.sh "$PWD/../cpp-build"
+  WERROR=1 VERBOSE=1 tools/cpp_build/build_caffe2.sh "$PWD/../cpp-build"
 fi
2 changes: 2 additions & 0 deletions .jenkins/pytorch/disabled-configs.txt
@@ -3,3 +3,5 @@
 # fail. You can use this to temporarily reserve a test name to
 # turn on CI side before PyTorch repository supports it. This
 # file has the same format as .jenkins/enabled-configs.txt
+
+py2-clang3.8-rocmnightly-ubuntu16.04-test
4 changes: 2 additions & 2 deletions .jenkins/pytorch/macos-test.sh
@@ -57,15 +57,15 @@ test_cpp_api() {
   CPP_BUILD="$PWD/../cpp-build"
   rm -rf $CPP_BUILD
   mkdir -p $CPP_BUILD
-  WERROR=1 VERBOSE=1 tools/cpp_build/build_all.sh "$CPP_BUILD"
+  WERROR=1 VERBOSE=1 tools/cpp_build/build_caffe2.sh "$CPP_BUILD"

   python tools/download_mnist.py --quiet -d test/cpp/api/mnist

   # Unfortunately it seems like the test can't load from miniconda3
   # without these paths being set
   export DYLD_LIBRARY_PATH="$DYLD_LIBRARY_PATH:$PWD/miniconda3/lib"
   export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$PWD/miniconda3/lib"
-  "$CPP_BUILD"/libtorch/bin/test_api
+  "$CPP_BUILD"/caffe2/bin/test_api
 }

if [ -z "${JOB_BASE_NAME}" ] || [[ "${JOB_BASE_NAME}" == *-test ]]; then
20 changes: 6 additions & 14 deletions .jenkins/pytorch/test.sh
@@ -9,11 +9,6 @@ source "$(dirname "${BASH_SOURCE[0]}")/common.sh"

 echo "Testing pytorch"

-if [[ "$BUILD_ENVIRONMENT" == *rocm* ]]; then
-  echo "Skipping ROCm tests for now"
-  exit 0
-fi
-
 # JIT C++ extensions require ninja.
 git clone https://github.com/ninja-build/ninja --quiet
 pushd ninja
@@ -49,13 +44,10 @@ if [[ "$BUILD_ENVIRONMENT" == *asan* ]]; then
   (cd test && ! get_exit_code python -c "import torch; torch._C._crash_if_aten_asan(3)")
 fi

-export ATEN_DISABLE_AVX=
-export ATEN_DISABLE_AVX2=
 if [[ "${JOB_BASE_NAME}" == *-NO_AVX-* ]]; then
-  export ATEN_DISABLE_AVX=1
-fi
-if [[ "${JOB_BASE_NAME}" == *-NO_AVX2-* ]]; then
-  export ATEN_DISABLE_AVX2=1
+  export ATEN_CPU_CAPABILITY=default
+elif [[ "${JOB_BASE_NAME}" == *-NO_AVX2-* ]]; then
+  export ATEN_CPU_CAPABILITY=avx
 fi

test_python_nn() {
@@ -104,12 +96,12 @@ test_libtorch() {
     echo "Testing libtorch"
     CPP_BUILD="$PWD/../cpp-build"
     if [[ "$BUILD_ENVIRONMENT" == *cuda* ]]; then
-      "$CPP_BUILD"/libtorch/bin/test_jit
+      "$CPP_BUILD"/caffe2/bin/test_jit
     else
-      "$CPP_BUILD"/libtorch/bin/test_jit "[cpu]"
+      "$CPP_BUILD"/caffe2/bin/test_jit "[cpu]"
     fi
     python tools/download_mnist.py --quiet -d test/cpp/api/mnist
-    OMP_NUM_THREADS=2 "$CPP_BUILD"/libtorch/bin/test_api
+    OMP_NUM_THREADS=2 "$CPP_BUILD"/caffe2/bin/test_api
   fi
 }

2 changes: 2 additions & 0 deletions CMakeLists.txt
@@ -53,6 +53,7 @@ endif()
 # Note to developers: if you add an option below, make sure you also add it to
 # cmake/Summary.cmake so that the summary prints out the option values.
 include(CMakeDependentOption)
+option(BUILD_TORCH "Build Torch" OFF)
 option(BUILD_CAFFE2 "Build Caffe2" ON)
 option(BUILD_ATEN "Build ATen" OFF)
 option(BUILD_BINARY "Build C++ binaries" ON)
@@ -214,6 +215,7 @@ if(NOT MSVC)
   set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-unused-variable")
   set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-unused-function")
   set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-unused-result")
+  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-strict-overflow")
   set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-strict-aliasing")
   set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-error=deprecated-declarations")
   # These flags are not available in GCC-4.8.5. Set only when using clang.
7 changes: 4 additions & 3 deletions CONTRIBUTING.md
@@ -72,6 +72,9 @@ For example:

 You do not need to repeatedly install after modifying python files.

+In case you want to reinstall, make sure that you uninstall pytorch first by running `pip uninstall torch`
+and `python setup.py clean`. Then you can install in `build develop` mode again.
+
 ## Unit testing

 PyTorch's testing is located under `test/`. Run the entire test suite with
@@ -146,9 +149,7 @@ working on:

 - Working on `torch/lib` and want to run your changes / rerun cmake? Run
   `python setup.py build_deps`. Note that this will rerun cmake for
-  every subdirectory in TH; if you are only working on one project,
-  consider editing `torch/lib/build_all.sh` and commenting out the
-  `build` lines of libraries you are not working on.
+  every subdirectory in TH.

 On the initial build, you can also speed things up with the environment
 variables `DEBUG` and `NO_CUDA`.
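For quick reference, the reinstall workflow documented in the CONTRIBUTING.md hunk above amounts to the following shell sequence (a sketch assembled from the commands quoted in the diff; run it from the root of your pytorch checkout):

```sh
pip uninstall torch               # uninstall the package first
python setup.py clean             # clear out stale build artifacts
python setup.py build develop     # then reinstall in develop mode
```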
12 changes: 9 additions & 3 deletions aten/CMakeLists.txt
@@ -80,14 +80,20 @@ add_subdirectory(src/TH)
 set(TH_CPU_INCLUDE
   # dense
   ${CMAKE_CURRENT_SOURCE_DIR}/src/TH
-  ${CMAKE_CURRENT_SOURCE_DIR}/src/THC
   ${CMAKE_CURRENT_BINARY_DIR}/src/TH
-  ${CMAKE_CURRENT_BINARY_DIR}/src/THC

   ${CMAKE_CURRENT_SOURCE_DIR}/src
   ${CMAKE_CURRENT_BINARY_DIR}/src
   ${CMAKE_BINARY_DIR}/aten/src)
 list(APPEND ATen_CPU_INCLUDE ${TH_CPU_INCLUDE})

+if(USE_CUDA OR USE_ROCM)
+  set(TH_CUDA_INCLUDE
+    # dense
+    ${CMAKE_CURRENT_SOURCE_DIR}/src/THC
+    ${CMAKE_CURRENT_BINARY_DIR}/src/THC)
+  list(APPEND ATen_CUDA_INCLUDE ${TH_CUDA_INCLUDE})
+endif()
+
 add_subdirectory(src/THNN)

 # Find the HIP package, set the HIP paths, load the HIP CMake.
1 change: 1 addition & 0 deletions aten/src/ATen/ATen.h
@@ -21,3 +21,4 @@
 #include "ATen/TensorOptions.h"
 #include "ATen/Layout.h"
 #include "ATen/OptionsGuard.h"
+#include "ATen/CUDAGuard.h"
3 changes: 3 additions & 0 deletions aten/src/ATen/Allocator.h
@@ -30,6 +30,9 @@ class DataPtr {
   DataPtr(void* data, void* ctx, DeleterFnPtr ctx_deleter, Device device)
       : ptr_(data, ctx, ctx_deleter), device_(device) {}
   void* operator->() const { return ptr_.get(); }
+  void clear() {
+    ptr_.clear();
+  }
   void* get() const { return ptr_.get(); }
   void* get_context() const { return ptr_.get_context(); }
   void* release_context() { return ptr_.release_context(); }
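Not part of the diff: a minimal sketch of what the new `clear()` enables, namely releasing a `DataPtr`'s memory before the object goes out of scope. The constructor signature comes from the class above; the `at::Device(at::kCPU)` spelling is an assumption of this sketch.

```cpp
#include <ATen/Allocator.h>
#include <cstdlib>

void clear_example() {
  void* data = std::malloc(16);
  // ctx doubles as the pointer to free; std::free matches DeleterFnPtr.
  at::DataPtr ptr(data, /*ctx=*/data, /*ctx_deleter=*/&std::free,
                  at::Device(at::kCPU));  // device spelling assumed
  ptr.clear();  // runs the deleter immediately; the DataPtr now holds nullptr
}
```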
110 changes: 110 additions & 0 deletions aten/src/ATen/CUDAGuard.h
@@ -0,0 +1,110 @@
#pragma once

#include <ATen/ArrayRef.h>
#include <ATen/CUDAStream.h>
#include <ATen/Context.h>
#include <ATen/DeviceGuard.h>

#include <cstddef>
#include <vector>

namespace at {

/// A variant of `DeviceGuard` that augments it with an understanding of CUDA
/// streams. This guard can not only set and reset the current CUDA device, but
/// also set and reset the current CUDA stream. It is important to note that
/// because a CUDA stream is intrinsically associated with the CUDA device to
/// which it is bound, setting the CUDA stream *also* sets the current CUDA
/// device to that of the stream.
struct CUDAGuard {
  /// Default constructor, does nothing and causes no change in the current
  /// stream or device until `set_stream` or `set_device` is called.
  CUDAGuard() = default;

  /// Sets the CUDA stream and its associated device as the current one (calls
  /// `set_stream`).
  explicit CUDAGuard(const CUDAStream& stream) {
    set_stream(stream);
  }

  /// Calls `set_device` with the given index.
  explicit CUDAGuard(int32_t device) {
    set_device(device);
  }

  CUDAGuard(const CUDAGuard&) = delete;
  CUDAGuard& operator=(const CUDAGuard&) = delete;

  /// Move-constructs this `CUDAGuard` from another `CUDAGuard`. The
  /// moved-from `CUDAGuard` is modified such that its destruction has no
  /// effect (does not reset the stream or device).
  CUDAGuard(CUDAGuard&& other) noexcept = default;

  /// Move-assigns this `CUDAGuard` from another `CUDAGuard`. The
  /// moved-from `CUDAGuard` is modified such that its destruction has no
  /// effect (does not reset the stream or device).
  CUDAGuard& operator=(CUDAGuard&& other) {
    device_guard_ = std::move(other.device_guard_);
    original_streams_ = std::move(other.original_streams_);
    other.original_streams_.clear();
    return *this;
  }

  /// Resets the CUDA stream on each device to the one that was active upon
  /// construction.
  ~CUDAGuard() {
    if (!original_streams_.empty()) {
      for (size_t device = 0; device < original_streams_.size(); ++device) {
        globalContext().uncheckedSetCurrentCUDAStreamOnDevice(
            device, original_streams_[device]);
      }
    }
  }

  /// Sets the current CUDA device to the device associated with the given
  /// stream, and then sets the current stream on that device to the one given.
  void set_stream(const CUDAStream& stream) {
    device_guard_.set_index(stream.device());
    // If we haven't stored the current stream yet, store it now.
    if (original_streams_.empty()) {
      const size_t device_count = globalContext().getNumGPUs();
      original_streams_.reserve(device_count);
      for (size_t device = 0; device < device_count; ++device) {
        original_streams_.push_back(
            globalContext().getCurrentCUDAStreamOnDevice(device));
      }
    }
    globalContext().setCurrentCUDAStreamOnDevice(
        device_guard_.last_index(), stream);
  }

  /// Sets the CUDA device to the given one.
  void set_device(int32_t device) {
    device_guard_.set_index(device);
  }

  /// Returns the CUDA streams that were active in the first call to
  /// `set_stream`. If there was no such call, the returned container is
  /// empty.
  ArrayRef<CUDAStream> original_streams() const noexcept {
    return original_streams_;
  }

  /// Returns the device that was set upon construction of the guard.
  int32_t original_device() const noexcept {
    return device_guard_.original_index();
  }

  /// Returns the last device that was set via `set_device`, if any.
  int32_t last_device() const noexcept {
    return device_guard_.last_index();
  }

 private:
  /// The guard for the current device.
  DeviceGuard device_guard_;
  /// The original streams that were active on all devices.
  std::vector<CUDAStream> original_streams_;
};

} // namespace at
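Not part of the diff: a minimal usage sketch for the new guard. The function name is hypothetical; `at::CUDAStream` comes from the `ATen/CUDAStream.h` header included above.

```cpp
#include <ATen/CUDAGuard.h>

void launch_on(const at::CUDAStream& stream) {
  // Constructing the guard switches the current device to stream.device()
  // and makes `stream` the current stream on that device.
  at::CUDAGuard guard(stream);
  // ... enqueue work on the now-current device and stream ...
}  // destructor restores the streams that were current on every device
```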