
Integrate from upstream #254


Merged
22 commits merged on Oct 9, 2018
Commits
83b4dc6  Remove Type.tensor(). (#12360) (gchanan, Oct 8, 2018)
00aedfc  constant pooling pass (#12222) (Oct 8, 2018)
f1f521f  make bench_gen.py work for 3d conv (#12433) (jspark1105, Oct 8, 2018)
e7653c7  New chaining/partitioning algorithm for async_scheduling for inferenc… (Oct 8, 2018)
7103d0d  Add python bindings (#12253) (bwasti, Oct 8, 2018)
cf2b88f  Induce edges on subgraphs (#12255) (bwasti, Oct 8, 2018)
d181e0f  Add move{Node,Edge,Subgraph} for Graph move-like semantics (#12303) (bwasti, Oct 8, 2018)
a55b9f7  Implement 3D and 4D parallelization in Caffe2 thread pool (#12455) (Oct 8, 2018)
d4b4c1f  Add missing url links to README.md file. (#12440) (marcemq, Oct 8, 2018)
5bac465  Fix TestJit.test_alexnet expect file (#12458) (Oct 8, 2018)
d0e1dca  fix expect file (#12465) (Oct 8, 2018)
c3987a0  Fix issues with ATenOp handling methods where `self` is not the first… (Oct 8, 2018)
c5d7494  Use open-source NCCL2 in PyTorch (#12359) (teng-li, Oct 8, 2018)
dd4b9b0  Back out "Back out "[caffe2] Use custom CPU thread pool in async_sche… (Oct 8, 2018)
1ee6fc4  Delete noexcept on the move constructor of OrderedDict (#12369) (ezyang, Oct 8, 2018)
5a0d2c7  Add clamping functionality to stats_put_ops (BlueberryDS, Oct 8, 2018)
cdead5a  Enable CircleCI for Linux jobs (#12389) (Oct 9, 2018)
d400502  Fix a bunch of warnings in TestNN (ssnl, Oct 9, 2018)
8414094  cleanup controlflow (#12235) (bwasti, Oct 9, 2018)
c959be9  Create named functions construct (#12237) (bwasti, Oct 9, 2018)
1a0d82e  fix import for script module with control flow blocks (#12351) (Oct 9, 2018)
ca71c11  Merge remote-tracking branch 'rocm_upstream/upstream' into ifu (iotamudelta, Oct 9, 2018)
693 changes: 377 additions & 316 deletions .circleci/config.yml

Large diffs are not rendered by default.

3 changes: 3 additions & 0 deletions .gitmodules
@@ -76,3 +76,6 @@
[submodule "third_party/ideep"]
path = third_party/ideep
url = https://github.com/intel/ideep
[submodule "third_party/nccl/nccl"]
path = third_party/nccl/nccl
url = https://github.com/NVIDIA/nccl
6 changes: 3 additions & 3 deletions README.md
@@ -88,7 +88,7 @@ You get the best of speed and flexibility for your crazy research.

PyTorch is not a Python binding into a monolithic C++ framework.
It is built to be deeply integrated into Python.
You can use it naturally like you would use NumPy / SciPy / scikit-learn etc.
You can use it naturally like you would use [NumPy](http://www.numpy.org/) / [SciPy](https://www.scipy.org/) / [scikit-learn](http://scikit-learn.org) etc.
You can write your new neural network layers in Python itself, using your favorite libraries
and use packages such as Cython and Numba.
Our goal is to not reinvent the wheel where appropriate.
@@ -104,7 +104,7 @@ We hope you never spend hours debugging your code because of bad stack traces or
### Fast and Lean

PyTorch has minimal framework overhead. We integrate acceleration libraries
such as Intel MKL and NVIDIA (cuDNN, NCCL) to maximize speed.
such as [Intel MKL](https://software.intel.com/mkl) and NVIDIA (cuDNN, NCCL) to maximize speed.
At the core, its CPU and GPU Tensor and neural network backends
(TH, THC, THNN, THCUNN) are mature and have been tested for years.

@@ -226,7 +226,7 @@ should increase shared memory size either with `--ipc=host` or `--shm-size` comm

### Building the Documentation

To build documentation in various formats, you will need Sphinx and the
To build documentation in various formats, you will need [Sphinx](http://www.sphinx-doc.org) and the
readthedocs theme.

```
2 changes: 1 addition & 1 deletion aten/src/ATen/function_wrapper.py
@@ -780,7 +780,7 @@ def emit_nn_body(option):
# _out variants must create buffers and insert them in the
# arguments list between output and input arguments
for buffer in option['buffers']:
body.append('Tensor {} = tensor();'.format(buffer['name']))
body.append('Tensor {} = at::empty({{0}}, this->options());'.format(buffer['name']))
actuals = [arg['name'] for arg in option['arguments'] if arg.get('output')]
actuals += [buffer['name'] for buffer in option['buffers']]
actuals += [arg['name'] for arg in option['arguments'] if not arg.get('output')]
3 changes: 0 additions & 3 deletions aten/src/ATen/native/native_functions.yaml
@@ -1901,9 +1901,6 @@
SparseCPU: new_with_size_sparse
SparseCUDA: new_with_size_sparse

- func: tensor(Type dtype) -> Tensor
variants: []

- func: tensor(Type dtype, IntList size) -> Tensor
variants: []

10 changes: 4 additions & 6 deletions binaries/bench_gen/bench_gen.py
@@ -6,6 +6,7 @@
from __future__ import unicode_literals

import argparse
import ast

from caffe2.python.model_helper import ModelHelper
from caffe2.python.predictor import mobile_exporter
@@ -15,18 +16,15 @@
def parse_kwarg(kwarg_str):
key, value = kwarg_str.split('=')
try:
value = int(value)
value = ast.literal_eval(value)
except ValueError:
try:
value = float(value)
except ValueError:
pass
pass
return key, value


def main(args):
# User defined keyword arguments
kwargs = {"order": "NCHW"}
kwargs = {"order": "NCHW", "use_cudnn": False}
kwargs.update(dict(args.kwargs))

model = ModelHelper(name=args.benchmark_name)
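The `ast.literal_eval` change in this file lets benchmark kwargs carry any Python literal (booleans, lists, nested values) rather than only ints and floats, while still falling back to the raw string for bare identifiers. A standalone sketch of the new `parse_kwarg` behavior:

```python
import ast

def parse_kwarg(kwarg_str):
    # Split "key=value" and parse the value as a Python literal.
    # ast.literal_eval handles ints, floats, booleans, strings, and lists,
    # which the old int()/float() fallback chain could not.
    key, value = kwarg_str.split('=')
    try:
        value = ast.literal_eval(value)
    except ValueError:
        pass  # keep the raw string if it is not a valid literal
    return key, value

print(parse_kwarg("kernel=3"))         # ('kernel', 3)
print(parse_kwarg("use_cudnn=False"))  # ('use_cudnn', False)
print(parse_kwarg("order=NCHW"))       # ('order', 'NCHW')
```

Note that a bare string like `NCHW` makes `literal_eval` raise `ValueError`, so it passes through unchanged, exactly as the old code's final `pass` did.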
42 changes: 42 additions & 0 deletions c10/test/registry_test.cpp
@@ -13,6 +13,7 @@ class Foo {
explicit Foo(int x) {
// LOG(INFO) << "Foo " << x;
}
virtual ~Foo() {}
};

C10_DECLARE_REGISTRY(FooRegistry, Foo, int);
@@ -46,4 +47,45 @@ TEST(RegistryTest, ReturnNullOnNonExistingCreator) {
EXPECT_EQ(FooRegistry()->Create("Non-existing bar", 1), nullptr);
}

// C10_REGISTER_CLASS_WITH_PRIORITY defines static variable
void RegisterFooDefault() {
C10_REGISTER_CLASS_WITH_PRIORITY(
FooRegistry, FooWithPriority, c10::REGISTRY_DEFAULT, Foo);
}

void RegisterFooDefaultAgain() {
C10_REGISTER_CLASS_WITH_PRIORITY(
FooRegistry, FooWithPriority, c10::REGISTRY_DEFAULT, Foo);
}

void RegisterFooBarFallback() {
C10_REGISTER_CLASS_WITH_PRIORITY(
FooRegistry, FooWithPriority, c10::REGISTRY_FALLBACK, Bar);
}

void RegisterFooBarPreferred() {
C10_REGISTER_CLASS_WITH_PRIORITY(
FooRegistry, FooWithPriority, c10::REGISTRY_PREFERRED, Bar);
}

TEST(RegistryTest, RegistryPriorities) {
FooRegistry()->SetTerminate(false);
RegisterFooDefault();

// throws because Foo is already registered with default priority
EXPECT_THROW(RegisterFooDefaultAgain(), std::runtime_error);

#ifdef __GXX_RTTI
// not going to register Bar because Foo is registered with Default priority
RegisterFooBarFallback();
std::unique_ptr<Foo> bar1(FooRegistry()->Create("FooWithPriority", 1));
EXPECT_EQ(dynamic_cast<Bar*>(bar1.get()), nullptr);

// will register Bar because of higher priority
RegisterFooBarPreferred();
std::unique_ptr<Foo> bar2(FooRegistry()->Create("FooWithPriority", 1));
EXPECT_NE(dynamic_cast<Bar*>(bar2.get()), nullptr);
#endif
}

} // namespace c10_test
96 changes: 83 additions & 13 deletions c10/util/Registry.h
@@ -24,15 +24,21 @@
namespace c10 {

template <typename KeyType>
inline void PrintOffendingKey(const KeyType& /*key*/) {
printf("[key type printing not supported]\n");
inline std::string KeyStrRepr(const KeyType& /*key*/) {
return "[key type printing not supported]";
}

template <>
inline void PrintOffendingKey(const std::string& key) {
printf("Offending key: %s.\n", key.c_str());
inline std::string KeyStrRepr(const std::string& key) {
return key;
}

enum RegistryPriority {
REGISTRY_FALLBACK = 1,
REGISTRY_DEFAULT = 2,
REGISTRY_PREFERRED = 3,
};

/**
* @brief A template class that allows one to register classes by keys.
*
@@ -48,9 +54,12 @@ class Registry {
public:
typedef std::function<ObjectPtrType(Args...)> Creator;

Registry() : registry_() {}
Registry() : registry_(), priority_(), terminate_(true) {}

void Register(const SrcType& key, Creator creator) {
void Register(
const SrcType& key,
Creator creator,
const RegistryPriority priority = REGISTRY_DEFAULT) {
std::lock_guard<std::mutex> lock(register_mutex_);
// The if statement below is essentially the same as the following line:
// CHECK_EQ(registry_.count(key), 0) << "Key " << key
@@ -59,18 +68,40 @@
// carried out at static initialization time, we do not want to have an
// explicit dependency on glog's initialization function.
if (registry_.count(key) != 0) {
printf("Key already registered.\n");
PrintOffendingKey(key);
std::exit(1);
auto cur_priority = priority_[key];
if (priority > cur_priority) {
std::string warn_msg =
"Overwriting already registered item for key " + KeyStrRepr(key);
fprintf(stderr, "%s\n", warn_msg.c_str());
registry_[key] = creator;
priority_[key] = priority;
} else if (priority == cur_priority) {
std::string err_msg =
"Key already registered with the same priority: " + KeyStrRepr(key);
fprintf(stderr, "%s\n", err_msg.c_str());
if (terminate_) {
std::exit(1);
} else {
throw std::runtime_error(err_msg);
}
} else {
std::string warn_msg =
"Higher priority item already registered, skipping registration of " +
KeyStrRepr(key);
fprintf(stderr, "%s\n", warn_msg.c_str());
}
} else {
registry_[key] = creator;
priority_[key] = priority;
}
registry_[key] = creator;
}

void Register(
const SrcType& key,
Creator creator,
const std::string& help_msg) {
Register(key, creator);
const std::string& help_msg,
const RegistryPriority priority = REGISTRY_DEFAULT) {
Register(key, creator, priority);
help_message_[key] = help_msg;
}

@@ -109,8 +140,16 @@ class Registry {
return it->second.c_str();
}

// Used for testing, if terminate is unset, Registry throws instead of
// calling std::exit
void SetTerminate(bool terminate) {
terminate_ = terminate;
}

private:
std::unordered_map<SrcType, Creator> registry_;
std::unordered_map<SrcType, RegistryPriority> priority_;
bool terminate_;
std::unordered_map<SrcType, std::string> help_message_;
std::mutex register_mutex_;

@@ -120,14 +159,23 @@
template <class SrcType, class ObjectPtrType, class... Args>
class Registerer {
public:
Registerer(
explicit Registerer(
const SrcType& key,
Registry<SrcType, ObjectPtrType, Args...>* registry,
typename Registry<SrcType, ObjectPtrType, Args...>::Creator creator,
const std::string& help_msg = "") {
registry->Register(key, creator, help_msg);
}

explicit Registerer(
const SrcType& key,
const RegistryPriority priority,
Registry<SrcType, ObjectPtrType, Args...>* registry,
typename Registry<SrcType, ObjectPtrType, Args...>::Creator creator,
const std::string& help_msg = "") {
registry->Register(key, creator, help_msg, priority);
}

template <class DerivedType>
static ObjectPtrType DefaultCreator(Args... args) {
return ObjectPtrType(new DerivedType(args...));
@@ -187,13 +235,27 @@ class Registerer {
static Registerer##RegistryName C10_ANONYMOUS_VARIABLE(g_##RegistryName)( \
key, RegistryName(), ##__VA_ARGS__);

#define C10_REGISTER_TYPED_CREATOR_WITH_PRIORITY( \
RegistryName, key, priority, ...) \
static Registerer##RegistryName C10_ANONYMOUS_VARIABLE(g_##RegistryName)( \
key, priority, RegistryName(), ##__VA_ARGS__);

#define C10_REGISTER_TYPED_CLASS(RegistryName, key, ...) \
static Registerer##RegistryName C10_ANONYMOUS_VARIABLE(g_##RegistryName)( \
key, \
RegistryName(), \
Registerer##RegistryName::DefaultCreator<__VA_ARGS__>, \
::c10::demangle_type<__VA_ARGS__>());

#define C10_REGISTER_TYPED_CLASS_WITH_PRIORITY( \
RegistryName, key, priority, ...) \
static Registerer##RegistryName C10_ANONYMOUS_VARIABLE(g_##RegistryName)( \
key, \
priority, \
RegistryName(), \
Registerer##RegistryName::DefaultCreator<__VA_ARGS__>, \
::c10::demangle_type<__VA_ARGS__>());

// C10_DECLARE_REGISTRY and C10_DEFINE_REGISTRY are hard-wired to use
// std::string as the key type, because that is the most commonly used cases.
#define C10_DECLARE_REGISTRY(RegistryName, ObjectType, ...) \
@@ -218,9 +280,17 @@
#define C10_REGISTER_CREATOR(RegistryName, key, ...) \
C10_REGISTER_TYPED_CREATOR(RegistryName, #key, __VA_ARGS__)

#define C10_REGISTER_CREATOR_WITH_PRIORITY(RegistryName, key, priority, ...) \
C10_REGISTER_TYPED_CREATOR_WITH_PRIORITY( \
RegistryName, #key, priority, __VA_ARGS__)

#define C10_REGISTER_CLASS(RegistryName, key, ...) \
C10_REGISTER_TYPED_CLASS(RegistryName, #key, __VA_ARGS__)

#define C10_REGISTER_CLASS_WITH_PRIORITY(RegistryName, key, priority, ...) \
C10_REGISTER_TYPED_CLASS_WITH_PRIORITY( \
RegistryName, #key, priority, __VA_ARGS__)

} // namespace c10

#endif // C10_UTIL_REGISTRY_H_
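The priority rules this diff adds to `Registry::Register` can be summarized on their own: a higher-priority registration overwrites the existing creator, an equal priority is an error, and a lower priority is skipped with a warning. The following is a minimal Python model of those semantics only, not the C10 implementation; the class and method names here are illustrative:

```python
import sys
import threading

REGISTRY_FALLBACK, REGISTRY_DEFAULT, REGISTRY_PREFERRED = 1, 2, 3

class Registry:
    def __init__(self):
        self._creators = {}
        self._priorities = {}
        self._lock = threading.Lock()  # registrations may race at startup

    def register(self, key, creator, priority=REGISTRY_DEFAULT):
        with self._lock:
            if key not in self._creators:
                self._creators[key] = creator
                self._priorities[key] = priority
            elif priority > self._priorities[key]:
                # Higher priority wins: overwrite the existing creator.
                print("Overwriting already registered item for key " + key,
                      file=sys.stderr)
                self._creators[key] = creator
                self._priorities[key] = priority
            elif priority == self._priorities[key]:
                # Equal priority is a programming error.
                raise RuntimeError(
                    "Key already registered with the same priority: " + key)
            else:
                # Lower priority: keep the existing creator, skip this one.
                print("Higher priority item already registered, skipping " + key,
                      file=sys.stderr)

    def create(self, key, *args):
        creator = self._creators.get(key)
        return creator(*args) if creator is not None else None
```

This mirrors the flow exercised by `RegistryTest.RegistryPriorities` above: after a default-priority registration, a fallback registration for the same key is a no-op, while a preferred registration replaces it.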
13 changes: 4 additions & 9 deletions caffe2/contrib/aten/gen_op.py
@@ -237,12 +237,7 @@ def find_factory_methods(decls):
}
defined_inferred_type = False

if 'Tensor' in o['method_of']:
# make sure 'self' is the first argument. currently Declarations.yaml
# does not always do this. Instead it keeps the argument list the same order
# as the Type method.
o['arguments'] = self_as_first_argument(o['arguments'])
elif 'namespace' not in o['method_of']:
if 'namespace' not in o['method_of'] and 'Tensor' not in o['method_of']:
# methods on type like 'ones' or 'zeros' always take a
# string attribute that is translated into the at::Type object
# e.g. "Float" is at::kFloat
@@ -289,11 +284,11 @@ def find_factory_methods(decls):
assignment = CT(t).substitute(env, offset=i, output=get_output(o, i))
env['assignments'].append(assignment)

if 'Tensor' in o['method_of']:
if 'namespace' in o['method_of']:
env['invocation'] = CT("at::${name}(${arguments})").substitute(env)
elif 'Tensor' in o['method_of']:
env['invocation'] = "self.{}({})".format(
o['name'], ', '.join(env['arguments'][1:]))
elif 'namespace' in o['method_of']:
env['invocation'] = CT("at::${name}(${arguments})").substitute(env)
else:
assert('Type' in o['method_of'])
env['invocation'] = CT(
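The reordering in gen_op.py makes the generated ATenOp prefer the `at::` free-function form whenever an op exists in the namespace, instead of the Tensor-method form that assumed `self` comes first in the argument list. A rough Python sketch of the new dispatch order (simplified: the real generator builds C++ strings from templates, and the `Type` branch is reduced to an assertion here):

```python
def build_invocation(op):
    # op: {"name": ..., "method_of": [...], "arguments": [arg names in order]}
    args = ", ".join(op["arguments"])
    if "namespace" in op["method_of"]:
        # Checked first now: correct even when `self` is not the first argument.
        return "at::{}({})".format(op["name"], args)
    elif "Tensor" in op["method_of"]:
        # Method form: only valid when the first argument really is `self`.
        return "self.{}({})".format(op["name"], ", ".join(op["arguments"][1:]))
    raise AssertionError("expected a Type method")
```

For an op available both ways, the old code emitted `self.add(other)` after reshuffling arguments; the new order emits `at::add(self, other)` and leaves the argument list untouched.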
2 changes: 1 addition & 1 deletion caffe2/core/hip/net_async_hip_thread_pool_hip.cc
@@ -23,7 +23,7 @@ C10_DEFINE_int(

namespace caffe2 {

std::shared_ptr<TaskThreadPool>
std::shared_ptr<TaskThreadPoolBase>
GetAsyncNetHIPThreadPool(int hip_gpu_id, int pool_size, bool create_new) {
// For GPU, use per device thread pools of predefined constant size
if (pool_size != c10::FLAGS_caffe2_threads_per_hip_gpu) {
2 changes: 1 addition & 1 deletion caffe2/core/net.cc
@@ -173,7 +173,7 @@ unique_ptr<NetBase> CreateNet(
return net;
}

TaskThreadPool* ExecutorHelper::GetPool(
TaskThreadPoolBase* ExecutorHelper::GetPool(
const DeviceOption& /* unused */) const {
CAFFE_THROW("Not implemented");
}
2 changes: 1 addition & 1 deletion caffe2/core/net.h
@@ -130,7 +130,7 @@ class CAFFE2_API NetBase : public Observable<NetBase> {
class CAFFE2_API ExecutorHelper {
public:
ExecutorHelper() {}
virtual TaskThreadPool* GetPool(const DeviceOption& option) const;
virtual TaskThreadPoolBase* GetPool(const DeviceOption& option) const;
virtual ~ExecutorHelper() {}
};
