Skip to content

Enable LLVM and Glow assertions in Release mode #2737

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mssimpso opened this issue Apr 17, 2019 · 15 comments
Closed

Enable LLVM and Glow assertions in Release mode #2737

mssimpso opened this issue Apr 17, 2019 · 15 comments

Comments

@mssimpso
Copy link

It would be great if LLVM and Glow assertions were enabled by default in Release mode for continuous integration builds. We enable these internally and frequently hit them when merging in upstream changes.

@nadavrot
Copy link
Contributor

I like this idea. Do you have a patch that enables assertions already? Also, have you measured tthe impact on the test suite runtime? cc: @rdzhabarov

@chadrosier
Copy link
Contributor

Can't provide a patch at the moment (waiting on corporate to sign CLA), but one can be easily derived from the LLVM project. The impact on the test suite would be entirely compile-time; we've not measured it, but happy to do so.

@rdzhabarov
Copy link
Contributor

rdzhabarov commented Apr 17, 2019

Glow assertions were enabled by default in Release mode for continuous integration builds

That's already part of what we run in our "debug" circleci build.

As for LLVM it would be an interesting idea.

would be entirely compile-time;

That would require building LLVM during the ci runs (unless it's prebuild in the docker image) which could be time consuming.
I'd also like to see how much overhead would that bring. Obviously, I'd like to keep our CI time within a sane limits.

@geoffberry
Copy link
Contributor

I understand not wanting to do an LLVM build in your ci runs. We use a prebuilt LLVM release build with assertions enabled, and have workarounds for a handful of LLVM asserts in our Glow internal codebase currently.

@rdzhabarov
Copy link
Contributor

I think it would make sense to either install that into docker container or download from S3.

@geoffberry how do you run that on internal CI?

@geoffberry
Copy link
Contributor

@rdzhabarov I'm not sure what you're asking. Do you mean how do we link with the prebuilt version of LLVM that we are using, or what tests are we running in CI, or something else?

@rdzhabarov
Copy link
Contributor

I'm curious about your CI setup if any

@chadrosier
Copy link
Contributor

Our CI is configured with asserts enabled in release mode for both LLVM and Glow. We use a prebuilt version of LLVM and protobuf and don't perform any type of containerization. With each commit we run all of the unit tests along with the image-classifier application for all Caffe2 inputs and for all quantization schemes.

@chadrosier
Copy link
Contributor

Attaching results from running the Glow integrated tests with asserts on and off for Glow/LLVM.
LLVM version: 7.0.1
Glow: 71f2ec2

int_tests_asserts_off.txt
int_tests_asserts_on.txt

Also, here are a few examples of places we've identified that need fixes due to unchecked LLVM errors:

llvm::Error runErr = llvm::Error::success();

llvm::Error addErr = llvm::Error::success();

llvm::Error addErr = llvm::Error::success();

llvm::Error removeErr = llvm::Error::success();

llvm::Error runErr = llvm::Error::success();

llvm::Error runErr = llvm::Error::success();

llvm::Error runErr = llvm::Error::success();

runErr = llvm::Error::success();

IIUC, these llvm::Error values need to be checked (i.e., converted to a boolean) for the checked bit to be set in order to avoid an LLVM assertion.

We're running into other types of asserts, but the above provide good examples of what we're seeing.

@bertmaher
Copy link
Contributor

Interesting, are there Glow assertions that fail when running our "RELEASE" CI test suite?

My main hesitation in turning on Glow assertions for release runs is that they make the large tests super-duper slow (last time I looked it was about a 100x slowdown to run ResNet-50 on the interpreter, because we check every tensor dim for every FLOP in every convolution, but maybe it's better than that now).

@chadrosier
Copy link
Contributor

chadrosier commented Jun 4, 2019

I don't recall running into any Glow asserts off the top of my head (and a quick grep didn't find anything). Just having asserts enabled for LLVM in release mode would likely solve most of our problems. Assuming the Glow asserts are enabled in the debug build, I don't think it needs to be enabled in the Release build.

@chadrosier
Copy link
Contributor

New unchecked error:

auto function = backend->compileFunctions(functions, opts);

@jackm321
Copy link
Contributor

jackm321 commented Jun 5, 2019

@rosierm thanks for raising this.
I have previously built llvm with assertions on and eliminated these problems in the past but they will of course continue to reappear if we don't have testing for it.
@rdzhabarov do you know how feasible it would be to modify the llvm build we have in CI to do this?

@rdzhabarov
Copy link
Contributor

@rdzhabarov do you know how feasible it would be to modify the llvm build we have in CI to do this?

@jackm321
This is all very doable. Might be similar to how we added llvm8 docker image.
Let's sync up on this in person.

The only concern would be compilation + execution time. If that could be reasonable we should be able to make this as a CI job.

@chadrosier
Copy link
Contributor

Two new unchecked error in this test:

TEST_F(HostManagerTest, ConfigureHostManager) {

facebook-github-bot pushed a commit that referenced this issue Jun 14, 2019
Summary:
llvm::Errors should always be checked and in debug mode will assert if they aren't checked.
There a several unchecked llvm::Errors in Glow, mostly in tests due to a workaround for MSVC which doesn't like `std::promise<llvm::Error>`s
This PR should fix the existing issues but we still need to setup CI to check this on an ongoing basis.
See #2737 for more details.

Documentation:
n/a
Pull Request resolved: #3097

Differential Revision: D15828849

Pulled By: jackm321

fbshipit-source-id: fa601cbc7b4dbe8a86dcf56c293d032f4cde497b
facebook-github-bot pushed a commit that referenced this issue Sep 13, 2019
Summary:
llvm::Errors should always be checked and with an LLVM asserts build will assert if they aren't checked.
It would be nice to set up a CI to check this on an ongoing basis.
See #2737 for more details.
Pull Request resolved: #3506

Differential Revision: D17351349

Pulled By: rdzhabarov

fbshipit-source-id: 5d1a1cc9f155e6663d5d8d397539fb0d1d527301
jackm321 added a commit to jackm321/glow that referenced this issue Sep 17, 2019
Summary:
_Note for review: Most real changes are in `Error.h`, `Error.cpp`, and `ErrorTest.cpp` and the rest is mostly codemod to switch from `llvm::Error/Expected` to `glow::Error/Expected`._
Removes dependency on `llvm::Error` and ensures Errors are don't go unchecked by verifying this in debug build.

`glow::Error/Expected` are mostly drop-in replacements for `llvm::Error/Expected` but exclude some features not used in Glow such as multiple error value types, reference types contained in Expected, etc. They also have a couple of different features like `Error::empty()` which creates "pre-checked" `Error`s.

Documentation:
doxygen
fixes: pytorch#2737, pytorch#3505
Pull Request resolved: pytorch#3515

Test Plan:
added `ErrorTest` unit tests
ninja check

Differential Revision: D17399297

fbshipit-source-id: 4b0a524e2c1ba2cf4adc766a0e8196ee94e062f0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants