Migrate CPU Backend to allocate memory at runtime. #1943


Merged — 1 commit merged into pytorch:master on Nov 2, 2018

Conversation

@gcatron (Contributor) commented Oct 31, 2018

Description: Migrate CPU Backend to allocate and initialize memory at runtime. Added a new runtimeInfo object that contains information needed for runtime memory allocation.
Testing: ninja test passes, AOT bundles still work, testing with run.sh --iterations=20 -cpu -time showed minimal performance difference.
Documentation: N/A
Progress on #1904
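
To make the description concrete: the runtimeInfo object (named RuntimeBundle in the squashed commit) is the handoff from compile time to runtime. Below is a minimal C++ sketch of what such a bundle might carry, assuming fields for the three memory regions the CPU backend distinguishes (constant weights, mutable weights, activations) plus a symbol table of offsets. All names and fields here are illustrative assumptions, not the PR's actual API.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical per-symbol placement info computed at compile time.
struct RuntimeSymbolInfo {
  size_t size;   // bytes occupied by this tensor
  size_t offset; // offset into its memory region
};

// Sketch of a bundle passed from compile time to runtime. The real
// RuntimeBundle in this PR may differ; names here are assumptions.
struct RuntimeBundleSketch {
  std::unordered_map<std::string, RuntimeSymbolInfo> symbolTable;
  size_t constantWeightsSize{0}; // allocated once, initialized from the module
  size_t mutableWeightsSize{0};  // inputs/outputs, allocated per request
  size_t activationsSize{0};     // scratch space for intermediate tensors
};
```

With sizes and offsets recorded at compile time, the runtime only needs to allocate three contiguous regions and index into them, rather than re-deriving any layout.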

@nadavrot (Contributor) left a comment

Thank you for doing this work, @gcatron. I am excited to see the fast development on this PR. I made a few comments in the code, mainly regarding comments and style.

I asked a few questions, such as "is it possible to forward-declare a class?" If the answer is "no", feel free to ignore the question and move on.

Please make sure to run the ./run.sh tests with the --iterations=10 and -cpu flags and check that there are no noticeable performance regressions. I expect none, but we need to be sure.

Also, please make sure to squash the PR down to one or two commits, or whatever is reasonable.

@opti-mix is the expert in this part of the project, so if he approves the design, feel free to land (after fixing the style issues).

@rdzhabarov (Contributor) left a comment

I'm not very familiar with this code path :)

Question:
Do you allocate memory for weights and activations on every inference request?

I think longer term we could keep activations in thread-local memory, which would facilitate thread safety and avoid allocation/deallocation overhead; a sketch of the idea follows below.
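
As a rough illustration of that suggestion, here is a minimal C++ sketch of a thread-local activations buffer; the function name and growth policy are assumptions for illustration, not anything in Glow. Each thread reuses a single scratch allocation across requests instead of paying malloc/free per inference, and threads never contend for the buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: one reusable activations scratch buffer per thread.
static thread_local std::vector<uint8_t> activationsScratch;

// Returns a buffer of at least `requiredBytes`; grows the thread-local
// allocation on demand and reuses it on subsequent requests.
uint8_t *getActivationsScratch(size_t requiredBytes) {
  if (activationsScratch.size() < requiredBytes) {
    activationsScratch.resize(requiredBytes);
  }
  return activationsScratch.data();
}
```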

@rdzhabarov (Contributor) left a comment

How do you test this change :)?

@qcolombet (Contributor) left a comment

Looks reasonable as a first step.
I believe we want to be more aggressive on the refactoring going forward.

@rdzhabarov (Contributor) commented
Have you verified that AOT still works :)?

@rdzhabarov (Contributor) left a comment

We also have documentation about the JIT under /docs in the repo; could you check whether anything needs to be updated there?

Looking good.

@opti-mix (Contributor) left a comment

Great job! Looking good!

@gcatron force-pushed the cpu_backend_allocation_move branch from 44c1a35 to e320fe6 on November 2, 2018 at 18:30
Created new RuntimeBundle to pass information forward from compile time to runtime
@gcatron force-pushed the cpu_backend_allocation_move branch from c5a126e to d32e0e8 on November 2, 2018 at 18:33
@gcatron merged commit 4f9816a into pytorch:master on Nov 2, 2018
@nadavrot (Contributor) commented Nov 2, 2018

Congrats! Nice work.
