a little question about embedding parameters synchronization #2519

Closed
jasperzhong opened this issue Apr 21, 2021 · 1 comment

@jasperzhong

Great work! The PS design doc says embedding parameters are replicated across servers and that synchronization is needed for consistency. I have some concerns about the synchronization overhead, since embedding layers are usually huge. For example, Facebook mentioned that their production embedding tables may be terabytes in size (paper link).

@workingloong
Collaborator

> Great work! The PS design doc says embedding parameters are replicated across servers and that synchronization is needed for consistency. I have some concerns about the synchronization overhead, since embedding layers are usually huge. For example, Facebook mentioned that their production embedding tables may be terabytes in size (paper link).

Yes, synchronizing huge embedding layers does add overhead. To improve efficiency, we can adjust the staleness allowed in gradient updates. For example, the PS can drop a gradient whose version is stale.
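
To make the stale-gradient idea concrete, here is a minimal sketch of a PS shard that rejects gradients computed against a model version that is too old. It is not the project's actual implementation: `EmbeddingShard`, `max_staleness`, `pull`, and `push` are hypothetical names chosen for illustration, and the update rule is assumed to be plain SGD on only the embedding rows a gradient touches.

```python
class EmbeddingShard:
    """Toy PS shard for sparse embedding rows (hypothetical, illustration only)."""

    def __init__(self, dim, max_staleness, lr=0.1):
        self.dim = dim
        self.max_staleness = max_staleness  # assumed tunable threshold
        self.lr = lr
        self.version = 0  # incremented on every accepted update
        self.rows = {}    # embedding id -> vector, created lazily

    def _row(self, emb_id):
        # Initialize unseen ids to zeros; a real system would use a proper initializer.
        return self.rows.setdefault(emb_id, [0.0] * self.dim)

    def pull(self, ids):
        """Return copies of the requested rows plus the current version."""
        return [list(self._row(i)) for i in ids], self.version

    def push(self, ids, grads, grad_version):
        """Apply sparse gradients unless they are too stale. Returns True if applied."""
        if self.version - grad_version > self.max_staleness:
            return False  # drop the stale gradient
        for emb_id, grad in zip(ids, grads):
            row = self._row(emb_id)
            for k in range(self.dim):
                row[k] -= self.lr * grad[k]  # SGD step on the touched row only
        self.version += 1
        return True


shard = EmbeddingShard(dim=2, max_staleness=1)
_, v0 = shard.pull([7])            # worker reads rows at version 0
shard.push([7], [[1.0, 1.0]], v0)  # staleness 0: applied
shard.push([7], [[1.0, 1.0]], v0)  # staleness 1: still applied
shard.push([7], [[1.0, 1.0]], v0)  # staleness 2 > max_staleness: dropped
```

A larger `max_staleness` wastes less worker computation but lets updates based on older parameters through, so the threshold trades efficiency against consistency.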
