Great work! The PS design doc says embedding parameters are replicated across servers and that synchronization is needed for consistency. I have some concerns about the synchronization overhead, since embedding layers are usually huge. For example, Facebook mentioned that their production embedding tables may be terabytes in size (paper link).
Yes, synchronizing huge embedding layers does add overhead. To improve efficiency, we can adjust the staleness allowed in gradient updates. For example, the PS can drop a gradient whose version is stale.
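To make the staleness idea concrete, here is a minimal sketch of a PS shard that drops stale gradients. It is only an illustration under assumptions, not the project's actual implementation: the `EmbeddingShard` class, the `pull`/`push` methods, the `max_staleness` threshold, and the plain SGD update are all hypothetical names and choices made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class EmbeddingShard:
    """Hypothetical in-memory PS shard holding embedding rows keyed by id."""

    dim: int
    lr: float = 0.1
    max_staleness: int = 2  # assumed threshold; tune per workload
    version: int = 0        # incremented on every accepted update
    rows: dict = field(default_factory=dict)

    def pull(self, ids):
        """Return the current version and copies of the requested rows."""
        snapshot = {
            i: list(self.rows.setdefault(i, [0.0] * self.dim)) for i in ids
        }
        return self.version, snapshot

    def push(self, grad_version, grads):
        """Apply sparse gradients unless they are too stale.

        grad_version is the shard version the worker saw when it pulled.
        Returns True if the update was applied, False if it was dropped.
        """
        if self.version - grad_version > self.max_staleness:
            return False  # stale gradient: drop it instead of applying
        for i, g in grads.items():
            row = self.rows.setdefault(i, [0.0] * self.dim)
            for k in range(self.dim):
                row[k] -= self.lr * g[k]  # plain SGD step (assumption)
        self.version += 1
        return True
```

A worker would call `pull` to get rows plus the version, compute gradients, and send them back with that version. Only the rows touched by the batch are sent, so the drop check costs one integer comparison per push.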