Description
Remove the current limitation of one Inferentia ASIC per API replica. We are currently restricted to a single ASIC because of an issue in the Neuron runtime daemon (neuron-rtd).
Motivation
This would allow models to be partitioned across multiple Inferentia ASICs.
Additional context
As reported in aws-neuron/aws-neuron-sdk#110.