Description
Is your feature request related to a problem? Please describe.
Executing the Barracuda models is the biggest performance cost in my project. Using models with 3 layers of 512 nodes each, here are the Barracuda ModelRunner execution times with different models exported to the .nn file:
- 4.00 ms .nn file with "Action", "Action_probs", "Value_Estimate"
- 2.65 ms .nn file with "Action", "Action_probs"
- 2.00 ms .nn file with "Action"
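To put the three measurements in relative terms, here is a small sketch that computes how much execution time each leaner export saves compared with the full export. The timings are the ones listed above; the dictionary layout and the `saving_percent` helper are only illustrative names, not part of ML-Agents or Barracuda.

```python
# Measured ModelRunner execution times (ms), copied from the list above.
timings_ms = {
    ("Action", "Action_probs", "Value_Estimate"): 4.00,
    ("Action", "Action_probs"): 2.65,
    ("Action",): 2.00,
}

# The full export (all three entries) is the baseline.
baseline = timings_ms[("Action", "Action_probs", "Value_Estimate")]


def saving_percent(time_ms, baseline_ms=baseline):
    """Relative reduction in execution time versus the full export, in percent."""
    return round(100.0 * (baseline_ms - time_ms) / baseline_ms, 2)


savings = {outputs: saving_percent(t) for outputs, t in timings_ms.items()}

for outputs, pct in savings.items():
    print(", ".join(outputs), "->", pct, "% less time than the full export")
```

With these numbers, dropping "Value_Estimate" saves about a third of the execution time, and exporting only "Action" halves it.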
Describe the solution you'd like
When running Barracuda models in inference mode, executing leaner .nn files with fewer models in them is a large performance saving (up to half the execution time in the measurements above). Please provide an option to control what gets exported to the .nn file, so that it is possible to export the leanest possible file.
Is the "action_probs" model needed in inference mode? Looking at the code in ModelRunner, it does not appear that the "action_probs" output is ever retrieved. Is this model needed by the "action" model somehow? In my testing, the .nn files appear to execute fine without it.
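To illustrate the kind of export option I have in mind, here is a minimal sketch of pruning a graph down to the requested outputs. It assumes a toy dependency graph that I made up for illustration: the node names and edges are hypothetical and are not read from a real .nn file, and `required_nodes` and `prune` are not ML-Agents or Barracuda functions. The same walk would also answer the dependency question above: if "Action" read from "Action_probs" in the real graph, "Action_probs" would be kept automatically.

```python
# Hypothetical toy graph: each node maps to the nodes it reads from.
# These edges are invented for illustration, NOT taken from a real model.
toy_graph = {
    "obs": [],
    "hidden_1": ["obs"],
    "hidden_2": ["hidden_1"],
    "hidden_3": ["hidden_2"],
    "Action": ["hidden_3"],
    "Action_probs": ["hidden_3"],
    "Value_Estimate": ["hidden_3"],
}


def required_nodes(graph, outputs):
    """Return every node the requested outputs depend on, including themselves."""
    needed = set()
    stack = list(outputs)
    while stack:
        node = stack.pop()
        if node in needed:
            continue
        needed.add(node)
        # Walk upstream to whatever this node reads from.
        stack.extend(graph[node])
    return needed


def prune(graph, outputs):
    """Keep only the nodes needed to compute the requested outputs."""
    keep = required_nodes(graph, outputs)
    return {node: deps for node, deps in graph.items() if node in keep}


lean = prune(toy_graph, ["Action"])
print(sorted(lean))
```

In this toy graph, asking for only "Action" drops "Action_probs" and "Value_Estimate" while keeping the shared hidden layers. Whether the real exported graph has the same shape is exactly what I am asking about.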
Describe alternatives you've considered
Additional context