diff --git a/docs/Getting-Started-with-Balance-Ball.md b/docs/Getting-Started-with-Balance-Ball.md
index dbffa34a51..02a40a59eb 100644
--- a/docs/Getting-Started-with-Balance-Ball.md
+++ b/docs/Getting-Started-with-Balance-Ball.md
@@ -84,7 +84,7 @@ The Ball3DAgent subclass defines the following methods:
   negative reward for dropping the ball. An Agent is also marked as done when
   it drops the ball so that it will reset with a new ball for the next
   simulation step.
-* agent.Heuristic() - When the `Use Heuristic` checkbox is checked in the Behavior
+* agent.Heuristic() - When the `Behavior Type` is set to `Heuristic Only` in the Behavior
   Parameters of the Agent, the Agent will use the `Heuristic()` method to generate
   the actions of the Agent. As such, the `Heuristic()` method returns an array of
   floats. In the case of the Ball 3D Agent, the `Heuristic()` method converts the
diff --git a/docs/Learning-Environment-Best-Practices.md b/docs/Learning-Environment-Best-Practices.md
index 69b0665581..72ec011056 100644
--- a/docs/Learning-Environment-Best-Practices.md
+++ b/docs/Learning-Environment-Best-Practices.md
@@ -8,8 +8,9 @@
   lessons which progressively increase in difficulty are presented to the agent
   ([learn more here](Training-Curriculum-Learning.md)).
 * When possible, it is often helpful to ensure that you can complete the task by
-  using a heuristic to control the agent. To do so, check the `Use Heuristic`
-  checkbox on the Agent and implement the `Heuristic()` method on the Agent.
+  using a heuristic to control the agent. To do so, set the `Behavior Type`
+  to `Heuristic Only` on the Agent's Behavior Parameters, and implement the
+  `Heuristic()` method on the Agent.
 * It is often helpful to make many copies of the agent, and give them the same
   `Behavior Name`. In this way the learning process can get more feedback
   information from all of these agents, which helps it train faster.
diff --git a/docs/Learning-Environment-Create-New.md b/docs/Learning-Environment-Create-New.md
index 7a1144dc60..1a5830e694 100644
--- a/docs/Learning-Environment-Create-New.md
+++ b/docs/Learning-Environment-Create-New.md
@@ -380,8 +380,8 @@
 What this code means is that the heuristic will generate an action corresponding
 to the values of the "Horizontal" and "Vertical" input axes (which correspond to
 the keyboard arrow keys).
 
-In order for the Agent to use the Heuristic, You will need to check the `Use Heuristic`
-checkbox in the `Behavior Parameters` of the RollerAgent.
+In order for the Agent to use the Heuristic, you will need to set the `Behavior Type`
+to `Heuristic Only` in the `Behavior Parameters` of the RollerAgent.
 
 Press **Play** to run the scene and use the arrow keys to move the Agent around
diff --git a/docs/Learning-Environment-Design-Agents.md b/docs/Learning-Environment-Design-Agents.md
index 30588fb3b6..f2f49bfe6b 100644
--- a/docs/Learning-Environment-Design-Agents.md
+++ b/docs/Learning-Environment-Design-Agents.md
@@ -17,10 +17,11 @@
 discover the optimal decision-making policy.
 The Policy class abstracts out the decision making logic from the Agent itself
 so that you can use the same Policy in multiple Agents. How a Policy makes its
 decisions depends on the kind of Policy it is. You can change the Policy of an
-Agent by changing its `Behavior Parameters`. If you check `Use Heuristic`, the
-Agent will use its `Heuristic()` method to make decisions which can allow you to
-control the Agent manually or write your own Policy. If the Agent has a `Model`
-file, it Policy will use the neural network `Model` to take decisions.
+Agent by changing its `Behavior Parameters`. If you set `Behavior Type` to
+`Heuristic Only`, the Agent will use its `Heuristic()` method to make decisions
+which can allow you to control the Agent manually or write your own Policy. If
+the Agent has a `Model` file, its Policy will use the neural network `Model` to
+make decisions.
 
 ## Decisions
diff --git a/docs/Learning-Environment-Examples.md b/docs/Learning-Environment-Examples.md
index a71e9eaec4..9a4ce14952 100644
--- a/docs/Learning-Environment-Examples.md
+++ b/docs/Learning-Environment-Examples.md
@@ -106,8 +106,8 @@ If you would like to contribute environments, please see our
 * Goal: The agents must hit the ball so that the opponent cannot hit a valid
   return.
 * Agents: The environment contains two agents with the same Behavior Parameters.
-  After training you can check the `Use Heuristic` checkbox on one of the Agents
-  to play against your trained model.
+  After training you can set the `Behavior Type` to `Heuristic Only` on one of
+  the Agents' Behavior Parameters to play against your trained model.
 * Agent Reward Function (independent):
   * +1.0 To the agent that wins the point. An agent wins a point by preventing
     the opponent from hitting a valid return.
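
Every hunk in this diff steers the reader toward the same `Agent.Heuristic()` override, so a minimal sketch of one may help for orientation. It follows only what the touched docs themselves state (`Heuristic()` returns an array of floats, driven here by the "Horizontal" and "Vertical" input axes from Learning-Environment-Create-New.md); the `MLAgents` namespace, the `RollerAgent` class name, and the two-action layout are assumptions about the package version this diff targets, not contents of the diff:

```csharp
using MLAgents;
using UnityEngine;

public class RollerAgent : Agent
{
    // Observations, rewards, and action handling omitted for brevity.

    // Used when `Behavior Type` is set to `Heuristic Only`: returns the action
    // array that a trained model would otherwise produce. Here the keyboard
    // arrow keys (the "Horizontal"/"Vertical" axes) drive two continuous actions.
    public override float[] Heuristic()
    {
        var action = new float[2];
        action[0] = Input.GetAxis("Horizontal");
        action[1] = Input.GetAxis("Vertical");
        return action;
    }
}
```

Driving an Agent this way is a cheap sanity check that the task is completable and the action plumbing works before committing to a training run, which is the practice Learning-Environment-Best-Practices.md recommends above.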