
Commit c52161a

Merge pull request #382 from apphp/pr-381-fix-comments
Removed unneeded comment
2 parents: 9e4a6b2 + ca0612d

29 files changed: +5 additions, -147 deletions

docs/neural-network/activation-functions/elu.md

Lines changed: 0 additions & 3 deletions
@@ -16,9 +16,6 @@ $$
|---|---|---|---|---|
| 1 | alpha | 1.0 | float | The value at which leakage will begin to saturate. Ex. alpha = 1.0 means that the output will never be less than -1.0 when inactivated. |

-## Size and Performance
-ELU is a simple function and is well-suited for deployment on resource-constrained devices or when working with large neural networks.
-
## Plots
<img src="../../images/activation-functions/elu.png" alt="ELU Function" width="500" height="auto">
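For reference, the behaviour of the alpha parameter in the table above can be sketched in a few lines of plain Python (a hypothetical helper, not this library's implementation):

```python
import math

def elu(x: float, alpha: float = 1.0) -> float:
    # Identity for positive inputs; smooth saturation toward -alpha for negative inputs.
    return x if x > 0.0 else alpha * (math.exp(x) - 1.0)

# With alpha = 1.0 the output never drops below -1.0, as the parameter description states.
print(elu(2.0), elu(-5.0))  # 2.0, ~-0.9933
```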

docs/neural-network/activation-functions/gelu.md

Lines changed: 0 additions & 3 deletions
@@ -10,9 +10,6 @@ $$
## Parameters
This activation function does not have any parameters.

-## Size and Performance
-GELU is computationally more expensive than simpler activation functions like ReLU due to its use of hyperbolic tangent and exponential calculations. The implementation uses an approximation formula to improve performance, but it still requires more computational resources. Despite this cost, GELU has gained popularity in transformer architectures and other deep learning models due to its favorable properties for training deep networks.
-
## Plots
<img src="../../images/activation-functions/gelu.png" alt="GELU Function" width="500" height="auto">
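The approximation mentioned in the removed paragraph is commonly the tanh-based formula; a minimal plain-Python sketch, assuming that variant (which may differ from this library's actual code):

```python
import math

def gelu_approx(x: float) -> float:
    # Tanh approximation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

print(gelu_approx(1.0))  # ~0.8412
```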

docs/neural-network/activation-functions/hard-sigmoid.md

Lines changed: 0 additions & 3 deletions
@@ -10,9 +10,6 @@ $$
## Parameters
This activation function does not have any parameters.

-## Size and Performance
-Hard Sigmoid has a minimal memory footprint compared to the standard Sigmoid function, as it uses simple arithmetic operations (multiplication, addition) and comparisons instead of expensive exponential calculations. This makes it particularly well-suited for mobile and embedded applications or when computational resources are limited.
-
## Plots
<img src="../../images/activation-functions/hard-sigmoid.png" alt="Hard Sigmoid Function" width="500" height="auto">
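A minimal sketch of one common Hard Sigmoid formulation (slope 0.2, offset 0.5; the exact constants this library uses are not shown in the diff):

```python
def hard_sigmoid(x: float) -> float:
    # Piecewise-linear stand-in for the sigmoid: a multiply, an add, and a clamp,
    # with no exponential evaluation.
    return max(0.0, min(1.0, 0.2 * x + 0.5))

print(hard_sigmoid(-3.0), hard_sigmoid(0.0), hard_sigmoid(3.0))  # 0.0, 0.5, 1.0
```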

docs/neural-network/activation-functions/hard-silu.md

Lines changed: 0 additions & 3 deletions
@@ -12,9 +12,6 @@ $$
## Parameters
This activation function does not have any parameters.

-## Size and Performance
-Hard SiLU is designed to be computationally efficient compared to the standard SiLU (Swish) activation function. By using the piecewise linear Hard Sigmoid approximation instead of the standard Sigmoid function, it reduces the computational complexity while maintaining similar functional properties. This makes it particularly suitable for deployment on resource-constrained devices or when working with large neural networks.
-
## Plots
<img src="../../images/activation-functions/hard-silu.png" alt="Hard SiLU Function" width="500" height="auto">
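The relationship described above (SiLU gated by the piecewise-linear Hard Sigmoid) can be sketched as follows; the slope and offset are one common choice and may not match this library exactly:

```python
def hard_sigmoid(x: float) -> float:
    # Cheap piecewise-linear approximation of the sigmoid gate.
    return max(0.0, min(1.0, 0.2 * x + 0.5))

def hard_silu(x: float) -> float:
    # Hard SiLU / Hard Swish: the input scaled by its Hard Sigmoid gate.
    return x * hard_sigmoid(x)

print(hard_silu(-4.0), hard_silu(1.0))  # 0.0, 0.7
```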

docs/neural-network/activation-functions/hyperbolic-tangent.md

Lines changed: 0 additions & 3 deletions
@@ -10,9 +10,6 @@ $$
## Parameters
This activation function does not have any parameters.

-## Size and Performance
-Hyperbolic Tangent requires more computational resources compared to simpler activation functions like ReLU due to its exponential calculations. While not as computationally efficient as piecewise linear functions, it provides important zero-centered outputs that can be critical for certain network architectures, particularly in recurrent neural networks where gradient flow is important.
-
## Plots
<img src="../../images/activation-functions/hyperbolic-tangent.png" alt="Hyperbolic Tangent Function" width="500" height="auto">
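The exponential cost mentioned in the removed paragraph comes from the definition of tanh itself; an illustrative plain-Python sketch:

```python
import math

def tanh_activation(x: float) -> float:
    # tanh(x) = (e^x - e^-x) / (e^x + e^-x): zero-centered output in (-1, 1),
    # but each call involves exponential evaluation.
    return math.tanh(x)

print(tanh_activation(0.0), tanh_activation(2.0))  # 0.0, ~0.964
```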

docs/neural-network/activation-functions/leaky-relu.md

Lines changed: 0 additions & 3 deletions
@@ -16,9 +16,6 @@ $$
|---|---|---|---|---|
| 1 | leakage | 0.1 | float | The amount of leakage as a proportion of the input value to allow to pass through when not inactivated. |

-## Size and Performance
-Leaky ReLU is computationally efficient, requiring only simple comparison operations and multiplication. It has a minimal memory footprint and executes quickly compared to more complex activation functions that use exponential or hyperbolic calculations. The leakage parameter allows for a small gradient when the unit is not active, which helps prevent the "dying ReLU" problem while maintaining the computational efficiency of the standard ReLU function.
-
## Plots
<img src="../../images/activation-functions/leaky-relu.png" alt="Leaky ReLU Function" width="500" height="auto">
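The leakage parameter in the table above can be illustrated with a short plain-Python sketch (a hypothetical helper, not the library's code):

```python
def leaky_relu(x: float, leakage: float = 0.1) -> float:
    # Pass positive inputs through unchanged; scale negative inputs by the
    # leakage factor so a small gradient survives (avoiding "dying ReLU").
    return x if x > 0.0 else leakage * x

print(leaky_relu(3.0), leaky_relu(-3.0))  # 3.0, -0.3
```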

docs/neural-network/activation-functions/relu.md

Lines changed: 0 additions & 3 deletions
@@ -14,9 +14,6 @@ $$
## Parameters
This activation function does not have any parameters.

-## Size and Performance
-ReLU is one of the most computationally efficient activation functions, requiring only a simple comparison operation and conditional assignment. It has minimal memory requirements and executes very quickly compared to activation functions that use exponential or hyperbolic calculations. This efficiency makes ReLU particularly well-suited for deep networks with many layers, where the computational savings compound significantly.
-
## Plots
<img src="../../images/activation-functions/relu.png" alt="ReLU Function" width="500" height="auto">
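As the removed paragraph notes, ReLU reduces to a single comparison; a minimal sketch:

```python
def relu(x: float) -> float:
    # max(0, x): one comparison and a conditional assignment, no exponentials.
    return x if x > 0.0 else 0.0

print(relu(-2.0), relu(2.0))  # 0.0, 2.0
```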

docs/neural-network/activation-functions/relu6.md

Lines changed: 0 additions & 3 deletions
@@ -10,9 +10,6 @@ $$
## Parameters
This activation function does not have any parameters.

-## Size and Performance
-ReLU6 maintains the computational efficiency of standard ReLU while adding an upper bound check. It requires only simple comparison operations and conditional assignments. The additional upper bound check adds minimal computational overhead compared to standard ReLU, while providing benefits for quantization and numerical stability. This makes ReLU6 particularly well-suited for mobile and embedded applications where model size and computational efficiency are critical.
-
## Plots
<img src="../../images/activation-functions/relu6.png" alt="ReLU6 Function" width="500" height="auto">
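ReLU6 adds only an upper clamp to standard ReLU; a minimal plain-Python sketch (illustrative, not the library's implementation):

```python
def relu6(x: float) -> float:
    # Standard ReLU with the output capped at 6, which helps quantization.
    return min(max(0.0, x), 6.0)

print(relu6(-1.0), relu6(3.0), relu6(10.0))  # 0.0, 3.0, 6.0
```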

docs/neural-network/activation-functions/selu.md

Lines changed: 0 additions & 3 deletions
@@ -18,9 +18,6 @@ Where the constants are typically:
## Parameters
This activation function does not have any parameters.

-## Size and Performance
-SELU is computationally more expensive than simpler activation functions like ReLU due to its use of exponential calculations for negative inputs. However, it offers significant benefits by enabling self-normalization, which can eliminate the need for additional normalization layers like Batch Normalization. This trade-off often results in better overall network performance and potentially simpler network architectures. The self-normalizing property of SELU can lead to faster convergence during training and more stable gradients, which may reduce the total computational cost of training deep networks despite the higher per-activation computational cost.
-
## Plots
<img src="../../images/activation-functions/selu.png" alt="SELU Function" width="500" height="auto">
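A minimal sketch of SELU using the constants typically cited in the literature (alpha ≈ 1.6733, scale ≈ 1.0507); whether these match the library's exact values is not shown in this diff:

```python
import math

ALPHA = 1.6732632423543772  # commonly cited SELU alpha (assumption)
SCALE = 1.0507009873554805  # commonly cited SELU scale (assumption)

def selu(x: float) -> float:
    # Scaled ELU: the exponential branch for negative inputs is what makes it
    # costlier than plain ReLU, but it is also what enables self-normalization.
    return SCALE * (x if x > 0.0 else ALPHA * (math.exp(x) - 1.0))

print(selu(1.0), selu(-1.0))  # ~1.0507, ~-1.1113
```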

docs/neural-network/activation-functions/sigmoid.md

Lines changed: 0 additions & 3 deletions
@@ -10,9 +10,6 @@ $$
## Parameters
This activation function does not have any parameters.

-## Size and Performance
-Sigmoid is computationally more expensive than simpler activation functions like ReLU due to its use of exponential calculations. It requires computing an exponential term and a division operation for each neuron activation. For deep networks, this computational cost can become significant. Additionally, sigmoid activations can cause the vanishing gradient problem during backpropagation when inputs are large in magnitude, potentially slowing down training. Despite these limitations, sigmoid remains valuable in output layers of networks performing binary classification or when probability interpretations are needed.
-
## Plots
<img src="../../images/activation-functions/sigmoid.png" alt="Sigmoid Function" width="500" height="auto">
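The per-activation cost described above (one exponential and one division) is visible directly in a plain-Python sketch:

```python
import math

def sigmoid(x: float) -> float:
    # Logistic function: one exp() and one division per activation; output in (0, 1),
    # which is why it remains useful for binary-classification output layers.
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0.0), sigmoid(4.0))  # 0.5, ~0.982
```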
