
Convert activation functions to numpower #381


Merged
119 commits merged into 3.0 from convert-activation-functions-to-numpower on Aug 17, 2025

Conversation

@SkibidiProduction commented Jul 14, 2025

Activation implementations

  • Swapped out custom Tensor code for NumPower APIs across all functions: ReLU, LeakyReLU, ELU, GELU, HardSigmoid, SiLU, Tanh, Sigmoid, Softmax, Softplus, Softsign, ThresholdedReLU, etc.

  • Updated derivative methods to use NumPower's derivative helpers (a rough sketch of the conversion pattern follows this list).
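For illustration only, here is a minimal sketch of the conversion pattern, not the PR's actual code: ReLU expressed against NumPower's NDArray instead of the old Tensor API. The NDArray::maximum and NDArray::greater calls, their scalar broadcasting, and the class/method names are assumptions based on NumPower's documented API; the real interface in this PR may differ.

```php
<?php

/**
 * Hypothetical ReLU written against NumPower's NDArray.
 * Assumes NDArray::maximum() and NDArray::greater() broadcast a scalar
 * operand over the array - verify against the NumPower docs.
 */
class ReLUSketch
{
    // f(x) = max(0, x), applied element-wise to the whole batch at once.
    public function activate(\NDArray $input) : \NDArray
    {
        return \NDArray::maximum($input, 0);
    }

    // f'(x) = 1 where x > 0, 0 elsewhere (a 0/1 mask from the comparison).
    public function differentiate(\NDArray $input) : \NDArray
    {
        return \NDArray::greater($input, 0);
    }
}
```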

Tests

  • Refactored unit tests to assert against NumPower outputs.

  • Adjusted tolerances and assertions to match NumPower's numeric behavior (see the test sketch after this list).
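A hedged sketch of the tolerance-based assertion style, again not the PR's actual tests. It reuses the hypothetical ReLUSketch from the sketch above; NDArray::array() and toArray() are assumed constructor/accessor names, and the small delta reflects the assumption that NumPower typically operates in single precision, so exact equality against double-precision expectations would be too strict.

```php
<?php

use PHPUnit\Framework\TestCase;

/**
 * Hypothetical test comparing an activation's NDArray output against
 * hand-computed values within a small delta. NDArray::array() and
 * toArray() are assumed - check the NumPower docs for the exact API.
 */
final class ReLUSketchTest extends TestCase
{
    public function testActivate() : void
    {
        $activation = new ReLUSketch();

        $input = \NDArray::array([-2.0, -0.5, 0.0, 1.5]);

        $expected = [0.0, 0.0, 0.0, 1.5];

        // Pull the values back into a plain PHP array for comparison.
        $output = $activation->activate($input)->toArray();

        foreach ($expected as $i => $value) {
            // Single-precision arithmetic warrants a tolerance rather
            // than exact equality.
            $this->assertEqualsWithDelta($value, $output[$i], 1e-6);
        }
    }
}
```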

Documentation

  • Added/updated images under docs/images/activation-functions/ to illustrate each activation curve and its derivative using the new implementations.

  • Cleaned up corresponding markdown to reference the updated diagrams.

Code cleanup

  • Aligned naming conventions and method signatures with NumPower's API.

  • Minor style fixes (whitespace, imports, visibility).

@andrewdalpino (Member) left a comment

Very nice work @apphp and @SkibidiProduction ... I think this is exactly what we need for the first round of integration with NumPower. I had a few questions and comments that may change the outcome of the PR so I'm just going to leave it at that for now until we get that sorted.

Overall, fantastic usage of unit tests and good code quality. I love to see it.

Andrew

$$

## Parameters
| # | Name | Default | Type | Description |
|---|---|---|---|---|
| 1 | alpha | 1.0 | float | The value at which leakage will begin to saturate. Ex. alpha = 1.0 means that the output will never be less than -1.0 when inactivated. |

## Size and Performance
ELU is a simple function and is well-suited for deployment on resource-constrained devices or when working with large neural networks.
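For reference, and not quoted from the diff above, this is the standard ELU definition that alpha scales. As x goes to negative infinity, e^x goes to 0 and the output saturates at -alpha, which is where the "never less than -1.0 when alpha = 1.0" behavior described in the parameters table comes from.

$$
\mathrm{ELU}(x) =
\begin{cases}
x & \text{if } x > 0 \\
\alpha \left(e^{x} - 1\right) & \text{if } x \leq 0
\end{cases}
$$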
Member

How did you come up with these size and performance details? I'm noticing that some differ from my understanding. For example, it is not necessarily true, when taken in the context of all activation functions, that ELU is a simple function or well-suited for resource-constrained devices.

Perhaps it would actually be more confusing to offer this somewhat subjective explanation. In addition, in practice, activation functions have very little impact on the total runtime of the network, so taking the effort here to detail their performance is somewhat distracting.

How do you feel about dropping this "size and performance" section altogether, not being opinionated about individual activation functions, and instead letting the user discover the nuances of each activation function for themselves? However, if there is something truly outstanding about a particular activation function's performance characteristics, then let's make sure to include that in the description of the class. For example, ReLU is outstanding because it is the simplest activation function in the group. Maybe there's another activation function that has an associated kernel that is particularly optimized, etc.

@apphp commented Aug 6, 2025

Fixed in #382

Member

Yes, remove the section, but if there is something unique about a particular function's performance characteristics, we can put that info in the description. What do you think?


Will be removed


Fixed in #382

*/
public function activate(NDArray $input) : NDArray
{
// Calculate |x|
Member

I don't feel that these comments provide enough value to justify their existence. I can understand what is going on clearly given your great usage of variables and naming.


Fixed in #382


@andrewdalpino (Member) left a comment

Looks good fellas, let's roll!

@SkibidiProduction merged commit b981c04 into 3.0 on Aug 17, 2025
0 of 6 checks passed
@SkibidiProduction deleted the convert-activation-functions-to-numpower branch on August 17, 2025 at 23:53
Labels: none yet
Projects: none yet
3 participants