
Feature Request: Add GELU activation function #11834


Closed
SriRangaTarun opened this issue Dec 10, 2018 · 8 comments
Labels
type:feature The user is asking for a new feature.

Comments

@SriRangaTarun
Contributor

I just realized that Keras does not have a GELU activation function in activations.py. I request that it be added, because it has many applications in neural networks.

Note: I'll probably submit a pull request for it.

  • [x] Check that you are up-to-date with the master branch of Keras. You can update with:
    pip install git+git://github.com/keras-team/keras.git --upgrade --no-deps

  • [x] Check that your version of TensorFlow is up-to-date. The installation instructions can be found here.

@Dref360
Contributor

Dref360 commented Dec 10, 2018

I don't think this should be merged into Keras.

  • Not widely used
  • Not published yet

Please submit your PR at keras-contrib.

@joefaron

joefaron commented May 2, 2019

This guy uses it, and he clearly knows what's going on:
https://github.com/borisbanushev/stockpredictionai
Keras code:

import numpy as np
import tensorflow as tf

from keras.layers import Activation, Dense
from keras.utils.generic_utils import get_custom_objects

# tanh approximation of GELU
def custom_gelu(x):
    return 0.5 * x * (1 + tf.tanh(tf.sqrt(2 / np.pi) * (x + 0.044715 * tf.pow(x, 3))))

get_custom_objects().update({'custom_gelu': Activation(custom_gelu)})
fit1.add(Dense(1, activation=custom_gelu))  # fit1 is the model being built
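
Once it's registered with get_custom_objects(), the activation can also be referenced by its string name, which keeps saved models deserializable. A minimal sketch (the Sequential model here is mine, not from the quoted repo):

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
# The string name resolves through the custom-objects registry above.
model.add(Dense(1, activation='custom_gelu'))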

Something seems wrong with that custom activation though: I'm getting really strange predictions that never go below about -0.25.

@pskrunner14

pskrunner14 commented May 27, 2019

GELU has started to pick up, and it was published a while ago (2016):
https://arxiv.org/abs/1606.08415

It has also been used in OpenAI's GPT-1 and GPT-2 and in Google's BERT. Would love to see this implemented in Keras activations.

@casper-hansen

casper-hansen commented Aug 26, 2019

Code from Google's BERT:

import numpy as np
import tensorflow as tf

def gelu(x):
    """Gaussian Error Linear Unit.
    This is a smoother version of the ReLU.
    Original paper: https://arxiv.org/abs/1606.08415
    Args:
        x: float Tensor to perform activation.
    Returns:
        `x` with the GELU activation applied.
    """
    cdf = 0.5 * (1.0 + tf.tanh(
        np.sqrt(2 / np.pi) * (x + 0.044715 * tf.pow(x, 3))))
    return x * cdf

Code from OpenAI's GPT-2:

def gelu(x):
    return 0.5 * x * (1 + tf.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * tf.pow(x, 3))))
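
For reference, both snippets implement the tanh approximation from the paper; the exact definition is x * Φ(x), i.e. the input scaled by the standard normal CDF, which can be written with the error function. A minimal sketch (the name exact_gelu is mine; assumes tf.math.erf is available):

import numpy as np
import tensorflow as tf

def exact_gelu(x):
    # Exact GELU: x * Phi(x), where Phi(x) = 0.5 * (1 + erf(x / sqrt(2))).
    return 0.5 * x * (1.0 + tf.math.erf(x / np.sqrt(2.0)))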

@casper-hansen

Something seems wrong with that custom activation though: I'm getting really strange predictions that never go below about -0.25.

It's not wrong that you're not getting below -0.25; look at the graph of the function:
[plot of the GELU activation function]
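
To put a number on it: the exact GELU x * Φ(x) has a global minimum of roughly -0.17, near x ≈ -0.75, so outputs that bottom out around there are expected. A quick numeric check (a sketch, using SciPy for the normal CDF):

import numpy as np
from scipy.stats import norm

x = np.linspace(-5.0, 5.0, 100001)
y = x * norm.cdf(x)   # exact GELU
i = y.argmin()
print(x[i], y[i])     # roughly -0.75 and -0.17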

@bhack
Contributor

bhack commented Nov 8, 2019

I know this is starting to get very confusing, but I need to make a cross-org reference: tensorflow/tensorflow#33945

@bhack
Contributor

bhack commented Jul 27, 2020

GELU is now in TensorFlow (tensorflow/tensorflow#41178). You can close this.
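
For anyone landing here later: recent TensorFlow releases (2.4+, as far as I know) expose GELU as tf.nn.gelu and tf.keras.activations.gelu, with an approximate flag selecting the tanh variant. A minimal usage sketch:

import tensorflow as tf

x = tf.constant([-1.0, 0.0, 1.0])
print(tf.nn.gelu(x))                    # exact (erf-based) form by default
print(tf.nn.gelu(x, approximate=True))  # tanh approximation

# In a Keras layer, the string name resolves to the built-in activation:
layer = tf.keras.layers.Dense(8, activation='gelu')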

@SriRangaTarun
Contributor Author

Thank you, @bhack! I will close this issue :)
