Activation Function Grapher

Visualize and compare common neural network activation functions

Controls
Select which activation functions to plot and adjust their parameters; the x-axis range defaults to [-5, 5].
Activation Functions Visualization
Compare how different activation functions transform inputs
Function Details
Mathematical formulas, derivatives, and output ranges
Sigmoid
Formula: σ(x) = 1 / (1 + e^(-x))
Derivative: σ'(x) = σ(x) * (1 - σ(x))
Output Range: (0, 1)
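
A minimal sketch of sigmoid and its derivative, assuming Python with NumPy; the function names are illustrative, not the grapher's actual code:

    import numpy as np

    def sigmoid(x):
        # Squashes any real input into the open interval (0, 1).
        return 1.0 / (1.0 + np.exp(-x))

    def sigmoid_derivative(x):
        # σ'(x) = σ(x) * (1 - σ(x)); reuses the forward output.
        s = sigmoid(x)
        return s * (1.0 - s)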
ReLU
Formula: f(x) = max(0, x)
Derivative: f'(x) = 1 if x > 0, else 0
Output Range: [0, ∞)
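
The same kind of sketch for ReLU and its piecewise-constant derivative, again assuming NumPy and illustrative names:

    import numpy as np

    def relu(x):
        # Passes positive inputs through unchanged, clips negatives to zero.
        return np.maximum(0.0, x)

    def relu_derivative(x):
        # 1 where x > 0, 0 elsewhere (the value at exactly x = 0 is a convention).
        return (np.asarray(x) > 0).astype(float)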
Tanh
Formula: tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
Derivative: tanh'(x) = 1 - tanh^2(x)
Output Range: (-1, 1)
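
To round out the trio, a tanh derivative sketch followed by a small plotting loop over the default [-5, 5] range; matplotlib and the inline lambdas are assumptions for illustration, not the app's implementation:

    import numpy as np
    import matplotlib.pyplot as plt

    def tanh_derivative(x):
        # tanh'(x) = 1 - tanh^2(x); NumPy already provides np.tanh itself.
        t = np.tanh(x)
        return 1.0 - t * t

    # Sample the default x-axis range and overlay the three curves,
    # mirroring what the grapher displays.
    x = np.linspace(-5, 5, 500)
    curves = {
        "sigmoid": lambda v: 1.0 / (1.0 + np.exp(-v)),
        "relu": lambda v: np.maximum(0.0, v),
        "tanh": np.tanh,
    }
    for name, fn in curves.items():
        plt.plot(x, fn(x), label=name)
    plt.legend()
    plt.title("Activation functions on [-5, 5]")
    plt.show()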
Why Activation Functions Matter

Activation functions introduce non-linearity into neural networks, allowing them to learn complex patterns. Without activation functions, a network of any depth collapses to a single linear transformation, no more expressive than a linear regression model.
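
A quick numerical check of that claim, assuming NumPy; the layer sizes and random values are arbitrary:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 3))                      # small batch of inputs
    W1, b1 = rng.normal(size=(3, 5)), rng.normal(size=5)
    W2, b2 = rng.normal(size=(5, 2)), rng.normal(size=2)

    # Two stacked linear layers with no activation in between...
    two_layers = (x @ W1 + b1) @ W2 + b2

    # ...equal one linear layer with merged weights and bias.
    W, b = W1 @ W2, b1 @ W2 + b2
    one_layer = x @ W + b

    assert np.allclose(two_layers, one_layer)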

Different activation functions have different properties that make them suitable for specific tasks:

  • Sigmoid and Tanh were popular in early neural networks but suffer from the vanishing gradient problem in deep networks.
  • ReLU revolutionized deep learning by addressing the vanishing gradient problem and enabling efficient training of deep networks.
  • Leaky ReLU, ELU, and GELU are modern variants that address specific limitations of ReLU, most notably "dying" units whose gradient is zero for all negative inputs (see the sketch after this list).
  • Softmax (not shown in the grapher) is used in output layers for multi-class classification problems.
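
As referenced above, a hedged sketch of those ReLU variants, assuming NumPy; the GELU line uses the common tanh approximation rather than the exact Gaussian-CDF form:

    import numpy as np

    def leaky_relu(x, alpha=0.01):
        # Small slope alpha on negative inputs keeps their gradient non-zero.
        return np.where(x > 0, x, alpha * x)

    def elu(x, alpha=1.0):
        # Smooth exponential curve below zero, saturating at -alpha.
        return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

    def gelu(x):
        # tanh approximation of x * Φ(x), where Φ is the standard normal CDF.
        return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))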

Choosing the right activation function can significantly impact model performance, training speed, and convergence.