Activation Function Grapher

Visualize and compare common neural network activation functions

Controls
Select which activation functions to plot and adjust their parameters; the x-axis range defaults to [-5, 5].
Activation Functions Visualization
Compare how different activation functions transform inputs
Function Details
Mathematical formulas, derivatives, and output ranges
Sigmoid
Formula: σ(x) = 1 / (1 + e^(-x))
Derivative: σ'(x) = σ(x) * (1 - σ(x))
Output Range: (0, 1)
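
A minimal sketch of sigmoid and its derivative, assuming Python with NumPy; the function names are illustrative, not the grapher's actual code:

    import numpy as np

    def sigmoid(x):
        # Squashes any real input into the open interval (0, 1).
        return 1.0 / (1.0 + np.exp(-x))

    def sigmoid_derivative(x):
        # σ'(x) = σ(x) * (1 - σ(x)); reuses the forward output.
        s = sigmoid(x)
        return s * (1.0 - s)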
ReLU
Formula: f(x) = max(0, x)
Derivative: f'(x) = 1 if x > 0, else 0
Output Range: [0, ∞)
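
The same kind of sketch for ReLU and its piecewise-constant derivative, again assuming NumPy and illustrative names:

    import numpy as np

    def relu(x):
        # Passes positive inputs through unchanged, clips negatives to zero.
        return np.maximum(0.0, x)

    def relu_derivative(x):
        # 1 where x > 0, 0 elsewhere (the value at exactly x = 0 is a convention).
        return (np.asarray(x) > 0).astype(float)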
Tanh
Formula: tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
Derivative: tanh'(x) = 1 - tanh^2(x)
Output Range: (-1, 1)
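
To round out the trio, a tanh derivative sketch followed by a small plotting loop over the default [-5, 5] range; matplotlib and the inline lambdas are assumptions for illustration, not the app's implementation:

    import numpy as np
    import matplotlib.pyplot as plt

    def tanh_derivative(x):
        # tanh'(x) = 1 - tanh^2(x); NumPy already provides np.tanh itself.
        t = np.tanh(x)
        return 1.0 - t * t

    # Sample the default x-axis range and overlay the three curves,
    # mirroring what the grapher displays.
    x = np.linspace(-5, 5, 500)
    curves = {
        "sigmoid": lambda v: 1.0 / (1.0 + np.exp(-v)),
        "relu": lambda v: np.maximum(0.0, v),
        "tanh": np.tanh,
    }
    for name, fn in curves.items():
        plt.plot(x, fn(x), label=name)
    plt.legend()
    plt.title("Activation functions on [-5, 5]")
    plt.show()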
Why Activation Functions Matter

Activation functions introduce non-linearity into neural networks, allowing them to learn complex patterns. Without activation functions, a network of any depth collapses to a single linear transformation, no more expressive than a linear regression model.
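
A quick numerical check of that claim, assuming NumPy; the layer sizes and random values are arbitrary:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 3))                      # small batch of inputs
    W1, b1 = rng.normal(size=(3, 5)), rng.normal(size=5)
    W2, b2 = rng.normal(size=(5, 2)), rng.normal(size=2)

    # Two stacked linear layers with no activation in between...
    two_layers = (x @ W1 + b1) @ W2 + b2

    # ...equal one linear layer with merged weights and bias.
    W, b = W1 @ W2, b1 @ W2 + b2
    one_layer = x @ W + b

    assert np.allclose(two_layers, one_layer)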

Different activation functions have different properties that make them suitable for specific tasks:

  • Sigmoid and Tanh were popular in early neural networks but suffer from the vanishing gradient problem in deep networks.
  • ReLU revolutionized deep learning by addressing the vanishing gradient problem and enabling efficient training of deep networks.
  • Leaky ReLU, ELU, and GELU are modern variants that address specific limitations of ReLU, most notably "dying" units whose gradient is zero for all negative inputs (see the sketch after this list).
  • Softmax (not shown in the grapher) is used in output layers for multi-class classification problems.
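
As referenced above, a hedged sketch of those ReLU variants, assuming NumPy; the GELU line uses the common tanh approximation rather than the exact Gaussian-CDF form:

    import numpy as np

    def leaky_relu(x, alpha=0.01):
        # Small slope alpha on negative inputs keeps their gradient non-zero.
        return np.where(x > 0, x, alpha * x)

    def elu(x, alpha=1.0):
        # Smooth exponential curve below zero, saturating at -alpha.
        return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

    def gelu(x):
        # tanh approximation of x * Φ(x), where Φ is the standard normal CDF.
        return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))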

Choosing the right activation function can significantly impact model performance, training speed, and convergence.