LearnCurve
An interactive tool for understanding how neural networks learn. © 2025 Chris Rowen — MIT License
We use machine learning to build equations that serve as predictive models—estimating the next word in a composition or identifying objects in a picture. You have data representing typical inputs and desired outputs, but you don't know the underlying pattern. A neural network finds a function that fits that data—like recreating a curve from scattered dots, but for arbitrarily complex functions.
① Create Data — Generate examples from a hidden "recipe."
② Train — Measure error, adjust parameters, repeat.
③ Evaluate — Test on held-out data to verify learning.
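The three steps above can be sketched end to end in a few lines of Python. This is a hypothetical miniature, not LearnCurve's own code: the recipe y = 3x + 1, the noise level, the 80/20 split, and η = 0.1 are illustrative choices, and the model is a single line rather than a network.

```python
import random

random.seed(0)

# (1) Create Data: sample a hidden "recipe" (here y = 3x + 1) plus noise.
recipe = lambda x: 3 * x + 1
data = [(x, recipe(x) + random.gauss(0, 0.1))
        for x in (random.uniform(-1, 1) for _ in range(100))]
train, held_out = data[:80], data[80:]

# (2) Train: one parameter update per example (a "step");
#     one full pass over the training data is an "epoch".
w, b, eta = 0.0, 0.0, 0.1
for epoch in range(50):
    for x, t in train:
        y = w * x + b        # forward pass (prediction)
        dE_dy = y - t        # dE/dy for E = 1/2 (y - t)^2
        w -= eta * dE_dy * x # downhill step on w
        b -= eta * dE_dy     # downhill step on b

# (3) Evaluate: error on held-out data checks real learning, not memorization.
test_error = sum(0.5 * (w * x + b - t) ** 2
                 for x, t in held_out) / len(held_out)
print(round(w, 2), round(b, 2))  # close to the recipe's 3 and 1
```

Because the recipe is linear, the fitted w and b land near 3 and 1, and the held-out error settles near the noise floor.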
Data Recipe — Function that generates data
Training Data — Examples the model learns from
Held-Out Data — Reserved to test real learning
Noise — Random variation (real data is messy)
Step — Update from one example
Epoch — Full pass through training data
Error (E) — Prediction error (lower = better)
Learning Rate (η) — Step size for updates
Activation (σ) — Nonlinear function (ReLU/Sigmoid)
Weights (w) — Multipliers on connections
Biases (b) — Offset added at each neuron
Gradient (∂E/∂w) — Steepest uphill direction
Overfitting — Memorizing instead of learning
Extrapolation — Predicting outside training range
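Overfitting ("memorizing instead of learning") can be shown in miniature. This hypothetical sketch, not the tool's internals, replaces a model with a lookup table: it is perfect on the training examples yet has nothing to say about any unseen input.

```python
# Pure memorization: store the training examples verbatim.
train = [(0.0, 0.0), (0.5, 0.25), (1.0, 1.0)]   # samples of y = x^2
memorizer = dict(train)

# Zero error on seen data...
train_error = sum(0.5 * (memorizer[x] - t) ** 2 for x, t in train)
print(train_error)                               # 0.0: "perfect" fit

# ...but no prediction at all for an unseen input.
try:
    memorizer[0.75]
    generalizes = True
except KeyError:
    generalizes = False                          # nothing was actually learned
```

A network that fits the noise too closely behaves the same way in spirit: training error looks great while held-out error gives the game away.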
Forward: z = w·x + b → h = σ(z) → y
Error: E = ½(y − t)² where t = target
Backward: Chain rule finds ∂E/∂w
∂[f(g(x))]/∂x = f'(g(x)) · g'(x)
Update: w ← w − η · ∂E/∂w (downhill step)
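The forward/backward/update cycle above can be checked by hand for a single neuron. This sketch assumes ReLU as σ and uses illustrative values (w = 0.5, b = 0, x = 2, t = 3, η = 0.1); the output y is taken directly from h.

```python
w, b, eta = 0.5, 0.0, 0.1
x, t = 2.0, 3.0

# Forward: z = w*x + b -> h = sigma(z) -> y
z = w * x + b                    # 1.0
h = max(0.0, z)                  # ReLU: 1.0
y = h

# Error: E = 1/2 (y - t)^2
E = 0.5 * (y - t) ** 2           # 0.5 * (1 - 3)^2 = 2.0

# Backward: chain rule gives dE/dw = (y - t) * sigma'(z) * x
dE_dy = y - t                    # -2.0
dsigma = 1.0 if z > 0 else 0.0   # ReLU derivative at z = 1.0 is 1
dE_dw = dE_dy * dsigma * x       # -4.0
dE_db = dE_dy * dsigma           # -2.0

# Update: step downhill
w -= eta * dE_dw                 # 0.5 - 0.1 * (-4.0) = 0.9
b -= eta * dE_db                 # 0.0 - 0.1 * (-2.0) = 0.2
print(w, b)
```

Since y was too small (1 vs. target 3), both gradients are negative and the update pushes w and b up, exactly the "downhill step" the rule describes.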
Simple (SGD) — w ← w − η · ∂E/∂w
Adam — Adapts η per-weight. Usually faster.
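The difference between the two optimizers can be sketched on a toy 1-D problem, minimizing E(w) = w². This is an illustrative sketch, not the tool's implementation; the Adam constants (β₁ = 0.9, β₂ = 0.999, ε = 1e-8) are the common published defaults, not LearnCurve settings.

```python
import math

def grad(w):
    return 2 * w                     # dE/dw for E(w) = w^2

# SGD: w <- w - eta * dE/dw, a fixed step size for every weight.
w_sgd, eta = 5.0, 0.1
for _ in range(100):
    w_sgd -= eta * grad(w_sgd)

# Adam: running averages of the gradient (m) and its square (v)
# give each weight its own effective step size.
w_adam, m, v = 5.0, 0.0, 0.0
beta1, beta2, eps = 0.9, 0.999, 1e-8
for step in range(1, 101):
    g = grad(w_adam)
    m = beta1 * m + (1 - beta1) * g          # first-moment average
    v = beta2 * v + (1 - beta2) * g * g      # second-moment average
    m_hat = m / (1 - beta1 ** step)          # bias correction
    v_hat = v / (1 - beta2 ** step)
    w_adam -= eta * m_hat / (math.sqrt(v_hat) + eps)

print(abs(w_sgd), abs(w_adam))   # both end up near the minimum at w = 0
```

On this well-scaled problem both converge; Adam's advantage shows up when different weights need very different step sizes.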
📈 Simple: x, x^2, x^3-3*x
🌊 Waves: sin(x), sin(x)+sin(3*x)
📐 Sharp: abs(x), sign(x), floor(x)
🔬 Restrict training range → extrapolation fails
⚡ Compare SGD vs Adam
📊 Increase noise → learning breaks down
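The first experiment (restricting the training range) can be previewed in miniature. This hypothetical sketch, separate from the tool, trains a straight line on abs(x) using only positive x, where abs(x) looks exactly linear, then asks for a prediction at negative x.

```python
import random

random.seed(1)

# Train only on x in [0.5, 2.0], where abs(x) == x.
train = [(x, abs(x)) for x in (random.uniform(0.5, 2.0) for _ in range(200))]

# Fit a line y = w*x + b by plain SGD.
w, b, eta = 0.0, 0.0, 0.05
for _ in range(100):
    for x, t in train:
        y = w * x + b
        w -= eta * (y - t) * x
        b -= eta * (y - t)

print(w * 1.0 + b)    # ~1.0: accurate inside the training range
print(w * -2.0 + b)   # ~-2.0, but abs(-2) = 2: extrapolation fails
```

The model found the best function for the data it saw; it simply never saw the fold at x = 0, so outside the training range its confident answer is wrong.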