AI · 1h ago
How ResNet's Skip Connections Made 100-Layer Networks Trainable
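Before 2015, stacking more layers often made plain networks worse: in the ResNet paper's experiments, a 56-layer net had higher training error than a 20-layer one. This degradation is commonly blamed on vanishing gradients, though the paper's authors argued it was an optimization difficulty rather than vanishing gradients or overfitting. ResNet introduced skip connections that add a block's input to its output, y = x + F(x), so gradients can flow directly through the identity path. This simple change enabled training of 152-layer networks, won the ILSVRC 2015 ImageNet classification task, and made residual connections a standard component of modern architectures such as Transformers.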
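To make the identity-path argument concrete, here is a minimal toy sketch, not the paper's architecture. It assumes each "layer" is a scalar tanh(w * x) and compares the chain-rule gradient through 100 stacked layers with and without the "+x" skip. The function name `stack_gradient` and the values of `w` and `x` are illustrative choices, not taken from the article.

```python
import math

def stack_gradient(x, w, depth, residual):
    """Return d(output)/d(input) for a stack of toy scalar layers.

    Each layer computes F(x) = tanh(w * x). With residual=True the layer
    output is x + F(x); otherwise it is just F(x). This is an illustrative
    assumption, not the convolutional block used in ResNet.
    """
    grad = 1.0
    for _ in range(depth):
        t = math.tanh(w * x)
        local = w * (1.0 - t * t)      # dF/dx for this layer
        if residual:
            grad *= 1.0 + local        # identity path contributes the "1"
            x = x + t
        else:
            grad *= local              # no identity path: factors below 1 compound
            x = t
    return grad

plain_grad = stack_gradient(x=1.0, w=0.5, depth=100, residual=False)
skip_grad = stack_gradient(x=1.0, w=0.5, depth=100, residual=True)
print(f"plain stack gradient:    {plain_grad:.3e}")
print(f"residual stack gradient: {skip_grad:.3e}")
```

In this toy setting every plain-layer factor is at most 0.5, so the product shrinks toward zero, while every residual factor is at least 1, so the gradient cannot vanish. Real networks are more complicated (normalization, nonlinearity placement, and initialization all matter), so treat this as intuition only.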
Meridian48 take
The elegance of ResNet's fix—a single addition operation—belies its profound impact; it's a reminder that sometimes the most important innovations are the simplest.
Read the full reporting
One "+x" That Made 100-Layer Networks Trainable: ResNet Skip Connections →
DEV Community
deep-learning resnet