Overparameterization and Stability: A New Law of Robustness for Discontinuous Classifiers
A new theoretical study establishes a crucial link between overparameterization, model stability, and generalization for discontinuous classifiers, extending foundational robustness laws beyond smooth function assumptions. The research, detailed in the preprint arXiv:2603.02806v1, introduces quantifiable measures of class stability and co-stability, proving that substantial overparameterization is a necessary condition for achieving stable, high-performing models. This work bridges a significant gap in understanding how modern, highly parameterized models like deep neural networks generalize, even when their decision boundaries are not smooth.
Defining Stability Through Margin and Generalization Bounds
The core of the analysis hinges on a novel definition of class stability, quantified as the expected distance from a data point to the model's decision boundary in the input domain—effectively a measure of margin. The authors derive a generalization bound for finite function classes that tightens as this stability measure grows. This formally interprets stability as a quantifiable notion of robustness, directly tying a model's geometric property to its expected performance on unseen data.
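To make the definition concrete, here is a minimal sketch of estimating class stability for a linear classifier, where the input-space distance to the decision boundary has a closed form. The function name and setup are illustrative, not the paper's; for nonlinear models the distance would have to be estimated numerically.

```python
import numpy as np

# Illustrative sketch (not the paper's code): class stability as the mean
# input-space distance from data points to the decision boundary. For a
# linear classifier f(x) = sign(w.x + b) that distance is exact:
#     dist(x) = |w.x + b| / ||w||
def class_stability_linear(w, b, X):
    """Mean distance from each row of X to the hyperplane w.x + b = 0."""
    margins = np.abs(X @ w + b) / np.linalg.norm(w)
    return margins.mean()

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))   # synthetic data points
w = rng.normal(size=10)          # hypothetical learned weights
b = 0.3
stability = class_stability_linear(w, b, X)
print(f"estimated class stability: {stability:.3f}")
```

A larger value means points sit, on average, farther from the boundary, which is exactly the geometric notion the generalization bound rewards.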
As a pivotal corollary, the paper establishes a law of robustness for classification that generalizes the seminal results of Bubeck and Sellke. Their earlier work applied primarily to smooth functions, but this new law extends to the broader, more realistic class of discontinuous classifiers. The theorem presents a fundamental trade-off: any model that perfectly fits (interpolates) n data points with approximately n parameters must be unstable. Consequently, achieving high stability necessitates moving into an overparameterized regime with significantly more parameters than data points.
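The intuition behind the trade-off can be seen in a hypothetical one-dimensional example (not drawn from the paper): if labels flip between neighbouring points, any classifier that fits all n points must place a decision boundary inside each flip interval, so the margin at those points is at most half the local gap—and such gaps shrink as n grows.

```python
import numpy as np

# Hypothetical 1-D illustration: with noisy labels, ANY interpolating
# classifier is forced to put a boundary between each pair of adjacent
# points with opposite labels, capping its stability at half that gap.
rng = np.random.default_rng(1)
n = 200
x = np.sort(rng.uniform(0, 1, size=n))
y = rng.choice([-1, 1], size=n)              # noisy labels: many sign flips

flips = np.nonzero(y[:-1] != y[1:])[0]       # intervals forced to hold a boundary
gap_bound = (x[flips + 1] - x[flips]) / 2    # margin upper bound at flip points
print(f"{len(flips)} forced boundaries; "
      f"mean margin bound at flip points: {gap_bound.mean():.4f}")
```

With 200 uniform points the typical spacing is about 1/200, so interpolation caps stability at a few thousandths of the domain width—illustrating why fitting n noisy points with only ~n parameters leaves no room for a large margin.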
From Finite to Infinite Classes: The Role of Normalized Co-Stability
To analyze parameterized infinite function classes, such as those represented by neural networks, the researchers introduce a stronger, complementary measure: normalized co-stability. This metric is derived from the margin observed in the model's output space (the codomain), rather than the input space. Analyzing this measure allows the team to obtain analogous theoretical results for these complex, practical model families, reinforcing the necessity of overparameterization for robustness.
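One plausible way to read "margin in the codomain" is the gap between the top output and the runner-up, scaled to be comparable across models. The sketch below is an assumption-laden stand-in—the paper's exact normalization is not reproduced here.

```python
import numpy as np

# Illustrative sketch of an output-space (codomain) margin: the gap between
# the top logit and the runner-up, normalised by the logits' overall scale.
# The paper's precise "normalized co-stability" definition may differ.
def normalized_co_stability(logits):
    top2 = np.sort(logits, axis=1)[:, -2:]        # runner-up and top logit
    gaps = top2[:, 1] - top2[:, 0]                # output-space margin per point
    scale = np.linalg.norm(logits, axis=1)        # normalisation factor
    return (gaps / scale).mean()

logits = np.random.default_rng(2).normal(size=(100, 5))  # synthetic outputs
co_stab = normalized_co_stability(logits)
print(f"normalized co-stability (sketch): {co_stab:.3f}")
```

Measuring the margin in the output space sidesteps the need to probe the input-space boundary geometrically, which is what makes the notion tractable for parameterized infinite classes like neural networks.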
Experimental validation strongly supports the theoretical framework. The study finds that as model size increases, so does its measured stability, and this increase correlates directly with improved test performance. Notably, the research indicates that traditional norm-based complexity measures, like weight magnitudes, remain largely uninformative for predicting generalization in this context, highlighting the unique explanatory power of the new stability metrics.
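A toy version of this kind of experiment can be sketched as follows, assuming scikit-learn is available. The dataset, widths, and margin proxy are all placeholders—the paper's actual experimental protocol and stability estimator are not reproduced here, and the trend need not hold on every toy run.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

# Hypothetical replication sketch (not the paper's setup): train MLPs of
# increasing width on a toy problem and record a crude output-margin proxy.
X, y = make_moons(n_samples=400, noise=0.2, random_state=0)
margins = {}
for width in (4, 256):
    clf = MLPClassifier(hidden_layer_sizes=(width,), max_iter=2000,
                        random_state=0).fit(X, y)
    proba = clf.predict_proba(X)
    # Gap between the two class probabilities, averaged over the data.
    margins[width] = float(np.abs(proba[:, 1] - proba[:, 0]).mean())
    print(f"width={width:4d}  mean output-margin proxy: {margins[width]:.3f}")
```

The paper's finding is that, at scale, such stability measures grow with model size and track test performance, whereas weight norms do not.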
Why This Matters for AI and Machine Learning
- Explains Modern Deep Learning Success: Provides a theoretical foundation for why massively overparameterized models, which often interpolate noisy training data, can still generalize well, by linking success to inherent stability and robustness.
- New Tools for Model Analysis: Introduces practical, quantifiable metrics—class stability and normalized co-stability—that offer more direct insight into model robustness than traditional norm-based measures.
- Guides Future Architecture Design: The formal proof that overparameterization is necessary for high stability offers a principled guideline for model scaling and design in pursuit of more reliable and generalizable AI systems.