How Does Mixup Help with Robustness, Generalization, and Calibration?
Speaker: Linjun Zhang, Rutgers University
Time: Tuesday, April 20, 2021, 10:00 AM - 11:00 AM EST
Zoom Link: contact tml.online.seminars@gmail.com
Abstract:
Mixup is a popular data augmentation technique based on taking
convex combinations of pairs of examples and their labels. This simple
technique has been shown to substantially improve the robustness,
generalization, and calibration of the trained model. However, it is not
well understood why such improvements occur. In this talk, we provide
a theoretical analysis demonstrating how training with Mixup helps
model robustness, generalization, and calibration. For robustness, we
show that minimizing the Mixup loss corresponds to approximately
minimizing an upper bound of the adversarial loss. This explains why
models obtained by Mixup training exhibit robustness to several kinds
of adversarial attacks, such as the Fast Gradient Sign Method (FGSM). For
generalization, we prove that Mixup augmentation corresponds to a
specific type of data-adaptive regularization that reduces overfitting.
For calibration, we theoretically prove that Mixup improves calibration
in high-dimensional settings by investigating two natural data models for
classification and regression. Our analysis provides new insights and a
framework to understand Mixup.
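The abstract describes Mixup as taking convex combinations of pairs of examples and their labels. A minimal NumPy sketch of that augmentation step is below; this is an illustration of the standard technique, not the speaker's code, and the function name, `alpha` default, and one-hot label format are assumptions for the example.

```python
import numpy as np

def mixup_batch(x, y, alpha=1.0, rng=None):
    """Mix a batch via convex combinations of example/label pairs.

    x: (n, d) array of features; y: (n, k) array of one-hot labels.
    alpha parametrizes the Beta distribution for the mixing weight
    (a common default; analyses may use other values).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = x.shape[0]
    lam = rng.beta(alpha, alpha)   # mixing coefficient lambda ~ Beta(alpha, alpha)
    perm = rng.permutation(n)      # random pairing of examples within the batch
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y + (1.0 - lam) * y[perm]
    return x_mix, y_mix
```

Training then minimizes the usual loss on `(x_mix, y_mix)` instead of the raw batch; since the mixed labels are convex combinations, they remain valid probability vectors.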
Speaker's Bio
Linjun Zhang is an Assistant Professor in the Department of Statistics
at Rutgers University. He received his Ph.D. in Statistics from the
Wharton School of the University of Pennsylvania in 2019. His current
research interests include machine learning theory, high-dimensional
statistics, adversarial robustness, and privacy-preserving data
analysis.