Sampling approaches to learning from imbalanced datasets: active
learning, cost-sensitive learning and beyond.
Naoki Abe
TJ Watson Research Center, IBM
Many approaches to dealing with imbalanced datasets are based on some
form of sampling, including the well-known "under-sampling" and
"over-sampling" methods. In this talk, we will review and
compare some of these methods, drawing where appropriate upon recent
progress made on the subject with colleagues (B. Zadrozny and J.
Langford). Methods covered include some applications of active learning
and cost-sensitive learning.