2021-05-03, 14:00–14:25, PyData Track 1
Label distribution shift is a significant ‘unknown' our models might encounter when facing the real world once they are deployed. In this talk I will provide practical approaches to assist our models to be more robust to such 'unknowns'.
If someone would have told me a year ago that we'd be wearing masks when walking outside and that my daughter's longest time off kindergarten won't be two weeks at August - I'd never believe it! But that's life - things change rapidly, and previously made assumptions might not remain valid.
Many of us, Data Scientists, find ourselves working hard to train a model, deploy it to live environment and then realize the real world does not behave as we expected it. Our model crashes upon a reality that is much different than what it is familiar with. The root cause for this gap is the unexpected changes that impact our domain's population.
In this talk I will focus on a specific type of 'unknown' change - a shift in the label distribution. I will not only present how your model can be more agnostic to 'unknown' changes, but also provide practical approaches you can apply to your model.