05-03, 14:00–14:25 (Asia/Jerusalem), PyData Track 1
Label distribution shift is a significant 'unknown' that our models may encounter in the real world once they are deployed. In this talk I will present practical approaches for making our models more robust to such 'unknowns'.
If someone had told me a year ago that we'd be wearing masks when walking outside, and that my daughter's longest time off kindergarten wouldn't be the two weeks in August - I'd never have believed it! But that's life - things change rapidly, and previously made assumptions might not remain valid.
Many of us Data Scientists find ourselves working hard to train a model and deploy it to a live environment, only to realize that the real world does not behave as we expected. Our model collides with a reality that is very different from the one it is familiar with. The root cause of this gap is unexpected changes that affect our domain's population.
In this talk I will focus on a specific type of 'unknown' change - a shift in the label distribution. I will not only present how your model can be more agnostic to 'unknown' changes, but also provide practical approaches you can apply to your model.
English
Target audience – Data Scientists
Nofar is a Principal Data Scientist at PayPal. She develops fraud detection models that are being used in production to make real-time decisions that affect millions of PayPal users daily. She leverages PayPal’s massive amounts of data, in the highly imbalanced fraud domain, to learn user behaviour and make sure PayPal is always ahead of its fraudsters. Nofar also co-hosts PayPal's internal and global Data Science Podcast.
Nofar holds an M.Sc. in Information Systems Engineering with a focus on Machine Learning from Ben-Gurion University, where she researched the field of Proactive Recommender Systems.