PyCon Israel 2022

🇺🇸 🤖 Transformer-based NLP Pipelines with SpaCy v3
06-29, 16:30–16:50 (Asia/Jerusalem), PyData

Transformer-based models have been producing superior results on a wide range of NLP tasks. In this talk, we’ll cover the new transformer-based NLP pipelines introduced in spaCy v3, and how to apply multi-task learning for improved efficiency and accuracy.


The transformer deep learning architecture is built around the self-attention mechanism and serves as a Swiss Army knife for many of the most common language tasks. Chaining several such tasks together is commonly referred to as a language processing pipeline, an approach popularized across the NLP industry by spaCy, a Python library for language understanding.
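As a minimal sketch of the pipeline idea (assuming spaCy and its small English model en_core_web_sm are installed; the sample sentence is illustrative), each component in a loaded pipeline handles a different task and runs in order:

    import spacy

    # Load a standard English pipeline; each named component handles one task
    nlp = spacy.load("en_core_web_sm")
    print(nlp.pipe_names)  # e.g. ['tok2vec', 'tagger', 'parser', ..., 'ner']

    # Calling the pipeline applies every component in order to the text
    doc = nlp("spaCy was created by Explosion in Berlin.")
    for token in doc:
        print(token.text, token.pos_, token.dep_)
    for ent in doc.ents:
        print(ent.text, ent.label_)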
In this talk, we’ll cover undoubtedly the most exciting feature of spaCy v3: the integration of transformer-based pipelines. We’ll learn how to quickly get started with pre-trained transformer models, how to create custom pipeline components, and how to perform multi-task learning by sharing a single transformer between multiple components. Altogether, we’ll explore how to build efficient NLP pipelines that reach state-of-the-art performance in real-world applications.
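As a sketch of the first two steps, a pre-trained transformer pipeline plus a custom component (assuming spacy[transformers] and the en_core_web_trf model are installed; the component name sentence_counter and its logic are purely illustrative, not from the talk):

    import spacy
    from spacy.language import Language
    from spacy.tokens import Doc

    # Register a custom document attribute and a simple stateless component
    # (the name "sentence_counter" and its logic are illustrative)
    Doc.set_extension("num_sentences", default=0)

    @Language.component("sentence_counter")
    def sentence_counter(doc):
        # Count the sentences predicted by the upstream components
        doc._.num_sentences = len(list(doc.sents))
        return doc

    # Load the pre-trained RoBERTa-based English pipeline, append our component
    nlp = spacy.load("en_core_web_trf")
    nlp.add_pipe("sentence_counter", last=True)

    doc = nlp("Google released BERT in 2018. spaCy v3 can build on it.")
    print(doc._.num_sentences)              # -> 2
    print(doc.ents)                         # entities from the transformer-backed NER
    print(doc._.trf_data.tensors[0].shape)  # raw transformer (wordpiece) output

Sharing a single transformer between components is then handled in spaCy’s training config: downstream components such as the tagger and NER connect to one shared transformer component through the spacy-transformers TransformerListener architecture, so a single forward pass (and, during training, a single backward pass) serves all listening components.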


Session language – English

Target audience – Data Scientists

I hold an M.Sc. in Computer Science from Reichman University (IDC Herzliya), with an NLP thesis focused on language modeling, and a B.Sc. in Computer Science from the Open University. I work at Amenity Analytics as an NLP research engineer, building models to solve a wide range of problems in NLP.