PyCon Israel 2021

What’s Everyone Talking About? Discovering Topics with Sentence-BERT
05-03, 11:30–11:55 (Asia/Jerusalem), PyData Track 2

Topic modeling aims to understand and extract the hidden topics in large volumes of text. Using a technique based on Sentence-BERT, we were able to extract meaningful topics, and we present some approaches for evaluating them.


Topic modeling is an information retrieval technique for discovering meaningful and interpretable topics in a collection of documents. It allows us to learn something about a set of documents that is too big to read.

In this talk we will cover how we leverage Sentence-BERT through sentence-transformers, a Python NLP framework. It provides an easy, computationally efficient way to extract high-quality sentence embeddings, which form the basis of our topic modeling algorithm.
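As a rough illustration of the embedding step, here is a minimal sketch. It assumes the sentence-transformers package is installed and uses the pretrained model name "all-MiniLM-L6-v2" as a placeholder, since the abstract does not say which model the talk uses. The greedy grouping function is a deliberately simple, hypothetical stand-in for a real clustering step, not the algorithm presented in the talk.

```python
import math
from typing import List, Sequence


def embed_sentences(sentences: Sequence[str],
                    model_name: str = "all-MiniLM-L6-v2") -> List[List[float]]:
    """Encode sentences into dense vectors with sentence-transformers.

    The model name is an assumption; the model is downloaded on first use.
    """
    # Imported lazily so the rest of the sketch works without the package.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return model.encode(list(sentences)).tolist()


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def group_by_similarity(vectors: Sequence[Sequence[float]],
                        threshold: float = 0.6) -> List[int]:
    """Greedy grouping (hypothetical stand-in for a real clustering step).

    Each vector joins the first group whose seed vector is at least
    `threshold` similar to it; otherwise it starts a new group.
    """
    seeds: List[Sequence[float]] = []
    labels: List[int] = []
    for vec in vectors:
        for group_id, seed in enumerate(seeds):
            if cosine(vec, seed) >= threshold:
                labels.append(group_id)
                break
        else:
            # No existing group is similar enough: this vector seeds a new one.
            seeds.append(vec)
            labels.append(len(seeds) - 1)
    return labels
```

With the package installed, `group_by_similarity(embed_sentences(docs))` would return one group label per sentence in `docs`. In practice a proper clustering algorithm would replace the greedy step, and the threshold of 0.6 is arbitrary.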

We will also address the inherent difficulty of evaluating topic models by introducing evaluation metrics and visualizations that make complex results easier to analyze.
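The abstract does not name the metrics covered in the talk. As one possible illustration, the sketch below computes two measures often used for topic models: NPMI coherence (do a topic's top words co-occur in the same documents?) and topic diversity (how many top words are unique across topics?). The function names and the tokenized-document input format are assumptions made for this example.

```python
import math
from itertools import combinations
from typing import List, Sequence


def npmi_coherence(topic_words: Sequence[str],
                   documents: Sequence[Sequence[str]]) -> float:
    """Average NPMI over all word pairs of one topic, in the range [-1, 1].

    Probabilities are estimated from document-level co-occurrence.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)
    if n_docs == 0:
        return 0.0

    def prob(*words: str) -> float:
        # Fraction of documents that contain all of the given words.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores: List[float] = []
    for a, b in combinations(topic_words, 2):
        p_a, p_b, p_ab = prob(a), prob(b), prob(a, b)
        if p_ab == 0.0:
            scores.append(-1.0)  # the pair never co-occurs
        elif p_ab == 1.0:
            scores.append(1.0)   # limit value: the pair is in every document
        else:
            pmi = math.log(p_ab / (p_a * p_b))
            scores.append(pmi / -math.log(p_ab))
    return sum(scores) / len(scores) if scores else 0.0


def topic_diversity(topics: Sequence[Sequence[str]]) -> float:
    """Share of unique words among the top words of all topics."""
    words = [w for topic in topics for w in topic]
    return len(set(words)) / len(words) if words else 0.0
```

Higher coherence suggests that a topic's words belong together, while low diversity suggests that several topics overlap. Neither number replaces reading the topics, which is where visualizations help.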


Session language

English

Target audience

Data Scientists, R&D

I hold a B.Sc. in computer science from the Open University and am currently working on my M.Sc. in computer science at IDC Herzliya. For the past year and a half I have been working at Amenity Analytics as a data scientist, building models to solve a wide range of NLP problems.