Pycon Israel 2021

What’s Everyone Talking About? Discovering Topics with Sentence-BERT
2021-05-03, 11:30–11:55, PyData Track 2

Topic Modeling’s objective is to understand and extract the hidden topics from large volumes of text. Using a technique based on Sentence-BERT, we were able to perform the extraction of meaningful topics, and present some evaluation approaches.


Topic modeling is an information retrieval technique for discovering meaningful and interpretable topics in a collection of documents. It allows us to learn something about a set of documents that is too big to read.

In this talk we will cover how we leverage Sentence-BERT using the NLP Python framework sentence-transformers. It provides an easy method for extracting high quality sentence embeddings in a computationally efficient manner, which lays the basis for our topic modeling algorithm.

We will also be addressing the inherent difficulty of evaluating topic models by introducing measuring metrics and visualizations that aid the process of analyzing complex results.


Session language – English Target audience – Data Scientists, R&D