PyCon Israel 2023

🇮🇱 How to kill your PySpark performance with these simple tricks
07-04, 11:30–11:50 (Africa/Cairo), Hall 2 (Ground Floor)

The talk starts by explaining what Spark is, what problems it solves, and why you might want to use it. Then I'll describe common anti-patterns, especially in data engineering and data science code, and what you should probably do instead.


PySpark, Spark's Python interface, is a potent data processing tool and potentially a very high-performing one. This talk is about PySpark's strong points and how common anti-patterns hurt the performance of PySpark applications, forcing you to spend more money and lose many of Spark's benefits. But there is a better way: using native PySpark tools and patterns, which I'll present.


Session language –

Hebrew

Target audience –

R&D

Other (target audience) –

Data science, data engineers, and big-data practitioners

Software developer. Open source aficionado. Cares about software craftsmanship.

Trying to make a difference.