07-04, 11:30–11:50 (Africa/Cairo), Hall 2 (Ground Floor)
The talk starts by explaining what Spark is, what problems it solves, and why you might want to use it. I'll then describe common anti-patterns, especially in data engineering and data science code, and what you should probably do instead.
PySpark, Spark's Python interface, is a potent data processing tool with the potential for very high performance. This talk covers PySpark's strong points and shows how common anti-patterns hurt the performance of PySpark applications, forcing you to spend more money while losing many of Spark's benefits. But there is a better way: the native PySpark tools and patterns that I'll present.
Hebrew
Target audience: R&D
Other (target audience): Data scientists, data engineers, and big-data practitioners
Software developer. Open source aficionado. Cares about software craftsmanship.
Trying to make a difference.