“Novel approach of collecting and analyzing data from PyTest with Elasticsearch” Noy Nakash, Avi Naftalis · Talk (regular) (25 minutes)
CI/CD is critical for rapid software development and requires advanced monitoring and logging infrastructure. We will present our PyTest integration with Elasticsearch, which led to a significant reduction in debugging time and to improvements in infrastructure and product health.
“Python code in an object store, Go fast to production and don't break things” Or Ben-Zeev · Talk (regular) (25 minutes)
Loading Python code from a remote location at runtime opens a new world of opportunities (and challenges).
“Cleaner SW architecture using python annotations” Yehuda Levian · Talk (regular) (25 minutes)
Many developers struggle with the question, “should or shouldn’t I use Python annotations?” I will demonstrate how proper use of Python annotations guides developers to refine the structure of their code.
“Methods for Effective Online Testing in Python” Luka Androjna · Talk (regular) (25 minutes)
In online advertising, we run a lot of online tests to determine which approach boosts our engagement the most. We talk about different ways of online testing through the perspective of a new feature we developed that is based on continuous testing.
“Resumable persistent functions in Python: how to build a chatbot the fun way” Alon Gal · Talk (long - limited number of slots) (45 minutes)
In K we had a simple task: build a chatbot. With a lot of logical paths. And loops. And external interrupts. In this talk I will present our fairly exotic solution, that looks like a resumable function, which is persisted across user requests.
“Genomic data - cost-effective scaling in the cloud” Tal Franji · Talk (regular) (25 minutes)
Genomic sequencing and processing produce many terabytes of data. We'll present how a single-cell processing pipeline requires strong/eventual consistency trade-offs that differ from those of traditional big-data systems.
“Be a Pythonista: Coding and Life Lessons Learned from Python” Hodaya Stern · Talk (regular) (25 minutes)
Programming requires a logical mindset, which can be used to introduce strategy into your daily life. Join me, as we review pythonic best practices, constructs and concepts and see how to take advantage of them both at and away from the keyboard.
“Python RPC and PubSub over Websockets” Or Weis · Talk (regular) (25 minutes)
Modern distributed software doesn't stop at your VPC. Edge-deployed software needs realtime communications, updates, and state sync. It needs RPC and PubSub over the web. Let's make it open-source.
“Python monorepos: what, why and how” Benjy Weinberger · Talk (regular) (25 minutes)
This talk will describe the monorepo codebase architecture, explain why you might want to use it for your Python code, and what kind of tooling you need to work effectively in it.
“Model-Agnostic Interpretation - Beyond Shap and Lime” Nathalie Hauser · Talk (regular) (25 minutes)
I’ll discuss an interpretation framework that uses the features’ distribution to understand the direction of each feature’s impact. The concept is derived from ideas formulated in Pearl’s analysis of causality in his book “The Book of Why”.
“Tutorial: Using Python HoloViz Technologies to Create Interactive Presentations” Jacob Barhak, James A. Bednar · Talk (long - limited number of slots) (45 minutes)
The tutorial will introduce two interactive plotting libraries, HoloViews and Panel, and show how they can be used to create static HTML files with interactive graphics.
“Fun With Trees! Get to the Root of Song Classification” Yama Anin Aminof · Talk (regular) (25 minutes)
Join this session to hear about my journey with tree-based classifiers, while tackling the problem of classifying songs into different genres. Learn how XGBoost works and what makes it so popular.
“OWASP Top 10 in 20: Application Security for the average Pythonista” Ronnie Sheer · Talk (regular) (25 minutes)
This talk might give you what you need to secure your Python application against the OWASP Top 10 vulnerabilities. We’ll look at examples, tools and quick tips for a more robust code base.
“Harnessing Data to Improve Healthcare” Noa Lubin · Talk (regular) (25 minutes)
In this talk, you will learn how at Diagnostic Robotics we create insights from claims data, a form of administrative data at large scale, which provides a great opportunity for AI in healthcare. You will understand how we use medical code embeddings
“Practical Advice for Using Mypy” Haki Benita · Talk (regular) (25 minutes)
In recent years Mypy has gained widespread adoption, and as it continues to improve and evolve, more and more useful features are being added.
In this talk I'll present some gems in the type system you can use to make your code better and safer!
“Testing stochastic AI models with hypothesis” Marina Shvartz · Talk (regular) (25 minutes)
In this talk, I will briefly cover the theory of property-based testing and then jump into use cases and live examples to demonstrate the Hypothesis library and how we used it to generate random examples of plausible edge cases for our AI model.
“String Comparison In Real Life - Challenges and Various Ways to Resolve Them” Naomi Kriger · Talk (regular) (25 minutes)
Text analysis in real life can often yield unsatisfactory results due to typos, alternate phrasing, abbreviations and more. In this talk, we'll cover practical and efficient string comparison methods, as well as tackle some commonly faced issues.
“When is an exception not an exception? Using warnings in Python” Reuven Lerner · Talk (regular) (25 minutes)
Python's warnings are exceptions — but they're also distinct from exceptions, and are both used and trapped differently. In this talk, I'll introduce warnings, how to raise, trap, and redirect them, and show you best practices for their use.
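As a rough illustration of the raise, trap and redirect ideas mentioned in this abstract, here is a minimal sketch using only the standard `warnings` module. The function names (`old_api`, `call_and_collect`, `call_strictly`) are hypothetical and are not taken from the talk.

```python
import warnings


def old_api():
    """Hypothetical function that warns callers it is deprecated."""
    # stacklevel=2 attributes the warning to the caller, not to old_api itself.
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


def call_and_collect():
    """Call old_api() and return its result plus the warnings it issued."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure nothing is suppressed
        result = old_api()
    return result, caught


def call_strictly():
    """Turn warnings into exceptions so they can be trapped with try/except."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        try:
            old_api()
        except DeprecationWarning as exc:
            return f"trapped: {exc}"
    return "no warning raised"
```

With the default filters the function simply returns its value after issuing the warning; under the `"error"` filter the same call raises instead, which is the sense in which a warning both is and is not an exception.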
“FastAPI: The most modern Python3 web framework” Chai Tadmor · Talk (regular) (25 minutes)
FastAPI is a modern, high-performance, batteries-included Python web framework that's perfect for building RESTful APIs. It can handle both synchronous and asynchronous requests.
“Building a Secure Python Cloud Application from scratch” David Melamed · Talk (regular) (25 minutes)
When is the right time to implement security when building an app? In this talk, you will learn how to build a secure, cloud-hosted Python application from scratch, what the major attack vectors are, and which tools you need to remediate the main risks.
“Serverless Python” Nikolay Grishchenko · Talk (regular) (25 minutes)
The time has come for almost every Python developer to build new applications following the serverless paradigm. This 300-level talk describes the most important principles of serverless application architecture.
“Practical Optimisation for Pandas” Eyal Trabelsi · Talk (regular) (25 minutes)
This talk will review some of the most common pitfalls that can make otherwise perfectly good Pandas code too slow for time-sensitive applications, and walk through a set of tips and tricks to avoid them.
“Natural language grounding - the next frontier” Uri Goren · Talk (regular) (25 minutes)
We have all heard about huge transformers that cost millions of dollars to train and achieve amazing results. But is there still room for the little guy, with a single GPU and a small budget, to innovate in NLP?
Well, have you heard about grounding?
“Automatic Curation of Test sets” Jonathan Laserson · Talk (regular) (25 minutes)
Test sets are often designed to have a specific composition of cases, with constraints applied to each sub-population. Treating test-set curation as an optimization problem could save precious time and transition us towards a "data as code" paradigm.
“Set your EDA on Autopilot” Nir Barazida · Talk (regular) (25 minutes)
This session will focus on one of the hottest topics of the past two years in the data science ecosystem - Automated Exploratory Data Analysis.
“Lock and roll - Advanced locks debugging with GDB” Roee Drucker · Talk (regular) (25 minutes)
In this talk we'll use a real-life use case to learn how extending GDB with Python can help us to solve bugs, all while digging deep into the internals of Python locks and how they're implemented.
“Versioning 1.0.1” Pavel Rogovoy, Ron Shilo · Talk (regular) (25 minutes)
We introduce a very useful tool called "vmn" for automatically incrementing your application's version number in a language- and architecture-agnostic way. You will learn how to use vmn for your application and how to integrate it into existing CI/CD procedures.
“Avoiding memory leaks with "weakref"” Reuven Lerner · Talk (regular) (25 minutes)
In Python, we normally don't worry about memory usage. But that doesn't mean memory leaks are impossible! In this talk, I'll introduce "weak references" -- how they work, when you would use them, and tricks to get the most out of them.
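To make the idea of a weak reference concrete, here is a minimal sketch using the standard `weakref` module. The `Session` class and the cache layout are hypothetical illustrations, not material from the talk.

```python
import gc
import weakref


class Session:
    """Hypothetical object we want to cache without keeping it alive."""

    def __init__(self, name):
        self.name = name


def demo():
    """Show that a weakly referenced object disappears once unreferenced."""
    cache = weakref.WeakValueDictionary()  # values are held weakly
    session = Session("alice")
    cache["alice"] = session
    ref = weakref.ref(session)             # a single weak reference

    alive_before = ref() is session and "alice" in cache

    del session    # drop the only strong reference
    gc.collect()   # not needed on CPython, but harmless and portable

    alive_after = ref() is not None or "alice" in cache
    return alive_before, alive_after
```

A plain `dict` cache would have kept the `Session` alive indefinitely, which is one common way long-running programs leak memory.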
“Geographic Data - an Introductory Tale” Adam Kariv · Talk (regular) (25 minutes)
An introduction to Geographic Data, some of its basic concepts and common Python tools for working with it.
“A Feature Store - what is it good for?” Orr Shilon · Talk (regular) (25 minutes)
It’s good for feature reuse in machine learning, thereby increasing data science accuracy, velocity, and visibility.
“Python bottleneck optimization - progression from lists to cupy arrays” Yair Beer · Talk (regular) (25 minutes)
In optimization problems speed is important, but unfortunately Python isn't optimized for speed. In this talk I'll show how to use Python and optimize bottleneck functions to be as fast as possible using different libraries and methods.
“"Clippy" for Python - Let's build a real-time code companion by hooking over any function.” Dean Langsam · Talk (long - limited number of slots) (45 minutes)
Python can do so much, including changing the behavior of Python itself. In this talk, we will see how we can hook over any function in order to create an online “helper” in the style of Clippy from the old Office software. This can be useful for refe
“Malware Representation Using Graphs” Gal Braun · Talk (regular) (25 minutes)
In the world of malware detection, we need to keep innovating all the time to catch the latest APTs. Let's see how we can do it with recent developments in graph analysis using neural networks.
“Beyond Time Complexity – NumPy, Pandas and vanilla python optimization tricks you must try” Oren Matar · Talk (regular) (25 minutes)
Beyond basic algorithmic considerations when writing our code, you would be surprised how easy it is to get more than 100X increase in efficiency with less than 30 minutes of work without even improving the time complexity.
“How to Test Microservices” Lior Segev · Talk (regular) (25 minutes)
I'll present a tiered approach that allows testing microservices quickly and thoroughly. The tests use stateful mocks of other services, and thus allow concise tests as well as simulating outages, subtle timing problems and large datasets.
“Python’s Frontend - Not what you would think” Bat-El Ziony Sabati · Talk (regular) (25 minutes)
Most programmers consider Python a scripting or server-side language, totally unsuitable for UI. At Imubit we decided to use JupyterLab to combine Python's powerful server-side abilities with a beautiful UI.
“Mapping and analysis of geospatial big data using geemap and Google Earth Engine” Yaron Michl · Talk (regular) (25 minutes)
Google Earth Engine (GEE) is a cloud computing platform with a multi-petabyte catalog of satellite imagery and geospatial datasets. It enables analysis and visualization of changes on the Earth’s surface using a Python API.
“What’s Everyone Talking About? Discovering Topics with Sentence-BERT” Stav Shemesh · Talk (regular) (25 minutes)
Topic modeling aims to understand and extract the hidden topics from large volumes of text. Using a technique based on Sentence-BERT, we were able to extract meaningful topics, and we will present some evaluation approaches.
“Go Beyond Mock: on Mocks, Stubs and Fakes” Peter Kogan · Talk (regular) (25 minutes)
Developers use the terms "mocks" and "mocking" when referring to several different testing practices.
The talk will standardize the terminology of Mocks, Stubs, and Fakes: their capabilities, the differences between them, and when to use each one.
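The distinction can be sketched in a few lines. The definitions used here follow one common convention and are an assumption, since the talk may draw the lines differently: a stub returns canned answers, a fake is a simplified working implementation, and a mock records calls so they can be verified. All class and function names below are hypothetical.

```python
from unittest.mock import Mock


class StubClock:
    """Stub: returns a canned answer and verifies nothing."""

    def now(self):
        return 1000


class FakeStore:
    """Fake: a working but simplified implementation (an in-memory dict)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def record_event(store, clock, notifier, name):
    """Hypothetical code under test: store a timestamp and send a notice."""
    store.put(name, clock.now())
    notifier.send(f"{name} recorded")


def run_example():
    store = FakeStore()
    notifier = Mock()  # Mock: records calls so the interaction can be checked
    record_event(store, StubClock(), notifier, "login")
    notifier.send.assert_called_once_with("login recorded")
    return store.get("login")
```

Here the stub and the fake support state-based checks on the result, while the mock is the only double whose interactions are asserted.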
“Leaving Celery in the Dust: How to truly scale in production” Gal Ben David · Talk (regular) (25 minutes)
When we first developed our system, we picked Celery due to its wide community adoption. When we started scaling our systems, we realized Celery was holding us back in many different ways. We decided to replace Celery with our own technology.
“Opening the black box – an interpretable neural network architecture” Oren Matar · Talk (regular) (25 minutes)
Neural networks don’t have to be black boxes. If you use creative designs and match the architecture to your specific needs, you can create a network as interpretable as linear regression, but without its linear constraints.
“Liberate your API: Building a task manager inside Sanic” Adam Hopkins · Talk (regular) (25 minutes)
An inside look at some of the tools inside Sanic to help build a background task manager.
“War stories: when data doesn't (easily) fit in memory” Uri Yanover · Talk (long - limited number of slots) (45 minutes)
I plan to discuss three archetypical war-stories about fitting in memory. In each of them, I'll describe both the technical challenge and the human biases that needed to be overcome to arrive at sound solutions.
“Deep Learning, Minus the Boilerplate with PyTorch Lightning” Ari Bornstein · Talk (regular) (25 minutes)
This talk introduces PyTorch Lightning, outlines its core design philosophy, and provides inline examples of how this philosophy enables more reproducible and production-capable deep learning code.
“Hello Airflow, farewell Cron jobs” Noam Elfanbaum · Talk (regular) (25 minutes)
At Bluevine we use Airflow to drive all our "offline" processing. In this talk, I'll present the challenges and opportunities we had in transitioning from servers running Python scripts with cron to a full-blown Airflow setup.
“Python - the golden bridge between algorithm and software development” Meir Vengrover · Talk (regular) (25 minutes)
Software and algorithm teams have different needs. Still, Python can become a common language and satisfy the needs of both teams.
We will see how Python as a common language boosts our development process.
“Prepare for the Unknown - Adjust Your Model to Label Distribution Shifts” Nofar Betzalel · Talk (regular) (25 minutes)
Label distribution shift is a significant 'unknown' our models might encounter in the real world once they are deployed. In this talk I will provide practical approaches to help our models be more robust to such 'unknowns'.
“Short Text in the Wild” Gal Hochma · Talk (regular) (25 minutes)
While most of our online lives revolve around short texts, there's very little information on how to apply NLP techniques to such texts. In this talk, I'll share the lessons we learned and the methodology we developed when dealing with short texts.
“Causality in Python” Hanan Shteingart · Talk (long - limited number of slots) (45 minutes)
Most data scientists are focused on predictive (aka supervised) projects, yet the real growth is usually in the estimation of action effects and optimizations of action policies. To this end, I will present causal inference and related packages.
“Enabling Super Fast DS Research using AutoML” Assaf Klein, Hila Weisman-Zohar · Talk (regular) (25 minutes)
AutoML is a Python-driven tool we built in the Outbrain Recommendations group. In this talk we'll share our motivation for creating this tool, describe its general architecture and give a short live demo.
“Application State Management” Rachel Chocron · Talk (regular) (25 minutes)
JupyterLab does not have state management as other commonly used frontend frameworks do. This is needed to create multi-page applications with connected forms and shared data. We solved this by developing a custom solution, which we will present.
“WebSockets and Flask for the real world” Yael Green · Talk (regular) (25 minutes)
Implementing a Flask realtime web application for production isn’t as easy as it seems.
Learn how to use Redis Pub/Sub, Nginx, uWSGI, signaling, Unix sockets, mule processes, socket.io and more to create a robust realtime app.
“Cutting the Right Corners: Handling High Cardinality by Understanding Your Data” Asaf Sarid · Talk (regular) (25 minutes)
Handling high cardinality with big data can be challenging. We improved our pipeline speed and stability by understanding which data matters more and creating a smart “Cardinality Protector” to reduce cardinality with minimal effect on the data.
“Reprogramming immunity with AI and single-cell multiomics” Drausin Wulsin · Talk (regular) (25 minutes)
Immunai has built one of the largest centralized immune single-cell data assets in the world and is using AI with it to expand the boundary of our understanding of core immune biology and how it translates to the clinical setting.