PyCon Israel 2021
Beyond basic algorithmic considerations, you would be surprised how easy it is to get a more than 100x increase in efficiency with less than 30 minutes of work, without even improving the time complexity.
Developers use the terms "mocks" and "mocking" when referring to several different testing practices.
The talk will standardize the terminology of Mocks, Stubs, and Fakes: their capabilities, the differences between them, and when to use each one.
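As a rough illustration of the three terms, here is a minimal sketch that follows one common convention (a stub returns canned answers, a fake is a simplified working implementation, a mock records calls for later verification). The talk may draw the lines differently, and `register`, `StubClock`, and `FakeUserStore` are hypothetical names invented for this example.

```python
from unittest.mock import Mock


class StubClock:
    """Stub: returns a canned answer and verifies nothing."""

    def now(self):
        return 1_000


class FakeUserStore:
    """Fake: a working but simplified implementation (an in-memory dict)."""

    def __init__(self):
        self._users = {}

    def save(self, user_id, name):
        self._users[user_id] = name

    def load(self, user_id):
        return self._users[user_id]


def register(user_id, name, store, clock, notifier):
    """Hypothetical code under test."""
    store.save(user_id, name)
    notifier.send(f"{name} registered at {clock.now()}")
    return store.load(user_id)


store = FakeUserStore()
notifier = Mock()  # Mock: records calls so the test can verify interactions
result = register(1, "dana", store, StubClock(), notifier)

# Interaction check, which only the mock supports.
notifier.send.assert_called_once_with("dana registered at 1000")
```

The stub and the fake are checked through the state they produce (`result`), while the mock is checked through the calls it received.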
In the world of malware detection, we need to keep innovating all the time to catch the latest APTs. Let's see how we can do it with recent developments in graph analysis using neural networks.
It’s good for feature reuse in machine learning, thereby increasing data science accuracy, velocity, and visibility.
Software and algorithm teams have different needs; still, Python can become a common language and satisfy the needs of both teams.
We will see how Python as a common language boosts our development process.
While most of our online lives revolve around short texts, there's very little information on how to apply NLP techniques to such texts. In this talk, I'll share the lessons we learned and the methodology we developed when dealing with short texts.
Unlike other commonly used frontend frameworks, JupyterLab does not have state management. This is needed to create multi-page applications with connected forms and shared data. We solved this by developing a custom solution, which we will present.
This talk will review some of the most common pitfalls that can cause otherwise perfectly good Pandas code to become too slow for time-sensitive applications, and walk through a set of tips and tricks to avoid them.
Modern distributed software doesn't stop at your VPC. Edge-deployed software needs realtime communications, updates, and state sync. It needs RPC and PubSub over the web. Let's make it open-source.
At Bluevine we use Airflow to drive all our "offline" processing. In this talk, I'll present the challenges and opportunities we encountered in transitioning from servers running Python scripts with cron to a full-blown Airflow setup.
In online advertising, we run a lot of online tests to determine which approach boosts our engagement the most. We talk about different ways of online testing through the perspective of a new feature we developed that is based on continuous testing.
At K we had a simple task: build a chatbot. With a lot of logical paths. And loops. And external interrupts. In this talk I will present our fairly exotic solution, which looks like a resumable function that is persisted across user requests.
I plan to discuss three archetypal war stories about fitting in memory. In each of them, I'll describe both the technical challenge and the human biases that needed to be overcome to arrive at sound solutions.
FastAPI is a modern, high-performance, batteries-included Python web framework that's perfect for building RESTful APIs. It can handle both synchronous and asynchronous requests.
In this talk, you will learn how we at Diagnostic Robotics create insights from claims data, a form of large-scale administrative data that provides a great opportunity for AI in healthcare. You will understand how we use medical code embeddings.
Loading Python code from a remote location at runtime opens a new world of opportunities (and challenges).
In Python, we normally don't worry about memory usage. But that doesn't mean memory leaks are impossible! In this talk, I'll introduce "weak references" -- how they work, when you would use them, and tricks to get the most out of them.
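To make the idea concrete, here is a minimal sketch using the standard-library `weakref` module. The `Image` class and the cache key are hypothetical, and the behavior shown after `del` relies on CPython's reference counting; other interpreters may only reclaim the object after a garbage-collection pass.

```python
import gc
import weakref


class Image:
    """Hypothetical large object; instances of plain classes support weakrefs."""

    def __init__(self, name):
        self.name = name


# A cache that does not keep its values alive on its own.
cache = weakref.WeakValueDictionary()

img = Image("logo")
cache["logo"] = img
ref = weakref.ref(img)

alive_before = ref() is img  # the weak reference resolves while img is alive

del img        # drop the only strong reference
gc.collect()   # redundant on CPython, but helps on other implementations

alive_after = ref() is not None   # the weak reference is now dead
cached_after = "logo" in cache    # the cache entry disappeared with the object
```

Because the cache holds only weak references, it cannot by itself cause the kind of leak the abstract warns about.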
Genomic sequencing and processing produce many terabytes of data. We'll present how a single-cell processing pipeline requires strong/eventual consistency trade-offs that differ from those of traditional big-data systems.
Google Earth Engine (GEE) is a cloud computing platform with a multi-petabyte catalog of satellite imagery and geospatial datasets. It enables analyzing and visualizing changes on the Earth’s surface using a Python API.
In this talk we'll use a real-life use case to learn how extending GDB with Python can help us to solve bugs, all while digging deep into the internals of Python locks and how they're implemented.
Most programmers consider Python a scripting or server-side language, totally unsuitable for UI. At Imubit we decided to use JupyterLab in order to combine Python's powerful server-side abilities with a beautiful UI.
In optimization problems speed is important, but unfortunately Python isn't optimized for speed. In this talk I'll show how to use Python and optimize bottleneck functions to be as fast as possible using different libraries and methods.
In this talk, I will briefly cover the theory of property-based testing and then jump into use cases and live examples to demonstrate the Hypothesis library and how we used it to generate random examples of plausible edge cases of our AI model.
We introduce a very useful tool called "vmn" for automatically incrementing your application's version number in a language- and architecture-agnostic way. You will learn how to use vmn for your application and how to integrate it into existing CI/CD procedures.
Handling high cardinality with big data can be challenging. We improved our pipeline speed and stability by understanding which data matters more and creating a smart “Cardinality Protector” to reduce cardinality with minimal effect on the data.
The time has come for almost every Python developer to build new applications following the serverless paradigm. This is a 300-level talk describing the most important principles of serverless application architecture.
The tutorial will introduce two interactive plotting libraries, HoloViews and Panel, and show how they can be used to create static HTML files with interactive graphics.
This talk introduces PyTorch Lightning, outlines its core design philosophy, and provides inline examples of how this philosophy enables more reproducible and production-capable deep learning code.
An inside look at some of the tools inside Sanic to help build a background task manager.
Python's warnings are exceptions — but they're also distinct from exceptions, and are both used and trapped differently. In this talk, I'll introduce warnings, how to raise, trap, and redirect them, and show you best practices for their use.
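A minimal sketch of raising, trapping, and escalating a warning with the standard-library `warnings` module. The function `old_api` is a hypothetical example, not something taken from the talk.

```python
import warnings


def old_api():
    """Hypothetical function that is being phased out."""
    # stacklevel=2 attributes the warning to the caller, not to this line.
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


# Trap: record warnings in a list instead of printing them to stderr.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = old_api()  # the call still completes and returns normally

# Escalate: within this block, the same warning is raised as an exception.
raised = False
with warnings.catch_warnings():
    warnings.simplefilter("error", DeprecationWarning)
    try:
        old_api()
    except DeprecationWarning:
        raised = True
```

The same call either continues or fails depending on the active filter, which is the main practical difference from a regular `raise`.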
When we first developed our system, we picked Celery due to its wide community adoption. When we started scaling our systems, we realized Celery was holding us back in many ways. We decided to replace Celery with our own technology.
I’ll discuss an interpretation framework that uses the features’ distribution to understand the direction of each feature’s impact. The concept is derived from ideas formulated in Pearl’s analysis of causality in his book “The Book of Why”.
This session will focus on one of the hottest topics of the past two years in the data science ecosystem - Automated Exploratory Data Analysis.
When is it the right time to implement security when building an app? In this talk, you will learn how to build a secure, cloud-hosted Python application from scratch, the major attack vectors, and the tools you need to remediate the main risks.
Join this session to hear about my journey with tree-based classifiers, while tackling the problem of classifying songs into different genres. Learn how XGBoost works and what makes it so popular.
An introduction to Geographic Data, some of its basic concepts and common Python tools for working with it.
Most data scientists are focused on predictive (aka supervised) projects, yet the real growth is usually in the estimation of action effects and optimizations of action policies. To this end, I will present causal inference and related packages.
Many developers struggle with the question, “Should I use Python annotations?” I will demonstrate how proper usage of Python annotations guides developers to refine the structure of their code.
Topic Modeling’s objective is to understand and extract the hidden topics from large volumes of text. Using a technique based on Sentence-BERT, we were able to extract meaningful topics, and we will present some evaluation approaches.
Programming requires a logical mindset, which can be used to introduce strategy into your daily life. Join me, as we review Pythonic best practices, constructs and concepts and see how to take advantage of them both at and away from the keyboard.
I'll present a tiered approach that allows testing microservices quickly and thoroughly. The tests use stateful mocks of other services, and thus allow concise tests as well as simulating outages, subtle timing problems and large datasets.
We have all heard about huge transformers that cost millions of dollars to train and achieve amazing results. But is there still room for the little guy, with a single GPU and a small budget, to innovate in NLP?
Well, have you heard about grounding?
Label distribution shift is a significant 'unknown' our models might encounter when facing the real world once they are deployed. In this talk I will provide practical approaches to help our models be more robust to such 'unknowns'.
Text analysis in real life can often yield unsatisfactory results due to typos, alternate phrasing, abbreviations and more. In this talk, we'll cover practical and efficient string comparison methods, as well as tackle some commonly faced issues.
Implementing a Flask realtime web application for production isn’t as easy as it seems.
Learn how to use Redis Pub/Sub, Nginx, uWSGI, signaling, Unix sockets, mule processes, Socket.IO and more to create a robust realtime app.
CI/CD is critical for rapid software development, requiring advanced monitoring and logging infrastructure. We will present our pytest integration with Elasticsearch, which led to a significant reduction in debugging time and to infra/product health improvements.
Neural networks don’t have to be black boxes. If you use creative designs and match the architecture to your specific needs, you can create a network as interpretable as linear regression, but without its linear constraints.
In recent years, Mypy has gained widespread adoption, and as it continues to improve and evolve, more and more useful features are being added.
In this talk I'll present some gems in the type system you can use to make your code better and safer!
Test sets are often designed to have a specific composition of cases, with constraints applied to each sub-population. Treating test-set curation as an optimization problem could save precious time and transition us towards a "data as code" paradigm.
AutoML is a Python-driven tool we built in the Outbrain Recommendations group. In this talk we'll share the motivation for creating this tool, describe the general architecture, and do a short live demo.
This talk might give you what you need to secure your Python application from OWASP Top 10 vulnerabilities. We’ll look at examples, tools and quick tips for a more robust code base.
Python can do so much, including using Python to change Python's behavior. In this talk, we will see how we can hook any function in order to create an online “helper” in the style of Clippy for the old Office software. This can be useful for refe
This talk will describe the monorepo codebase architecture, explain why you might want to use it for your Python code, and what kind of tooling you need to work effectively in it.
Immunai has built one of the largest centralized immune single-cell data assets in the world and is using AI with it to expand the boundary of our understanding of core immune biology and how it translates to the clinical setting.