{"$schema": "https://c3voc.de/schedule/schema.json", "generator": {"name": "pretalx", "version": "2024.1.0"}, "schedule": {"url": "https://cfp.pycon.org.il/conference2021/schedule/", "version": "0.4", "base_url": "https://cfp.pycon.org.il", "conference": {"acronym": "conference2021", "title": "Pycon Israel 2021", "start": "2021-05-02", "end": "2021-05-03", "daysCount": 2, "timeslot_duration": "00:05", "time_zone_name": "Asia/Jerusalem", "colors": {"primary": "#3aa57c"}, "rooms": [{"name": "General Track 1", "guid": "6173b7fd-2f52-5a5c-954d-d71d3a07c092", "description": null, "capacity": null}, {"name": "General Track 2", "guid": "82e9c2d2-9168-5661-b9a9-c692407b1c85", "description": null, "capacity": null}, {"name": "PyData Track 1", "guid": "f674e2c9-95a7-5b96-9470-593c4d5465a5", "description": null, "capacity": null}, {"name": "PyData Track 2", "guid": "e6514410-d239-5e9d-856c-a0400f63fc2f", "description": null, "capacity": null}], "tracks": [{"name": "General", "color": "#0424FF"}, {"name": "PyData", "color": "#FE4C0A"}, {"name": "PySecurity", "color": "#63F80C"}], "days": [{"index": 1, "date": "2021-05-02", "day_start": "2021-05-02T04:00:00+03:00", "day_end": "2021-05-03T03:59:00+03:00", "rooms": {"General Track 1": [{"url": "https://cfp.pycon.org.il/conference2021/talk/DFMUFY/", "id": 423, "guid": "fb43093a-a8e5-562a-85c2-7e8c748f21f7", "date": "2021-05-02T10:00:00+03:00", "start": "10:00", "logo": null, "duration": "00:25", "room": "General Track 1", "slug": "conference2021-423-beyond-time-complexity-numpy-pandas-and-vanilla-python-optimization-tricks-you-must-try", "title": "Beyond Time Complexity \u2013 NumPy, Pandas and vanilla python optimization tricks you must try", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "Beyond basic algorithmic considerations when writing our code, you would be surprised how easy it is to get more than 100X increase in efficiency with less than 30 minutes of work without even 
improving the time complexity.", "description": "When operating on big arrays we often fall into old habits of code writing, be it using pandas, numpy or vanilla python. While these habits may optimize the speed at which we write code, they often fall short of the optimal code for run-time. Even saving milliseconds of run time per task can accumulate to staggering amounts. Sometimes despite having very similar syntax between functions and packages there is a huge difference in performance, since the internal workings of pandas, numpy and python varies, as each balances the overhead (or \u201cinit\u201d) and marginal cost differently.\r\nWe will explore common and run-time costly pitfalls when using pandas and numpy and we will see when it is more efficient to use vanilla python compared to these packages.\r\nI will introduce a profiling method and a timing method. Working with both together can help us detect the weakest points in our code, and quickly test different options for improving it. One of the main points of the talk is how to come up with many code variations and test them quickly to come up with the best solution.\r\nI will present many often-neglected functions from these packages or native, and experiment to see when each is more efficient. E.g. using pandas index vs dict and itemgetter; np, pd or py isin methods; apply vs map; concatenating and appending data to arrays; and many more.\r\nIn addition, we will learn of some useful and surprising efficiency tricks and data-structures like sparse matrix, numpy array to replace a dict, clever ways of using memorization and more.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "f4f5da81-b734-50bd-847e-c4a34aeff661", "id": 245, "code": "YUMT7A", "public_name": "Oren Matar", "avatar": null, "biography": "A senior data scientist with interest in Bayesian methods, novel NN architectures and run-time optimization tricks in python. 
Specializing in time series forecasting particularly in the field of supply chain forecasting.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/ZLCZ9V/", "id": 481, "guid": "d70d7975-6eae-5972-a9c6-cbfa3ccbf9f1", "date": "2021-05-02T10:30:00+03:00", "start": "10:30", "logo": null, "duration": "00:25", "room": "General Track 1", "slug": "conference2021-481-python-the-golden-bridge-between-algorithm-and-software-development", "title": "Python - the golden bridge between algorithm and software development", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "Software and algorithm teams have different needs, still, python can become a common language and satisfy the needs of both teams.\r\nWe will see how python as a common language boosts our development process.", "description": "Can algorithm and software teams talk the same language?\r\nFor many years in Applied Materials, the answer was \u2013 No! the teams have different needs...\r\nWhile algorithm team preferred to develop their algorithms in MATLAB which has rich arsenal of scientific tools, software team preferred to write their code in C++ in order to gain code efficiency.\r\nThe code conversion between the teams was long and exhausting process.\r\nIn this lecture I will describe our decision to have python as a common language for both algorithm and software. \r\nI will explain how python can fulfill the needs for both algorithm development and software standards of design and efficiency.\r\nI will describe our joined development process in order to achieve algorithm and software goals through the language.    
\r\nI will show you the boost this process gave us which will convince you to choose python as a common language in your company too!", "recording_license": "", "do_not_record": false, "persons": [{"guid": "24948b1c-5702-5da8-9844-5f0ffcc73221", "id": 370, "code": "KUZZZA", "public_name": "Meir Vengrover", "avatar": "https://cfp.pycon.org.il/media/Me.JPG", "biography": "Team leader of image processing software group in Applied Materials", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/VAGUJY/", "id": 508, "guid": "27be0e69-1f95-526e-a8b7-3d642a44558a", "date": "2021-05-02T11:00:00+03:00", "start": "11:00", "logo": null, "duration": "00:25", "room": "General Track 1", "slug": "conference2021-508-application-state-management", "title": "Application State Management", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "JupyterLab does not have state management as do other commonly used frontend frameworks. This is needed to create multi-page applications with connected forms and shared data. We solved this by developing a custom solution, which we will present.", "description": "We implemented a multi-page Python application over JupyterLab in order to utilize Jupyter\u2019s data visualization and UI capabilities. Our application requires the various pages to be aware of each other's data and get updates when it\u2019s changed. We could not find a simple package for state management in Python as other commonly used front-end frameworks have. Therefore, we implemented a simple state management package. 
\r\n\r\nOur state has a dictionary interface, which enables the application pages to:\r\n* insert keys with any type of data,\r\n* register actions to change these keys,\r\n* register to receive updates when a specific key is changed.\r\n\r\nWe will share our state management code and show live code examples of how to use it in Python applications.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "168f5fc6-2e9c-5b4f-9668-88b8b02faa56", "id": 377, "code": "K8XL8E", "public_name": "Rachel Chocron", "avatar": "https://cfp.pycon.org.il/media/rachel.jpeg", "biography": "Experienced Software Engineer. Skilled in Software Design, Python and Node JS.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/XQ3JJQ/", "id": 401, "guid": "062afd2b-c676-5618-a701-f442cecd6ece", "date": "2021-05-02T11:30:00+03:00", "start": "11:30", "logo": null, "duration": "00:45", "room": "General Track 1", "slug": "conference2021-401-resumable-persistent-functions-in-python-how-to-build-a-chatbot-the-fun-way", "title": "Resumable persistent functions in Python: how to build a chatbot the fun way", "subtitle": "", "track": "General", "type": "Talk (long - limited number of slots)", "language": "en", "abstract": "At K Health we had a simple task: build a chatbot. With a lot of logical paths. And loops. And external interrupts. In this talk I will present our fairly exotic solution, which looks like a resumable function that is persisted across user requests.", "description": "Source code for the examples is [available on GitHub](https://github.com/khealth/dialogs).\r\n\r\nSlides are available [here](https://docs.google.com/presentation/d/1B1dr9g_BgLVfR_CwUSNG7mhu6E4_hmlE_RJP8IPaCIE/edit?usp=sharing).\r\n\r\n---\r\n\r\nI will be presenting the \"dialogs framework\", which is a library we built in-house for managing long-running persistent functions. 
These functions come back to life when they get a message from the user, magically resuming where they left off. I will go over the design process, the Python implementation details, and the current gaps and challenges. \r\n\r\nThe aim of this talk is to inspire similar projects, show off our framework, and reach out for community inspiration on our open challenges.\r\n\r\nAs motivation, here is the kind of code we are writing:\r\n\r\n<pre><code>@dialog\r\ndef greet():\r\n    name = run(prompt(\"Hi, I'm a bot. What's your name?\"))\r\n    location = run(prompt(f\"{name} is a beautiful name! Where are you from?\"))\r\n    run(send(f\"{location}? No kidding! I grew up there!\"))</code></pre>\r\n\r\nAfter each call to `run`, the function terminates and the user request is answered with the next message to display. The execution state is persisted into our database, allowing the follow-up answer to be handled by a different server in the future.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "fb31a182-d367-59a8-b243-1c5b29143f58", "id": 318, "code": "JKNJFW", "public_name": "Alon Gal", "avatar": "https://cfp.pycon.org.il/media/GK9A4800.JPG", "biography": "Software Engineer at K Health, on the chatbot team.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/ZCB8NV/", "id": 511, "guid": "c1a61f7c-b33b-526b-a90d-07aad25a4f09", "date": "2021-05-02T13:30:00+03:00", "start": "13:30", "logo": null, "duration": "00:25", "room": "General Track 1", "slug": "conference2021-511-python-code-in-an-object-store-go-fast-to-production-and-don-t-break-things", "title": "Python code in an object store, Go fast to production and don't break things", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "Loading python code from a remote location during runtime opens new world of opportunities (and challenges).", "description": "As we all know, python is a 
very versatile language. You can do a lot with python, from \u2018scripting\u2019 to \u2018software development\u2019, it can be interpreted and also compiled as most of us know .pyc files and some of us know .pyd files.\r\n\r\nAnother thing that can be done using python is \u2018webImport\u2019 or \u2018http import\u2019, it means importing a python module from the web. Using code that comes from a remote source can open us a new domain of possibilities, and a totally new domain of problems, risks and challenges.\r\n\r\n \r\nIn JPMorgan we have the Athena project, it has a huge python code base, probably the largest in the world. The athena \u2018special\u2019 python interpreter imports code from an object store and enables us to do many things:\r\n\r\nPros:\r\n\u00b7         A very fast way to update many services out there, as you just need to update the \u201cpublic location\u201d\r\n\u00b7         You do not have to create \u2018large updates\u2019 and push all the changes coming from different teams to production, if one developer is done, he can just push his own code to production without waiting for the \u2018release time\u2019. 
Even though the code base is large, updates are very fast\r\n\r\n\u00b7         Code could be updated after the artifact/container is \u2018sealed\u2019\r\n\u00b7         The code base/program can be very large, regardless of the size of the artifact\r\n\u00b7         Fixes and pushes to production can be very fast\r\n\u00b7         New tests can check the code compatibility of both the upcoming and the previous releases (for instance, with Git, if you store the tests in the same repo as the code, it will be very hard for you to write a test case, show that it is failing on the released version, and prove that it passes on the fix you made)\r\n\r\nCons:\r\n\u00b7         Requires investment\r\n\u00b7         Loading modules can sometimes be slower than loading from disk\r\n\u00b7         Need to find a solution for code inconsistencies\r\n\u00b7         Code could be updated after the binaries/container is \u2018sealed\u2019 (this might change the behavior)\r\n\u00b7         Requires good testing in order to be stable\r\n\r\n\r\nIn my talk I'll discuss this and maybe more", "recording_license": "", "do_not_record": false, "persons": [{"guid": "962c2f00-0403-58dd-84e4-4c935f7b52c9", "id": 305, "code": "SNE8DE", "public_name": "Or Ben-Zeev", "avatar": "https://cfp.pycon.org.il/media/IMG-20180613-WA0033.jpg", "biography": "Studied computer science at IDC\r\nWorks at JP Morgan for 4 Years", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/GWDAHX/", "id": 505, "guid": "5c983312-1330-5867-bc70-a7910c69de79", "date": "2021-05-02T14:00:00+03:00", "start": "14:00", "logo": null, "duration": "00:25", "room": "General Track 1", "slug": "conference2021-505-avoiding-memory-leaks-with-weakref-", "title": "Avoiding memory leaks with \"weakref\"", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "In Python, we normally don't worry about memory usage. 
But that doesn't mean memory leaks are impossible! In this talk, I'll introduce \"weak references\" -- how they work, when you would use them, and tricks to get the most out of them.", "description": "One of the great things about Python is that it includes garbage collection. You don't have to allocate or free memory; just let the system take care of things on its own! In theory, that means you can never experience memory leaks. But in practice, that's not quite true: There are definitely ways in which you can accidentally hold onto object references, resulting in a memory leak.\r\n\r\nFortunately, Python provides us with \"weak references\" in the standard library's \"weakref\" module.  In this talk, I'll describe Python's garbage collector, and how we can end up with memory problems despite it. I'll then show you how the \"weakref\" module can help us -- both on its own, and with the data structures and functionality that the \"weakref\" module provides.\r\n\r\nEven if you don't need weak references, knowing how they work can give you great insight into Python's internals, and how you can take advantage of them in your work.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "857807f5-4dc9-5ecb-b8cd-94148db64a13", "id": 219, "code": "9M8KZ3", "public_name": "Reuven Lerner", "avatar": "https://cfp.pycon.org.il/media/Reuven.Lerner.jpeg", "biography": "Reuven Lerner is a full-time Python trainer. In a given year, he teaches courses at companies in the United States, Europe, Israel, India, and China \u2014 as well as to people around the world, via his online courses, including Weekly Python Exercise.\r\n\r\nReuven\u2019s most recent book is \u201cPython Workout,\u201d a collection of Python exercises with extensive explanations, published by Manning.  
He is currently working on \"Pandas Workout,\" a similar collection of exercises for Pandas.\r\n\r\nReuven\u2019s free, weekly Better developers newsletter, about Python and software engineering, is read by more than 20,000 developers around the globe. His \u201cTrainer weekly\u201d newsletter is similarly popular among people who give corporate training.\r\n\r\nReuven has a bachelor\u2019s degree in computer science and engineering from MIT, and a PhD in learning sciences from Northwestern University. He lives in Modi\u2019in, Israel with his wife and three children.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/CZLEYQ/", "id": 503, "guid": "27a08fce-3b4b-592f-8ab2-7b61032d4274", "date": "2021-05-02T14:30:00+03:00", "start": "14:30", "logo": null, "duration": "00:25", "room": "General Track 1", "slug": "conference2021-503-python-s-frontend-not-what-you-would-think", "title": "Python\u2019s Frontend - Not what you would think", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "Most programmers consider Python as a scripting or a server side language totally unsuitable for UI . In Imubit we decided to use Jupyter Lab in order to combine Python's powerful server side abilities with a beautiful UI.", "description": "While Jupyter is widely used for big data or data science, we decided to use it to easily develop a streamlined work process for our engineers. 
With just a small amount of effort, we were able to create beautiful, easy to use Python user facing applications for non-technical users.\r\nWe will start by describing Jupyter Lab Extensions, then we will get a glimpse of some of Python\u2019s frontend packages (Ipywidgets, panel, Ipyaggrid etc.), and learn how to use them and expand them with our own custom logic.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "75068745-7db0-5988-9849-cbe998565280", "id": 380, "code": "HH99ZJ", "public_name": "Bat-El Ziony Sabati", "avatar": "https://cfp.pycon.org.il/media/batel-ziony.jpg", "biography": "Full stack engineer at Imubit - 2020-Current\r\n\r\nFull stack engineer at the Cyber division of the Prime Minister's Office - 2016-2019\r\n\r\nNational Service as a full stack engineer at the Cyber division of the Prime Minister's Office 2014-2016\r\n\r\nMA in Sociology and Anthropology with Thesis at Bar Ilan University - 2017-2019\r\n\r\nBSc in Computer Science and The degree was done in parallel to my high school studies at The Jerusalem College of Technology (JCT) \u2013 Lev Academic Center - 2009-2014", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/J7KQBY/", "id": 380, "guid": "7dce17d5-de0e-5fd4-9f1e-5779f78dc110", "date": "2021-05-02T15:30:00+03:00", "start": "15:30", "logo": null, "duration": "00:25", "room": "General Track 1", "slug": "conference2021-380-versioning-1-0-1", "title": "Versioning 1.0.1", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "We introduce a very useful tool called \"vmn\" for auto increasing your application's version number in an agnostic way to language or architecture. 
You will learn how to use vmn for your application and how to integrate it into existing CI/CD procedures", "description": "Link:\r\nhttps://github.com/final-israel/vmn\r\n\r\nProblem statement:\r\nToday there is no standard way to increase an application's version, retrieve it, or return to a specific version with the exact dependencies it was released with. These are the issues vmn tries to solve. In this talk we will include different real-world use cases and will invite others to collaborate and get involved in the project's development.\r\n\r\nTakeaways:\r\nThe attendees will learn how to stamp their current applications with vmn", "recording_license": "", "do_not_record": false, "persons": [{"guid": "81be98b1-9473-5fea-a123-445996fbcf38", "id": 28, "code": "ER8NKR", "public_name": "Pavel Rogovoy", "avatar": null, "biography": "R&D teams leading and management, systems architecture and engineering, embedded/RT systems, SW architecture and protocols design.\r\n\r\nExpert in computing performance and architecture: HPC/HTC clusters architecture, operating systems, applications, storage, on-prem solutions and cloud. 
\r\n\r\nI enjoy contributing to open source", "answers": []}, {"guid": "6d64837c-efe1-5b79-a4fa-2286815f4e84", "id": 395, "code": "HUXQYY", "public_name": "Ron Shilo", "avatar": "https://cfp.pycon.org.il/media/Screen_Shot_2021-04-20_at_18.24.57.jpg", "biography": "Python developer at Final specializing in the creation of cloud infrastructure such as open-stack and high-performance hybrid cloud/bare metal infrastructures.\r\nProficient in #Python3 #Linux #Ubuntu #Bash #Ansible #Openstack #Docker #Jenkins #Git", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/MDPMJT/", "id": 454, "guid": "2879d393-25b2-5b76-8748-3326e93af756", "date": "2021-05-02T16:00:00+03:00", "start": "16:00", "logo": null, "duration": "00:25", "room": "General Track 1", "slug": "conference2021-454-serverless-python", "title": "Serverless Python", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "The time has come for almost every Python developer to build new applications following the serverless paradigm. This talk is 300 level describing the most important principles of serverless application architecture.", "description": "We shall learn the main benefits of micro-micro services as well as the main challenges when building this kind of applications. As a bonus: some ways to deal with these challenges and several common serverless architecture patterns.\r\n\r\nAgenda\r\n- Why Serverless? (15 min)\r\n- Limits. Why they exist and how to fit them. (5 min)\r\n- How to orchestrate. (5 min)\r\n\r\nSpeaker\r\nNikolay Grishchenko. 20+ years programming, 7+ years in Python, 5+ years with serverless applications.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "7bed877c-54b5-53ee-a788-282dd547c06c", "id": 354, "code": "BM3NZD", "public_name": "Nikolay Grishchenko", "avatar": "https://cfp.pycon.org.il/media/nikolay_300.jpg", "biography": "Nikolay Grishchenko. 
20+ years programming, 7+ years in Python, 5+ years with serverless applications.", "answers": []}], "links": [], "attachments": [], "answers": []}], "General Track 2": [{"url": "https://cfp.pycon.org.il/conference2021/talk/ARN7AA/", "id": 465, "guid": "7dfaf8c8-f675-5ff0-9e4b-412976ecc5f1", "date": "2021-05-02T10:00:00+03:00", "start": "10:00", "logo": null, "duration": "00:25", "room": "General Track 2", "slug": "conference2021-465-go-beyond-mock-on-mocks-stubs-and-fakes", "title": "Go Beyond Mock: on Mocks, Stubs and Fakes", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "Developers use the term \"mocks\" and \"mocking\" when referring to several different testing practices. \r\nThe talk will standardize the terminology of Mocks, Stubs, and Fakes: their capabilities, the differences between them, and when to use each one.", "description": "We'll start with defining Mocks, Stubs, and Fakes: their capabilities, the differences between them, and how to code them.\r\n\r\nNext, I'll show an iterative code example where we mock the same code using the above three methods. Every time we mock the same code differently, we get to test different things.\r\n\r\nThe talk summary would be based on the code examples: covering every mock's strengths and weaknesses, when is it appropriate to use each type of testing method, and a general wrap-up.\r\n\r\nIf you want to test your code effectively, this talk is definitely for you!", "recording_license": "", "do_not_record": false, "persons": [{"guid": "52f57082-d535-5c2e-8033-e65667586ecc", "id": 66, "code": "GBLMXG", "public_name": "Peter Kogan", "avatar": "https://cfp.pycon.org.il/media/pic_crop_2.jpg", "biography": "Started programming around 20 years ago, back at high school. 
My first programming language was C, and I still remember it fondly.\r\n\r\nOver the years, I was lucky enough to use several programming languages and be a part of many code projects that made their way to production.\r\n\r\nIn the last few years, my focus has been on backend development using Python. Python is by far the most enjoyable programming language I know; give me a few minutes, and maybe I can make a believer of you :)\r\n\r\nA constant learner and always looking to connect with interesting people.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/VXF887/", "id": 406, "guid": "16b11b10-0d15-57f7-98ce-d79019abe3b1", "date": "2021-05-02T11:00:00+03:00", "start": "11:00", "logo": null, "duration": "00:25", "room": "General Track 2", "slug": "conference2021-406-python-rpc-and-pubsub-over-websockets", "title": "Python RPC and PubSub over Websockets", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "Modern distributed software doesn't stop at your VPC. Edge-deployed software needs realtime communications, updates, and state sync. It needs RPC and PubSub over the web. Let's make it open-source.", "description": "Modern distributed software doesn't stop at your VPC. Edge-deployed software needs realtime communications, updates, and state sync. It needs RPC and PubSub over the web. 
Let's make it open-source.\r\n\r\nIn this talk we'll cover the need for over-the-web realtime RPC and PubSub, why we needed and created it for our OpenPolicyAgent realtime updates layer, alongside:\r\n- The challenges that the implementation faced\r\n- Pros/cons of realtime update channels\r\n- Common use cases (updates, sync, event propagation, distributed computing, authorization, ...)\r\n- Additional awesome Python open-source packages we used in this solution (FastAPI, Tenacity, broadcaster, ...)\r\n- How to use the open-source packages we shared.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "9db618ad-e63d-55ee-a6c4-2169376b0a9c", "id": 322, "code": "PGYBP8", "public_name": "Or Weis", "avatar": null, "biography": "Developer and Entrepreneur.\r\nRich R&D and cybersec experience.\r\nFounder and ex-CEO at Rookout.com\r\nFounder and CEO at Authorizon.com", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/33JELR/", "id": 482, "guid": "cb36f8ec-5578-522b-bf54-ea9b3a3f69ca", "date": "2021-05-02T11:30:00+03:00", "start": "11:30", "logo": null, "duration": "00:25", "room": "General Track 2", "slug": "conference2021-482-hello-airflow-farewell-cron-jobs", "title": "Hello Airflow, farewell Cron jobs", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "At Bluevine we use Airflow to drive all our \"offline\" processing. In this talk, I'll present the challenges and opportunities we had by transitioning from servers running Python scripts with cron to a full-blown Airflow setup.", "description": "At Bluevine, we were looking to upgrade our backend processing infrastructure from servers running Python scripts with Cron to a more scalable solution that allows for workflows (DAGs) and better observability of the application state. Airflow proved to be a valuable tool, though not without some sharp edges. 
Some of the points that I'll cover are:\r\n\r\n- Supporting multiple Python versions\r\n- Event driven DAGs\r\n- Airflow Performance issues and how we circumvented them\r\n- Building Airflow plugins to enhance observability\r\n- Monitoring Airflow using Grafana\r\n- CI for Airflow DAGs (super useful!)\r\n- Patching Airflow scheduler", "recording_license": "", "do_not_record": false, "persons": [{"guid": "3693d51f-c53d-583d-bf60-bfead98f875d", "id": 76, "code": "V7KKKQ", "public_name": "Noam Elfanbaum", "avatar": null, "biography": "Love the Python language and ecosystem, managing Core engineering at Bluevine.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/MVLDXA/", "id": 394, "guid": "b4b8349f-3f9d-5c3b-a065-a5e07a13c54f", "date": "2021-05-02T13:30:00+03:00", "start": "13:30", "logo": null, "duration": "00:25", "room": "General Track 2", "slug": "conference2021-394-fastapi-the-most-modern-python3-web-framework", "title": "FastAPI: The most modern Python3 web framework", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "FastAPI is a modern, high-performance, batteries-included Python web framework that's perfect for building RESTful APIs. It can handle both synchronous and asynchronous requests", "description": "With Python2 deprecated and the rise of Python3, a new world of features and projects have opened up. FastAPI is one of those projects. Heavily inspired by Flask, FastAPI has a lightweight microframework feel with support for Flask-like route decorators. This means that moving from Flask to FastAPI is easy. It takes advantage of Python type hints for parameter declaration which enables data validation (utilizing Pydantic - another great Python3 project) and OpenAPI/Swagger documentation. It's super fast. 
Since async is much more efficient than the traditional synchronous threading model, it can compete with Node and Go with regard to performance. In addition, it uses uvicorn - the async lightning-fast answer to gunicorn.\r\n\r\nIn this talk I will introduce you to FastAPI, why we at Insidepacket chose to use it as our web framework and how to migrate from Flask to FastAPI", "recording_license": "", "do_not_record": false, "persons": [{"guid": "6f50f772-835a-5230-8e1d-fae7fff7375e", "id": 278, "code": "LJWSEV", "public_name": "Chai Tadmor", "avatar": "https://cfp.pycon.org.il/media/cat_avatar_yAk2AHE.jpg", "biography": "I am a Backend Team Lead for Insidepacket.\r\nMy team is in charge of allowing access to our complex internal system using easy-to-use APIs.\r\nI have a passion for APIs and distributed systems", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/JGQ83F/", "id": 409, "guid": "c9a1d3c5-f65f-588d-b7ea-a3c620283384", "date": "2021-05-02T14:30:00+03:00", "start": "14:30", "logo": null, "duration": "00:25", "room": "General Track 2", "slug": "conference2021-409-lock-and-roll-advanced-locks-debugging-with-gdb", "title": "Lock and roll - Advanced locks debugging with GDB", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "In this talk we'll use a real-life use case to learn how extending GDB with Python can help us to solve bugs, all while digging deep into the internals of Python locks and how they're implemented.", "description": "Debugging deadlocks is hard. Debugging deadlocks in production is even harder.\r\nThis talk will demonstrate how Python\u2019s state can be debugged in production using GDB and how we can *easily* add it to our debug toolkit. 
There's much we can do with extending GDB with Python to better understand the internals of the language, and even to customize it to our own debugging needs.\r\nWe'll learn about CPython's locks, how they affect us, and how to debug a multithreaded Python process in real-time using GDB.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "ec4388cf-c4ae-5749-a0ea-e72677eb0471", "id": 235, "code": "8ALBVD", "public_name": "Roee Drucker", "avatar": null, "biography": "A software developer for almost a decade, with a broad experience of many programming languages and technologies. \r\nActing today as a software architect at Claroty", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/L73ENY/", "id": 469, "guid": "0926f347-31cd-5389-8565-a09a9c97ec3e", "date": "2021-05-02T16:00:00+03:00", "start": "16:00", "logo": null, "duration": "00:25", "room": "General Track 2", "slug": "conference2021-469-cutting-the-right-corners-handling-high-cardinality-by-understanding-your-data", "title": "Cutting the Right Corners: Handling High Cardinality by Understanding Your Data", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "Handling high cardinality with big data can be challenging. We improved our pipeline speed and stability by understanding which data matters more and creating a smart \u201cCardinality Protector\u201d to reduce cardinality with minimal effect on the data.", "description": "As a marketing analytics platform, Singular handles and ingests billions of user events on a daily basis, along with all the marketing data pertaining to each event: Was an ad clicked? When and where? Which network served that ad? How much did the ad cost? And much more. 
The data is then aggregated so our customers can use it to make informed decisions in their daily marketing operations.\r\n\r\nAs our operations scaled, we have experienced cases where the sheer number of events, with the large number of columns saved per event, some of which have high cardinality, slowed down our data ingestion pipeline. It ate up CPU, memory and network resources to the point of affecting the user experience. The burden on the system was exacerbated by click spam: a type of fraud where automated tools simulate millions of ad clicks. Click spam increases the already high load on our pipeline. \r\n\r\nOur challenge was to reduce the amount of data we ingest, improve our pipeline speed and stability and provide a better overall user experience. But we couldn't just remove excess rows as all rows are essential -- including ones that represent possibly-fraudulent clicks. Our customers want to measure click spam activity and find out where it originates. However, is it possible to retain the necessary information but still reduce the cardinality of some columns?\r\n\r\nThis was the starting point for what became the \"Cardinality Protector.\" In-depth research into our data helped us prioritize all the columns and metrics by their importance to customers. We then created smart rules in order to cut out some of the most extreme cardinality with minimal effect on the data.\r\n\r\nIn this session, we will show how we applied our cardinality protection logic to improve system performance significantly while minimizing the effect on the data. 
We'll talk about the challenges we ran into, both in terms of prioritization logic and system resources, and unveil some of the cool tricks we used, with Pandas and on-disk sorting/group-by, to apply cardinality protection to large batches of data.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "341bcac1-e6ff-5f93-b845-74c8fb81fdad", "id": 359, "code": "MSVSNM", "public_name": "Asaf Sarid", "avatar": "https://cfp.pycon.org.il/media/1618830581990.jpeg", "biography": "Senior Software Engineer at Singular. Playing a key role in Singular's Analytics Infrastructure team, with experience in both realtime and batch processing large-scale data pipelines.", "answers": []}], "links": [], "attachments": [], "answers": []}], "PyData Track 1": [{"url": "https://cfp.pycon.org.il/conference2021/talk/DKLWSD/", "id": 513, "guid": "cd4f9868-01f7-5795-b248-9d244497d2d7", "date": "2021-05-02T10:00:00+03:00", "start": "10:00", "logo": null, "duration": "00:25", "room": "PyData Track 1", "slug": "conference2021-513-malware-representation-using-graphs", "title": "Malware Representation Using Graphs", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "In the world of malware detection, we need to keep innovating all the time to catch the latest APTs. Let's see how can we do it with recent developments in graph analysis using neural networks", "description": "In the last decade, we suffer a new epidemic - **Advanced Persistent Threats** (APTs).\r\nIt seems like every other week a new kind of malware is born and the attack vectors are becoming more and more sophisticated. 
They range from pinpoint targeting of specific machines to massive infection of every machine in their path.\r\nTo be able to cope with the large number of incidents happening every day in the \u201cEverything is connected\u201d age, assigning a human security researcher to every case is expensive and practically impossible.\r\nAlthough sometimes considered black magic, in recent years we have seen increasing use of **machine learning** for malware detection and classification. The suggested solutions, inspired by fields such as Computer Vision and NLP, bring cutting-edge techniques into the cybersecurity field.\r\nIn this talk, I\u2019ll show how to use graphs to represent malware and how to use graph embeddings and GCN (Graph Convolutional Networks) to tackle such tasks as malware classification and detection, to help security researchers do their job in a faster and more efficient way.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "01e39dae-1224-5451-ae73-5a46f615487b", "id": 291, "code": "ASJUB8", "public_name": "Gal Braun", "avatar": null, "biography": "I'm interested in Data Science & Machine learning focused on explainability, representation learning, and visualizations.\r\nI hold a BSc in Computer Science from the Technion and am currently pursuing my MSc in Information Systems Engineering from Ben-Gurion University.\r\nI'm currently a Data Scientist @ SentinelOne, and have been fighting malware using Machine Learning for the last 4 years.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/ULAKS9/", "id": 479, "guid": "9e0f1663-29d4-5340-8ace-c5fb56623e49", "date": "2021-05-02T10:30:00+03:00", "start": "10:30", "logo": null, "duration": "00:25", "room": "PyData Track 1", "slug": "conference2021-479-short-text-in-the-wild", "title": "Short Text in the Wild", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "While most of 
our online lives revolve around short texts, there's very little information on how to apply NLP techniques on such texts. In this talk, I'll share the lessons we learned and the methodology we developed when dealing with short texts.", "description": "\u201cThanks for all the fish\u201d ; \u201cHappy bday grandma!\u201d ; \u201cMercedes C-class Cabriolet\u201d . Looks random, right? Well, maybe you know the old saying \u201cone man\u2019s trash is another woman\u2019s treasure\u201d. These texts, while very short, can be a virtual gold mine for many different business use-cases, some of which we tackle daily in our work. When we started working on unsupervised feature generation from very short texts, we started by looking into what\u2019s already been done in the field, and to our surprise the answer was: not a lot. In this talk we\u2019ll share some insights from our experience in dealing with short texts. We\u2019ll start by defining what we mean by \"short\" in our unique case, why it\u2019s interesting in various domains, where and why advanced out-of-the-box methods failed and finally, provide practical tips for handling short and unusual types of text.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "92fcf2b3-0655-553d-871e-5a75ecab35d9", "id": 367, "code": "P37PTH", "public_name": "Gal Hochma", "avatar": "https://cfp.pycon.org.il/media/IMG_5300_1.jpg", "biography": "Gal is a senior data scientist at PayPal, working mostly on NLP applications and NLP variable generation. She currently heads up construction of an internal NLP infrastructure to feed a wide variety of models in diverse domains like fraud detection, marketing, and Credit risk. Gal holds a BSc in Electrical Engineering and Physics as well as an MSc in Electrical engineering, all from Tel Aviv University. 
Before PayPal, Gal worked as a guidance and control algorithms engineer for aerospace systems and later as a data scientist on cyber applications.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/LTUZTF/", "id": 431, "guid": "8843efd6-ce15-5ccf-aadf-feb08c14f1b8", "date": "2021-05-02T11:00:00+03:00", "start": "11:00", "logo": null, "duration": "00:25", "room": "PyData Track 1", "slug": "conference2021-431-practical-optimisation-for-pandas", "title": "Practical Optimisation for Pandas", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "This talk will review some of the most common pitfalls that can cause otherwise perfectly good Pandas code to become too slow for time-sensitive applications, and walk through a set of tips and tricks to avoid them.", "description": "Writing performant pandas code is not an easy task. In this talk, I will explain how to find the bottlenecks and how to write code with computational efficiency and memory optimization in mind.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "c2e7dab8-2806-5dde-b6c3-61d5da6641b7", "id": 213, "code": "LBFGXR", "public_name": "Eyal Trabelsi", "avatar": "https://cfp.pycon.org.il/media/eyal-trabelsi.png", "biography": "Enthusiastic Software Engineer\ud83d\udc77\r\nWho appreciates good software engineering \ud83d\ude4f\r\nI have a big passion for Python \ud83d\udc0d, Machine Learning \ud83e\udd16 , and Performance Optimisations\ud83e\uddb8", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/9PELET/", "id": 480, "guid": "71edd6f0-4cf5-5169-8f4a-32167b41fbc4", "date": "2021-05-02T11:30:00+03:00", "start": "11:30", "logo": null, "duration": "00:45", "room": "PyData Track 1", "slug": "conference2021-480-war-stories-when-data-doesn-t-easily-fit-in-memory", "title": "War stories: when data 
doesn't (easily) fit in memory", "subtitle": "", "track": "PyData", "type": "Talk (long - limited number of slots)", "language": "en", "abstract": "I plan to discuss three archetypical war stories about fitting data in memory. In each of them, I'll describe both the technical challenge and the human biases that needed to be overcome to arrive at sound solutions.", "description": "One aspect of handling big data is that typically a problem's dataset does not naively fit into RAM. Three episodes I'd like to discuss:\r\n- How to chew thousands of >1GB JSON files without swallowing them whole.\r\n- Choosing the right in-memory format for a sparse shortest-path matrix, when the dense version would be prohibitively big.\r\n- Choosing a data-at-rest format for a large dataset without reinventing the wheel.\r\n \r\nI'll discuss the problems, their solutions and the mistakes I made along the way", "recording_license": "", "do_not_record": false, "persons": [{"guid": "6a309ebd-3191-54aa-8cea-2d92f2681a39", "id": 368, "code": "BLX8ZD", "public_name": "Uri Yanover", "avatar": "https://cfp.pycon.org.il/media/Uri_face.jpg", "biography": "Uri has 15 years of Python experience, as a hands-on engineer and project technical leader.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/TVYDJA/", "id": 418, "guid": "ffe4e7ec-8be5-543c-90c4-915ff0d5cbcd", "date": "2021-05-02T13:30:00+03:00", "start": "13:30", "logo": null, "duration": "00:25", "room": "PyData Track 1", "slug": "conference2021-418-harnessing-data-to-improve-healthcare", "title": "Harnessing Data to Improve Healthcare", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "In this talk, you will learn how at Diagnostic Robotics we create insights from claims data, a form of administrative data at large scale, which provides a great opportunity for AI in healthcare. 
You will understand how we use medical code embeddings", "description": "In this talk, you will learn how at Diagnostic Robotics we create insights from claims data, a form of administrative data at large scale, which provides a great opportunity for AI in healthcare. You will understand how we use medical code embeddings and deep learning methods to build predictive proactive models that benefit the patients and reduce the cost of healthcare. We will also discuss the concept of causal machine learning, its use to emulate randomised controlled trials and see how it\u2019s related to our models.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "dfc41e04-c728-5e07-b54f-744579bff0ab", "id": 327, "code": "ABK7FZ", "public_name": "Noa Lubin", "avatar": null, "biography": "Noa is a Machine Learning Researcher at Diagnostic Robotics. She previously worked at Amazon, NASA, Elbit Systems, and the Israeli Aerospace Industry.  Noa has an MSc in Computer Science from Bar-Ilan University (Magna Cum Laude) with an NLP thesis advised by Prof. Yoav Goldberg. Her Electrical Engineering BSc is from the Technion (Summa Cum Laude).", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/WG3HXJ/", "id": 485, "guid": "54066e95-b74f-5b63-97f0-1b71b17a7cc8", "date": "2021-05-02T14:00:00+03:00", "start": "14:00", "logo": null, "duration": "00:25", "room": "PyData Track 1", "slug": "conference2021-485-genomic-data-cost-effective-scaling-in-the-cloud", "title": "Genomic data - cost-effective scaling in the cloud", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "Genomic sequencing and processing data amounts to many terabytes of data. We'll present how single-cell processing pipe-line requires strong/eventual consistency trade-offs which are different from traditional big-data systems.", "description": "immunai runs a complex single-cell RNA sequencing pipe-line. 
The computational-biology and machine-learning tool ecosystem revolves around R and Python. We use cost-effective cloud storage for the large sequencing files while combining them with strongly consistent metadata. R/Python API users can retrieve the data indexed by any application-defined set of labels/features. We will discuss the tradeoffs compared to other big-data platforms like Apache Spark, Elasticsearch, etc.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "d8b77559-3bf8-5115-a627-b7bf9ea3b0f6", "id": 371, "code": "HQXWGT", "public_name": "Tal Franji", "avatar": null, "biography": "Working in big data for more than 10 years, programming for 40 years. Interested in system design, programming languages, big-data and data analytics. Consultant, VP R&D, Xoogler. Various industries - ad-tech, fin-tech, cyber and more.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/SE3H7M/", "id": 445, "guid": "09586d65-e795-5516-aa96-d062169b068d", "date": "2021-05-02T15:30:00+03:00", "start": "15:30", "logo": null, "duration": "00:25", "room": "PyData Track 1", "slug": "conference2021-445-testing-stochastic-ai-models-with-hypothesis", "title": "Testing stochastic AI models with hypothesis", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "In this talk, I will briefly cover the theory of property-based testing and then jump into use cases and live examples to demonstrate the hypothesis library and how we used it to generate random examples of plausible edge cases of our AI model.", "description": "Over the years, testing has become one of the main focus areas in development teams: a good feature is a well-tested one. In the field of AI this is often a real struggle. Since most advanced AI models are ultimately stochastic, we can\u2019t manually define all their possible edge cases. 
This led us to use the hypothesis library which does a lot of that for you, while you can focus on defining the properties and specifications of your system.\r\n\r\nIn this talk, I will cover shortly the theory of property-based testing and then jump into use cases and live examples to demonstrate how we used the hypothesis library to generate random examples of plausible edge cases of our AI model.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "ac0e31d4-6afe-5c4a-a0d9-c763f2f3960d", "id": 346, "code": "LLTM9X", "public_name": "Marina Shvartz", "avatar": "https://cfp.pycon.org.il/media/WhatsApp_Image_2021-02-07_at_9.08.19_AM.jpeg", "biography": "Marina is a senior A.I. software architect at Aidoc, working on building scalable A.I. infrastructure for the research and development of cutting edge A.I. algorithms in the field of healthcare.  She has 9 years of experience as a software engineer and software team leader, working both in corporate companies and startups. 
\r\nShe's passionate about finding the best architecture and solutions to software problems and mentoring others with her experience and knowledge.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/VDCMGW/", "id": 435, "guid": "c19c7602-0387-534f-9ebe-3137b5bd5f9b", "date": "2021-05-02T16:00:00+03:00", "start": "16:00", "logo": null, "duration": "00:45", "room": "PyData Track 1", "slug": "conference2021-435-tutorial-using-python-holoviz-technologies-to-create-interactive-presentations", "title": "Tutorial: Using Python HoloViz Technologies to Create Interactive Presentations", "subtitle": "", "track": "PyData", "type": "Talk (long - limited number of slots)", "language": "en", "abstract": "The tutorial will introduce  two interactive plots libraries: HoloViews, and panel and show how those can be used to create static html files with interactive graphics", "description": "The HoloViz project provides a set of Python libraries for high-level visualization of complex datasets. They are particularly useful for handling big data and multi-dimensional data that is common in machine-learning applications.\r\nHoloViz technologies support multiple graphical engine backends and integrate seamlessly with flexible development and deployment environments like Jupyter notebooks and modern web browsers. The visualization outputs are interactive, with features such as widgets like sliders or selection boxes or hover tools to inspect data, while not requiring any JavaScript, HTML, CSS, or other web-technology expertise.\r\nThis tutorial will focus on two HoloViz libraries:\r\n\r\nHoloViews: high level interface providing plots (heat maps, histograms, spikes, etc.) 
in many spatial and temporal combinations, with or without widgets for selecting along dimensions\r\nPanel: simple application and dashboard creation from images, plots, Markdown, LaTeX, and other elements into one HTML page incorporating interactive tabs and widgets.\r\n\r\nDuring the tutorial an interactive presentation will be constructed to show the attendees how to construct their own interactive poster / presentation.\r\nSample References:\r\n\u2022 HoloViz web site: https://holoviz.org\r\n\u2022 HoloViz on Github: https://github.com/holoviz/holoviz\r\n\u2022 Jacob Barhak, Joshua Schertz, Visualizing Machine Learning of Units of Measure using PyViz, PyData Austin 2019, 6-7 December 2019, Galvanize Austin. Presentation: https://jacob-barhak.github.io/Presentation_PyData_Austin_2019.html  Video: https://youtu.be/KS-sRpUvnD0", "recording_license": "", "do_not_record": false, "persons": [{"guid": "104e8551-ec7b-5bb4-bfd1-842c7291808f", "id": 86, "code": "KHJEYK", "public_name": "Jacob Barhak", "avatar": "https://cfp.pycon.org.il/media/PirctureOfMyselfCropped_nuLYVjW.jpg", "biography": "Jacob Barhak is a Computational Disease Modeler focusing on machine comprehension of clinical data. The Reference Model for disease progression that is the most validated Diabetes model known worldwide and also applied to model COVID-19 was self developed by Dr. Barhak as an independent researcher. His efforts include standardizing clinical data through ClinicalUnitMapping.com. He is the developer of the Micro Simulation Tool (MIST). Dr. Barhak has diverse international background in engineering and computing science. He is active within the python community and runs the Austin Evening of Python Coding meetup. For additional information please visit http://sites.google.com/site/jacobbarhak/", "answers": []}, {"guid": "b853e44a-7b56-5cf8-9f56-1a87bfc999ea", "id": 338, "code": "DGWKFF", "public_name": "James A.  
Bednar", "avatar": "https://cfp.pycon.org.il/media/James_Bednar-750x750_crop.jpg", "biography": "Jim Bednar is the Director of Technical Consulting at Anaconda, Inc. Dr. Bednar holds a Ph.D. in Computer Science from the University of Texas, along with degrees in Electrical Engineering and Philosophy. He has published more than 50 papers and books about the visual system and about software development. Dr. Bednar manages the open source Python projects HoloViz, Panel, hvPlot, Datashader, HoloViews, GeoViews, Param, and Colorcet. Before Anaconda, Dr. Bednar was a lecturer and researcher in Computational Neuroscience at the University of Edinburgh, Scotland, as well as a software and hardware engineer at National Instruments.", "answers": []}], "links": [], "attachments": [], "answers": []}], "PyData Track 2": [{"url": "https://cfp.pycon.org.il/conference2021/talk/EYEWSE/", "id": 459, "guid": "b960b9ae-494e-5142-b04e-d80518b5e586", "date": "2021-05-02T10:30:00+03:00", "start": "10:30", "logo": null, "duration": "00:25", "room": "PyData Track 2", "slug": "conference2021-459-a-feature-store-what-is-it-good-for-", "title": "A Feature Store - what is it good for?", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "It\u2019s good for feature reuse in machine learning, thereby increasing data science accuracy, velocity, and visibility.", "description": "A feature store is a single interface to create, discover, and access features for model training and inference. 
A wholistic feature store solution containing both storage and transformation layers would ideally include:  \r\n\r\n* Ingestion - both from streams and batch jobs\r\n* Serving - low latency single features for inference and high throughput bulk features for training\r\n* Transformation / Aggregation logic\r\n* Discovery - features and how to retrieve them\r\n\r\nThis session will attempt to demonstrate why a feature store is useful, review current solutions, and provide a number of tips on getting started.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "206d078b-b5fc-5b24-9afa-61cdd88a0083", "id": 202, "code": "NBQVTX", "public_name": "Orr Shilon", "avatar": "https://cfp.pycon.org.il/media/personal.jpg", "biography": "Orr is a ML Engineering Team Lead at Lemonade, currently developing a unified ML Platform. His team\u2019s work aims to increase development velocity, improve accuracy, and promote visibility into machine learning at Lemonade.\r\n\r\nPreviously, Orr worked at Twiggle on semantic search, at Varonis, and at Intel. He holds a B.Sc. in Computer Science and Psychology from Tel Aviv University.\r\n\r\nOrr also enjoys trail running and sometimes races competitively.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/YXFHMY/", "id": 476, "guid": "27afc517-73b3-5425-bf56-537a95339f7b", "date": "2021-05-02T11:30:00+03:00", "start": "11:30", "logo": null, "duration": "00:25", "room": "PyData Track 2", "slug": "conference2021-476-methods-for-effective-online-testing-in-python", "title": "Methods for Effective Online Testing in Python", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "In online advertising, we run a lot of online tests to determine which approach boosts our engagement the most. 
We talk about different ways of online testing through the perspective of a new feature we developed that is based on continuous testing.", "description": "Testing different UI components, algorithms, and optimization approaches in an effort to boost engagement is becoming more and more prominent in online applications. In this talk we will introduce a feature that is based on continuous online testing. Then we will go over online testing in general and some methods that we can use based on certain constraints of the domain. For instance, we may have limited time to decide which test group to use, to avoid having the test itself affect the results. Sometimes we are also constrained by deadlines by which we have to conclude testing. In tests like those, we have to balance exploration and exploitation to maximize the test\u2019s payout while still being certain of our conclusions. With that in mind, we will explore different methods of running online tests, namely split tests, epsilon-greedy multi-armed bandits, and Thompson sampling. We will go over their pros, cons, and applications, followed by a short demonstration written in Python. 
We will conclude the talk by explaining why we chose the methods that we did.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "51a56cc6-b314-5aa9-b9fd-090584e24934", "id": 365, "code": "SJUAHZ", "public_name": "Luka Androjna", "avatar": "https://cfp.pycon.org.il/media/image.png", "biography": "Data Scientist at Zemanta, an Outbrain company", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/BV3PDJ/", "id": 456, "guid": "35efa740-82f2-52b9-ac4a-c1d6757749de", "date": "2021-05-02T14:00:00+03:00", "start": "14:00", "logo": null, "duration": "00:25", "room": "PyData Track 2", "slug": "conference2021-456-mapping-and-analysis-of-geospatial-big-data-using-geemap-and-google-earth-engine", "title": "Mapping and analysis of geospatial big data using geemap and Google Earth Engine", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "Google Earth Engine (GEE) is a cloud computing platform with a multi-petabyte catalog of satellite imagery and geospatial datasets. It enables analysis and visualization of changes on the Earth\u2019s surface using the Python API.", "description": "Google Earth Engine (GEE) is a cloud computing platform with a multi-petabyte catalog of satellite imagery and geospatial datasets.\r\nWith the new geemap Python package, GEE users can easily manipulate, analyze, and visualize geospatial big data interactively in a Jupyter-based environment. \r\n\r\nThe topics covered in this lecture include: \r\n(1) A brief introduction to satellite imagery.\r\n(2) Introducing the Earth Engine Python API and the new geemap Python package.\r\n(3) Searching the GEE data catalog.\r\n(4) Displaying GEE datasets.\r\n(5) Classifying images using machine learning algorithms. 
\r\n(6) Finding the greenest place in Israel in terms of the amount of vegetation.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "5acd2ec3-018f-59de-90ee-42640314516f", "id": 268, "code": "HEK8XJ", "public_name": "yaron Michl", "avatar": "https://cfp.pycon.org.il/media/screen-shot-2021-01-28-at-9-29-20.png", "biography": "Yaron Michael is a geospatial and remote sensing expert. He works on projects ranging from public health (e.g. environmental influences of pollution on infant health) to precision agriculture (e.g. satellite imaging to improve crop yield), as well as public transport. This year he will finish his Ph.D. on the subject of predicting forest fires with the integration of numerical models. In addition, he is currently assisting an international project whose purpose is to find new tourist sites in Europe and Israel using machine learning.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/EWALKX/", "id": 392, "guid": "fd1331cb-e404-5674-85d9-06b6d73e3b52", "date": "2021-05-02T15:30:00+03:00", "start": "15:30", "logo": null, "duration": "00:25", "room": "PyData Track 2", "slug": "conference2021-392-python-bottleneck-optimization-progression-from-lists-to-cupy-arrays", "title": "Python bottleneck optimization - progression from lists to cupy arrays", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "In optimization problems speed is important, but unfortunately Python isn't optimized for speed. In this talk I'll show how to use Python and optimize bottleneck functions to be as fast as possible using different libraries and methods.", "description": "In this talk I'll present how to optimize the running time of a bottleneck function, progressing from using Python lists to CuPy arrays. 
CuPy is a relatively new library that allows running calculations on the GPU using an API similar to NumPy.\r\n\r\nI'll cover a few optimization techniques such as vectorized data structures, a-priori calculations and parallel operations. \r\nI will also showcase how to time the function and do simple profiling.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "7c564312-42f4-5c97-935f-c70e34d0c322", "id": 312, "code": "WSZ3XS", "public_name": "Yair beer", "avatar": null, "biography": "I use machine learning in my daily job as a senior data scientist, and in addition I also compete on kaggle.com. I am mostly interested in high-dimensional data and data analysis that requires smart preprocessing and data manipulation in order to get the required results. I enjoy solving problems and constantly learning new things.\r\n\r\nI have hands-on experience in both R and Python for machine learning, and enjoy learning other languages (Rust, Golang, C)", "answers": []}], "links": [], "attachments": [], "answers": []}]}}, {"index": 2, "date": "2021-05-03", "day_start": "2021-05-03T04:00:00+03:00", "day_end": "2021-05-04T03:59:00+03:00", "rooms": {"General Track 1": [{"url": "https://cfp.pycon.org.il/conference2021/talk/NQYDJZ/", "id": 493, "guid": "8d3946c0-39ca-5c4c-9914-ec75a3c65a67", "date": "2021-05-03T10:00:00+03:00", "start": "10:00", "logo": null, "duration": "00:25", "room": "General Track 1", "slug": "conference2021-493-when-is-an-exception-not-an-exception-using-warnings-in-python", "title": "When is an exception not an exception? Using warnings in Python", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "Python's warnings are exceptions \u2014 but they're also distinct from exceptions, and are both used and trapped differently. 
In this talk, I'll introduce warnings, how to raise, trap, and redirect them, and show you best practices for their use.", "description": "If your code encounters a big problem, then you probably want to raise an exception. But what should your code do if it finds a small problem, one that shouldn't be ignored, but that doesn't merit an exception? Python's answer to this question is warnings.\r\n\r\nIn this talk, I'll introduce Python's warnings, close cousins to exceptions but still distinct from them. We'll see how you can generate warnings, and what happens when you do. But then we'll dig deeper, looking at how you can filter and redirect warnings, telling Python which types of warnings you want to see, and which you want to hide.  We'll also see how you can get truly fancy, turning some warnings into (potentially fatal) exceptions and handling certain types with custom callback functions.\r\n\r\nAfter this talk, you'll be able to take advantage of Python's warning system, letting your users know when something is wrong without having to choose between \"print\" and a full-blown exception.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "857807f5-4dc9-5ecb-b8cd-94148db64a13", "id": 219, "code": "9M8KZ3", "public_name": "Reuven Lerner", "avatar": "https://cfp.pycon.org.il/media/Reuven.Lerner.jpeg", "biography": "Reuven Lerner is a full-time Python trainer. In a given year, he teaches courses at companies in the United States, Europe, Israel, India, and China \u2014 as well as to people around the world, via his online courses, including Weekly Python Exercise.\r\n\r\nReuven\u2019s most recent book is \u201cPython Workout,\u201d a collection of Python exercises with extensive explanations, published by Manning.  
He is currently working on \"Pandas Workout,\" a similar collection of exercises for Pandas.\r\n\r\nReuven\u2019s free, weekly Better developers newsletter, about Python and software engineering, is read by more than 20,000 developers around the globe. His \u201cTrainer weekly\u201d newsletter is similarly popular among people who give corporate training.\r\n\r\nReuven has a bachelor\u2019s degree in computer science and engineering from MIT, and a PhD in learning sciences from Northwestern University. He lives in Modi\u2019in, Israel with his wife and three children.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/A9CFRM/", "id": 521, "guid": "2acc0d92-87b6-52ea-bfa1-a6c74b25fa52", "date": "2021-05-03T10:30:00+03:00", "start": "10:30", "logo": null, "duration": "00:25", "room": "General Track 1", "slug": "conference2021-521-leaving-celery-in-the-dust-how-to-truly-scale-in-production", "title": "Leaving Celery in the Dust: How to truly scale in production", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "When we first developed our system, we picked Celery due to its wide community adoption. When we started scaling our systems, we realized Celery was holding us back in many ways. We decided to replace Celery with our own technology.", "description": "Back in the day at Intsights, we architected our platform based on a distributed task queue approach. Looking for an available library and tools to support our approach, we found Celery. According to Celery's documentation, `Celery is a simple, flexible, and reliable distributed system to process vast amounts of messages while providing operations with the tools required to maintain such a system.\r\nIt's a task queue with a focus on real-time processing while also supporting task scheduling.` \r\n\r\nFor Intsights, Celery did not live up to its promise. 
It did not scale and was highly bloated with metrics and communication overhead. Moreover, Celery did not provide enough thread/process safety to handle problematic workloads that might fail, crash or get stuck on special occasions, such as a stuck GIL due to an infinite Regex.\r\n\r\nAt some point, we realized that our only option was to develop our own solution. We decided to stop chasing Celery bugs and focus on what fits Intsights best. At first, we were inspired a lot by Celery's design. We implemented result backends, timeouts, and pipelines. We stuck to Celery's terminology to make the migration easier. Later we ditched most practices and introduced our own.\r\n\r\nToday, we have a highly performant, highly stable, and safe library that supports our use case in a perfect manner. Sergeant is meant to be very simple, very fast, very stable, and safe. Still, many features are missing or left out. We only support Mongo and Redis as backends. We do not guarantee consistency of task order and consumption. These compromises let us stay very simple to maintain and to focus on stability and performance. The library supports only Python 3.6> and provides full type annotations and test coverage.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "e90bc2d7-2c45-5bfc-92e1-07632451b371", "id": 391, "code": "ZQFZZG", "public_name": "Gal Ben David", "avatar": "https://cfp.pycon.org.il/media/profile_FFH4v3S.jpg", "biography": "Gal Ben David is the Chief Technology Officer & Co-Founder of IntSights. As CTO, Gal leads all development and engineering initiatives, including development methodologies, quality assurance, and research and development. 
Gal is an expert in cyber intelligence and software development, serving 5 years as an Intelligence Officer in the 8200 Unit of the Israel Defense Forces.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/MMCHRJ/", "id": 518, "guid": "491f518c-b58a-5a56-87c4-9cc02ea7464a", "date": "2021-05-03T11:00:00+03:00", "start": "11:00", "logo": null, "duration": "00:25", "room": "General Track 1", "slug": "conference2021-518-building-a-secure-python-cloud-application-from-scratch", "title": "Building a Secure Python Cloud Application from scratch", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "When is it the right time to implement security when building an app? In this talk, you will learn how to build from scratch a secure Python application hosted in the cloud, the major attack vectors and tools you need to remediate to the main risks.", "description": "When do you think is the right moment to worry about the security of the application you develop in the cloud? The first time your customer requires it to buy your product, or should you just wait for the first security incident? I strongly believe that it is never too early to think and act towards securing your environment and your product, otherwise security becomes some unmanaged technical debt that just accumulates with every single line of code. \r\n\r\nIn this talk, you will learn how to build a secure Python application hosted in the cloud from scratch. 
You will discover the main common threats and attack vectors you need to fight, the tools you need to remediate the most critical risks, and how to continuously monitor the security of your application and environment.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "1ed07efa-c2ba-5db7-aa41-72324279bf91", "id": 386, "code": "ARXN8Z", "public_name": "David Melamed", "avatar": "https://cfp.pycon.org.il/media/david_melamed.jpeg", "biography": "Co-Founder and CTO at CBrix, David has been passionate about technology since he was a kid. For the last 20 years, he has enjoyed building complex applications in the cloud. He worked at Cloudlock (acquired by Cisco) and Cisco in the CTO Office where he was leading the technological innovation and became a Cloud Security expert. He has also been involved in various communities in Israel like PyCon Israel in the past and currently the AWS Builder Space Community.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/Z3YRWY/", "id": 515, "guid": "aa3226e7-7f2d-53ed-9565-5ccaaa3d7058", "date": "2021-05-03T11:30:00+03:00", "start": "11:30", "logo": null, "duration": "00:25", "room": "General Track 1", "slug": "conference2021-515-cleaner-sw-architecture-using-python-annotations", "title": "Cleaner SW architecture using python annotations", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "Many developers struggle with the question, \u201cshould or shouldn\u2019t I use python annotations?\u201d I will demonstrate how proper usage of Python annotations guides developers to refine the structure of their code.", "description": "In \u201cThe Clean Architecture\u201d article, Uncle Bob explains the fundamental principles of \u201cclean\u201d and fine coding. In this lecture, I will show how using Python annotations helps the developer follow those rules. 
\r\nPython annotations are considered by many as redundant or nice to have, yet I will demonstrate how Python annotations, along with making the code more readable, also enhance the chosen programming structure.\r\nI will start the session with a brief introduction to what is considered, by Uncle Bob, to be a \u201cclean coding architecture\u201d and continue with practical day-to-day examples.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "7d0d3307-eeae-51cb-aeae-72f87cd9b49e", "id": 264, "code": "3D97QB", "public_name": "Yehuda Levian", "avatar": "https://cfp.pycon.org.il/media/WhatsApp_Image_2020-02-28_at_10.36.37.jpeg", "biography": "* 5 years as a software developer at Intel.\r\n* 2.5 years at Imubit, 1 year as a full stack developer and another 1.5 years as DevOps.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/WETCLD/", "id": 507, "guid": "9d942879-c0bc-5600-af87-a5dbb4e81fd9", "date": "2021-05-03T13:30:00+03:00", "start": "13:30", "logo": null, "duration": "00:25", "room": "General Track 1", "slug": "conference2021-507-be-a-pythonista-coding-and-life-lessons-learned-from-python", "title": "Be a Pythonista: Coding and Life Lessons Learned from Python", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "Programming requires a logical mindset, which can be used to introduce strategy into your daily life. Join me, as we review pythonic best practices, constructs and concepts and see how to take advantage of them both at and away from the keyboard.", "description": "Most people think of programming as a technological medium to accomplish a task. While this is definitely true, there is a lot more that you can get out of python than for loops and dictionaries. Thinking like a python developer enables you not only to code better, but also to incorporate programming best practices and strategies in your daily life. 
It also doesn't matter what type of application or script you are writing: by learning python concepts, constructs and best practices, you will be able to take full advantage of what the language has to offer.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "44cfd07e-32e4-5fc8-9b9c-9b07b9f15f90", "id": 383, "code": "V7D87B", "public_name": "Hodaya Stern", "avatar": "https://cfp.pycon.org.il/media/Profile.jpg", "biography": "I'm a security researcher and a python enthusiast, 8200 unit alumna. I taught Python for various levels in IDF and I currently teach Python at she codes; women community.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/HGZNAY/", "id": 510, "guid": "2ae99c6a-c32d-5f51-862f-72132b9d9063", "date": "2021-05-03T14:00:00+03:00", "start": "14:00", "logo": null, "duration": "00:25", "room": "General Track 1", "slug": "conference2021-510-websockets-and-flask-for-the-real-world", "title": "WebSockets and Flask for the real world", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "Implementing a Flask realtime web application for production isn\u2019t as easy as it seems.\r\nLearn how to use Redis Pub/Sub, Nginx, uWSGI, signaling, unix socket, mule process, socket.io and more to create a robust realtime app.", "description": "socket.io enables real-time, bidirectional, event-based communication between the browser and the server.\r\nIdeally Pythonistas running a Flask application would simply use the Flask-SocketIO library, yet Flask alone is not suitable for production and must be hosted by a real web server, thus requiring additional development to enable the usage of socket.io.\r\n\r\nOur Framework consists of a uWSGI server running Flask instances and other services behind an Nginx proxy. 
We will share a full working solution of a framework setup that supports a Flask realtime web application in production.\r\n\r\nThe framework includes Redis Pub/Sub to publish events from any service, a mule service to listen to events, uWSGI signaling to notify all workers, socket.io on a Redis backend \r\nto allow lazy-apps, a new Nginx mapping and another http listener in the uWSGI server.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "3521d998-1e91-536f-b31d-d8c32aa5f338", "id": 81, "code": "LNKWBH", "public_name": "Yael Green", "avatar": "https://cfp.pycon.org.il/media/profile_picture.png", "biography": "Senior Software Engineer,\r\nTech lead at DLPC application team at Imubit", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/SM9WFK/", "id": 422, "guid": "47b331a2-5eb1-577f-9724-eaecfb63659f", "date": "2021-05-03T14:30:00+03:00", "start": "14:30", "logo": null, "duration": "00:25", "room": "General Track 1", "slug": "conference2021-422-practical-advice-for-using-mypy", "title": "Practical Advice for Using Mypy", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "In recent years Mypy has gained widespread adoption, and as it continues to improve and evolve, more and more useful features are being added.\r\n\r\nIn this talk I'll present some gems in the type system you can use to make your code better and safer!", "description": "The Mypy typing system, and the complementary extensions module, include some powerful but lesser known features such as immutable types, typed dicts, union types and exhaustiveness checking. 
Using these advanced features, developers can declare more accurate types, get better warnings, produce better code and be more productive.\r\n\r\nIn the talk I'm going to demonstrate the following:\r\n\r\n- Basic type annotations (primitives, collections types, etc.)\r\n- How the syntax evolved from Python 2 comments to new features planned for Python 3.10\r\n- Using TypedDict and dataclasses for readability\r\n- Immutable types (e.g List / Dict vs Sequence / Mapping), examples and motivation\r\n- Type narrowing, and how it can be used to achieve exhaustiveness checking\r\n\r\nTo demonstrate these topics I'll follow along an example of a real system, where in each step I present a problem and demonstrate how it can be solved using Mypy.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "6e53e03e-a417-55be-b357-0de6ca039b4e", "id": 123, "code": "S37DEW", "public_name": "Haki Benita", "avatar": "https://cfp.pycon.org.il/media/haki-benita.png", "biography": "Haki is a software developer and a technical lead. He takes special interest in databases, web development, software design and performance tuning. Haki also writes about development and performance in his blog [hakibenita.com](https://hakibenita.com).", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/UPUHBZ/", "id": 430, "guid": "c1930358-7855-5864-960a-38829b0fdab8", "date": "2021-05-03T15:30:00+03:00", "start": "15:30", "logo": null, "duration": "00:25", "room": "General Track 1", "slug": "conference2021-430-owasp-top-10-in-20-application-security-for-the-average-pythonista", "title": "OWASP Top 10 in 20: Application Security for the average Pythonista", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "This talk might give you what you need to secure your python application from OWASP top 10 vulnerabilities. 
We\u2019ll look at examples, tools and quick tips for a more robust code base.", "description": "In this hands-on talk, Ronnie Sheer, Head of R&D at Hiverr (a Team8 startup), walks through real examples of OWASP top 10 Web Application Security Risks in Python applications. We will then look at small changes you may introduce to your codebase right away to make it more robust. Finally, you may start leveraging OWASP top ten to create a culture of secure coding. Securing Python applications can be an overwhelming task. Leveraging OWASP top ten is a great starting point.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "45d50e38-a326-5e03-89cb-a6ba2ca0dfee", "id": 77, "code": "ZPXGTA", "public_name": "Ronnie Sheer", "avatar": "https://cfp.pycon.org.il/media/1649862887677.jpeg", "biography": "Ronnie Sheer is Head of R&D at Hiverr, a new Ed-Tech startup by Team8. He has been chosen by LinkedIn as an instructor in the most widely distributed online learning platform (LinkedIn Learning). He has had the honor of speaking at prior Pycon Israel events.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/VKPEP3/", "id": 427, "guid": "17ec7d12-18f7-5c37-978b-1b461e82a416", "date": "2021-05-03T16:00:00+03:00", "start": "16:00", "logo": null, "duration": "00:25", "room": "General Track 1", "slug": "conference2021-427-python-monorepos-what-why-and-how", "title": "Python monorepos: what, why and how", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "This talk will describe the monorepo codebase architecture, explain why you might want to use it for your Python code, and what kind of tooling you need to work effectively in it.", "description": "As organizations and repos grow, we have to choose how to manage codebases in a scalable way.  
We have two architectural alternatives:\r\n\r\n- _Multirepo:_ split the codebase into increasing numbers of small repos, along team or project boundaries.\r\n- _Monorepo:_ Maintain one large repository containing code for many projects and libraries, with multiple teams collaborating across it. \r\n\r\nIn this talk we'll discuss the pros and cons of monorepos for Python codebases, and the kinds of tooling and processes we can use to make working in a Python monorepo effective.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "db22a5fc-07c1-5774-9f5b-22c55f23fc3f", "id": 330, "code": "AWFYCH", "public_name": "Benjy Weinberger", "avatar": null, "biography": "Benjy Weinberger is a software engineer with over 20 years' experience in building scalable distributed systems, and one of the creators of the Pants open-source build system. He is a graduate of the Hebrew University in Jerusalem, and indulges his longstanding interest in cutting-edge build systems as a co-founder of Toolchain Labs.", "answers": []}], "links": [], "attachments": [], "answers": []}], "General Track 2": [{"url": "https://cfp.pycon.org.il/conference2021/talk/A7AV8Z/", "id": 519, "guid": "ab449798-64da-54b4-a57c-2870a2e3c399", "date": "2021-05-03T10:00:00+03:00", "start": "10:00", "logo": null, "duration": "00:25", "room": "General Track 2", "slug": "conference2021-519-liberate-your-api-building-a-task-manager-inside-sanic", "title": "Liberate your API: Building a task manager inside Sanic", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "An inside look at some of the tools inside Sanic to help build a background task manager.", "description": "You are building an API when, inevitably, you realize that certain requests are super slow to respond. It dawns on you that you really need to push some work off to a background process. 
How should you do it?\r\n\r\nWe will explore some of the tools that exist inside the Sanic framework that will enable us to do just this. From the simple task, to complex multi-node cluster: we will look at different strategies to determine the most appropriate tool for the job. Think celery, except entirely within Sanic.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "5c9d502f-b00e-505f-b525-59064b2f7d5d", "id": 248, "code": "QNJ39R", "public_name": "Adam Hopkins", "avatar": null, "biography": "```python\r\nclass Adam:\r\n\r\n\tdef __init__(self):\r\n\t\tself.work = PacketFabric(\"Lead Sr. Software Engineer\")\r\n\t\tself.oss = Sanic(\"Core Maintainer\")\r\n\t\tself.home = Israel(\"Negev\")\r\n\r\n\tasync def run(self, inputs: Union[Pretzels, Coffee]) -> None:\r\n\t\twhile True:\r\n\t\t\tawait self.work.do(inputs)\r\n\t\t\tawait self.oss.do(inputs)\r\n\t\t\r\n\tdef sleep(self):\r\n\t\traise NotImplementedError\r\n\r\n```", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/FKAAJA/", "id": 517, "guid": "40924f06-9617-5a30-a199-3b8e51f63e2c", "date": "2021-05-03T11:00:00+03:00", "start": "11:00", "logo": null, "duration": "00:25", "room": "General Track 2", "slug": "conference2021-517-geographic-data-an-introductory-tale", "title": "Geographic Data - an Introductory Tale", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "An introduction to Geographic Data, some of its basic concepts and common Python tools for working with it.", "description": "At some point in life there might come a time where you might need to work with geographic data.\r\nInstead of waiting in dread for that day, it's best to be prepared!\r\n\r\nIn my talk I'll explain a bit about geo-data types, formats and concepts, and existing Python tools to easily work with it - all while working through a fun and quirky sample case that we'll solve together.", 
"recording_license": "", "do_not_record": false, "persons": [{"guid": "c6ce7aaa-3a61-52e3-80f8-266a5cd9f067", "id": 139, "code": "XMYJZE", "public_name": "Adam Kariv", "avatar": "https://cfp.pycon.org.il/media/avatars/cd511289b5773fff5e7efe328846eef3_8KJ9BN9.jpg", "biography": "Adam Kariv is an open data consultant and activist. He has over 25 years of experience in developing, designing and managing software projects, from open-source data-wrangling libraries to enterprise-scale, mission-critical systems developed by over 20 engineers.\r\nHe is the founder of the Public Knowledge Workshop (\u2018Hasadna\u2019), an Israeli NGO working to make government data more accessible, and has extensive experience working with government agencies to build better tools for publishing data.\r\nFormer professional titles include Engineering Lead at the international Open Knowledge Foundation (OKF) and Senior Engineer at the data management company Datopian.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/D9FEYX/", "id": 483, "guid": "bdb8a564-ffb1-5655-b3ea-69144d2953c2", "date": "2021-05-03T13:30:00+03:00", "start": "13:30", "logo": null, "duration": "00:25", "room": "General Track 2", "slug": "conference2021-483-how-to-test-microservices", "title": "How to Test Microservices", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "I'll present a tiered approach that allows testing microservices quickly and thoroughly. The tests use stateful mocks of other services, and thus allow concise tests as well as simulating outages, subtle timing problems and large datasets.", "description": "Microservices are fantastic, but a pain to test in complex interaction scenarios. Unit tests are quick and easy, but don\u2019t cover interactions. System integration tests are a standard way to address complexity, but take a huge effort to maintain and a lot of resources to run. 
How can we get the best of both worlds?\r\n\r\nIn this talk, I\u2019ll present a tiered approach that enables writing tests quickly without sacrificing coverage. The tests use stateful mocks of other services mediated by a verification layer. I\u2019ll talk about how this approach allows testing of insidious failure modes, such as failures within dependencies and narrow race conditions, both of which are almost impossible to achieve in integration tests.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "c66c17f3-37d7-50a2-89ad-193c2df94769", "id": 284, "code": "FDEL8Z", "public_name": "Lior Segev", "avatar": null, "biography": "After earning an MSc in computer science from Tel Aviv University I've worked in several interesting start-ups, including XIV, Stratoscale and now Immunai.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/ZFRN8Y/", "id": 408, "guid": "7767cfa6-12e8-5dc7-a1af-d2d87b1087c8", "date": "2021-05-03T14:30:00+03:00", "start": "14:30", "logo": null, "duration": "00:25", "room": "General Track 2", "slug": "conference2021-408-novel-approach-of-collecting-and-analyzing-data-from-pytest-with-elasticsearch", "title": "Novel approach of collecting and analyzing data from PyTest with Elasticsearch", "subtitle": "", "track": "General", "type": "Talk (regular)", "language": "en", "abstract": "CI/CD is critical for rapid software development, requiring advanced monitoring and logging infrastructure. We will present our PyTest integration with Elasticsearch, leading to significant debug reduction time and infra/product health improvements.", "description": "We will present our methods of integrating Python with the Elasticsearch database by using PyTest plugins and other advanced PyTest features. The Python + PyTest infrastructure allows us to gather useful data such as test coverage, infrastructure stability monitors, product health and debug information. 
We will go over the three different data levels that we are using: the CI/CD Infrastructure, test flow, and validation test coverage. In addition, we will share how this data enables us to achieve a faster, more stable CI/CD, leading to more efficient development and release cycles. Our system is based on Python PyTest and open source tools that can be run via cloud provider or local servers.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "d4abe2a3-bb91-5a96-a3d7-a9831e8a36ef", "id": 218, "code": "PWRMUM", "public_name": "Noy Nakash", "avatar": "https://cfp.pycon.org.il/media/noy-picture_Kz3SRAJ.jpg", "biography": "Head of SW Tools, DevOps and Validation\r\n\r\nFormed a team of software, validation and DevOps engineers, responsible for:\r\n\r\nThe design of an entire SW backend, automation, CI/CD and tools, spanning from AI frameworks, Python to low level SW (Drivers, FW)\r\n\r\nValidation of a cutting-edge AI processor and its entire software stack (SDK, AI algorithm, FW, drivers and more) running on multiple pre-silicon and embedded platforms.\r\n\r\nDevelopment of business and customer tools such as building packages, Yocto distributions, benchmark tools, SDK CMD and more.", "answers": []}, {"guid": "ce9d69f3-ea61-5ccd-915f-6960f41dcee2", "id": 424, "code": "DRNDHC", "public_name": "Avi Naftalis", "avatar": null, "biography": null, "answers": []}], "links": [], "attachments": [], "answers": []}], "PyData Track 1": [{"url": "https://cfp.pycon.org.il/conference2021/talk/3B9NCX/", "id": 504, "guid": "1175f227-30e0-5c12-a920-d95d684202b5", "date": "2021-05-03T10:00:00+03:00", "start": "10:00", "logo": null, "duration": "00:25", "room": "PyData Track 1", "slug": "conference2021-504-deep-learning-minus-the-boilerplate-with-pytorch-lightning", "title": "Deep Learning, Minus the Boilerplate with PyTorch Lightning", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "This talk introduces PyTorch Lightning, 
outlines its core design philosophy, and provides inline examples of how this philosophy enables more reproducible and production-capable deep learning code.", "description": "PyTorch Lightning reduces the engineering boilerplate and resources required to implement state-of-the-art AI. Organizing PyTorch code with Lightning enables seamless training on multiple GPUs, TPUs, CPUs as well as the use of difficult-to-implement best practices such as model sharding, 16-bit precision and more, without any code changes. This talk introduces PyTorch Lightning, outlines its core design philosophy, and provides inline examples of how this philosophy enables more reproducible and production-capable deep learning code, based on work from the following post: https://opendatascience.com/pytorch-lightning-from-research-to-production-minus-the-boilerplate/", "recording_license": "", "do_not_record": false, "persons": [{"guid": "e7529556-91d8-5e75-bd75-d4630c9e385d", "id": 381, "code": "ADWDDY", "public_name": "Ari Bornstein", "avatar": "https://cfp.pycon.org.il/media/1606669548473.jpeg", "biography": "Aaron (Ari) Bornstein is an AI researcher with a passion for history, engaging with new technologies and computational medicine. 
As Head of Developer Advocacy at Grid.ai, he collaborates with the Machine Learning Community to solve real-world problems with game-changing technologies that are then documented, open-sourced, and shared with the rest of the world.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/VHSVWY/", "id": 467, "guid": "4f268a96-0e6a-5ec5-93f7-4896afea4f64", "date": "2021-05-03T10:30:00+03:00", "start": "10:30", "logo": null, "duration": "00:25", "room": "PyData Track 1", "slug": "conference2021-467-model-agnostic-interpretation-beyond-shap-and-lime", "title": "Model-Agnostic Interpretation - Beyond Shap and Lime", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "I\u2019ll discuss an interpretation framework that allows use of the features\u2019 distribution to understand the direction of the feature\u2019s impact. The concept is derived from ideas formulated in Pearl\u2019s analysis of causality in his book \u201cthe book of why\u201d.", "description": "The subject of interpretability becomes very important as models grow more and more complex but humans need to reason them. Since we don\u2019t want to be blocked by the model\u2019s algorithm (for example, if we want to bag several models), the community offers solutions that are based on alternatives analysis - local assessment, shuffling features, etc. \r\n\r\nIn this talk, I\u2019ll offer a framework that allows use of the features\u2019 distribution to understand the direction of the feature\u2019s impact, both on the entire sample\u2019s level and for specific observations. The inner workings of this method is highly intuitive and straightforward, and its concept is derived from ideas formulated in Judea Pearl\u2019s analysis of causality (check out \u201cthe book of why\u201d for more info). 
\r\n\r\nI\u2019ll present a specific use case of tabular data from Bluevine, and compare its performance to available solutions. I\u2019ll also mention directions for applying a similar method to additional fields.\r\n\r\n\r\nNathalie Hauser,\r\nData Science Manager @Bluevine", "recording_license": "", "do_not_record": false, "persons": [{"guid": "d7d5fb8a-9970-5627-8eb1-0d3110e5cdba", "id": 358, "code": "PU8LJM", "public_name": "Nathalie Hauser", "avatar": "https://cfp.pycon.org.il/media/nathalie_linkedin_picture.jpeg", "biography": "Manages the TLV Data Science Team @Bluevine, holds an MSc in Statistics from Tel-Aviv University. Interested in Machine Learning Models interpretation and its applications in the FinTech field.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/UVBJQH/", "id": 449, "guid": "b001d748-946b-5861-89ac-8adbe7800880", "date": "2021-05-03T11:00:00+03:00", "start": "11:00", "logo": null, "duration": "00:25", "room": "PyData Track 1", "slug": "conference2021-449-fun-with-trees-get-to-the-root-of-song-classification", "title": "Fun With Trees! Get to the Root of Song Classification", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "Join this session to hear about my journey with tree-based classifiers, while tackling the problem of classifying songs into different genres. Learn how XGBoost works and what makes it so popular.", "description": "Tree-based models are some of the most common machine learning models used today. It makes sense- the basic concept is easy to grasp and easy to work with.\r\nIn this talk, we will dive into the concepts behind the names Decision Trees and XGBoost, and discuss the advantages and disadvantages in comparison to other machine learning models. 
On the music side, we will discover how to extract features from songs and how to use them to differentiate between genres.\r\nThis talk is intended for anyone with basic familiarity with machine learning that would like to deepen their understanding in the subjects of tree-based models, classification, and how to apply machine learning to songs.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "de3543ad-0b0e-5b46-823b-1b9a13159746", "id": 351, "code": "ZKJQVE", "public_name": "Yama Anin Aminof", "avatar": "https://cfp.pycon.org.il/media/Yama_SquareProfile_small.jpg", "biography": "Yama Anin Aminof is a Data Scientist at MyPart, an Israeli startup in the music industry, developing algorithms and researching lyrical and musical song features. She is an activist both in the social world, fighting the violence against women and children, and in the technological world, giving tech talks and mentoring female developers through their first steps in the data science world. Yama has a B.Sc in Mathematics and Physics from Tel Aviv University where she also expresses her passion for music by playing the saxophone in the TAU Wind Band.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/JVMU7W/", "id": 385, "guid": "a40c9093-becf-54a3-a34d-e3164cc9f3c7", "date": "2021-05-03T11:30:00+03:00", "start": "11:30", "logo": null, "duration": "00:45", "room": "PyData Track 1", "slug": "conference2021-385-causality-in-python", "title": "Causality in Python", "subtitle": "", "track": "PyData", "type": "Talk (long - limited number of slots)", "language": "en", "abstract": "Most data scientists are focused on predictive (aka supervised) projects, yet the real growth is usually in the estimation of action effects and optimizations of action policies. 
To this end, I will present causal inference and related packages.", "description": "There are three layers of analytics: descriptive (BI), predictive (supervised modeling), and prescriptive. The latter, the least known of the three, focuses on answering the most important business questions, for example \"what was the effect of giving a discount\" or \"what should I do to create the desired effect\". In this talk, we will first discuss which frameworks are used to answer these questions, namely causal inference (CI) and reinforcement learning. Then we will deep dive into CI and discuss, in a causality 101 crash course, why it is important. Last but not least, we will present existing causal-inference open-source packages and their limitations.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "06cff194-232c-5ec9-9ae3-a5bda2a6c874", "id": 308, "code": "9QXZYG", "public_name": "Hanan Shteingart", "avatar": "https://cfp.pycon.org.il/media/hanan_shteingart.jpeg", "biography": "Hanan is a data scientist at Vianai Systems where he develops methods to optimize business outcomes by using ML (causal inference, bandits, and RL). He is an alumnus of successful startups such as BioCatch.com and Gong.io where he showed proof of concept and built the data science teams from scratch.\r\nHe is also an alumnus of corporations such as Microsoft where he was a senior data scientist. During his army service, Hanan was a signal processing and digital communication team leader (IDF).\r\nHanan holds a Ph.D. in computational neuroscience (Hebrew University), specializing in computational modeling of behavior and neural activity. He also holds a B.Sc. [cum laude] in Physics, B.Sc. [summa cum laude] and M.Sc. 
[cum laude] in Electrical Engineering (Tel Aviv University)", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/KYXN9N/", "id": 389, "guid": "f6eb886e-a93e-5764-8cc0-a37b4390243d", "date": "2021-05-03T13:30:00+03:00", "start": "13:30", "logo": null, "duration": "00:25", "room": "PyData Track 1", "slug": "conference2021-389-natural-language-grounding-the-next-frontier", "title": "Natural language grounding - the next frontier", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "We have all heard about huge transformers that cost millions of dollars to train and achieve amazing results. But is there still room for the little guy, with a single GPU and a small budget, to innovate in NLP?\r\nWell, have you heard about grounding?", "description": "We have all heard about huge transformers (e.g. GPT-3, DALL-E, etc.) that cost millions of dollars to train and achieve amazing results. But is there still room for the little guy, with a single GPU and a small budget, to innovate in NLP?\r\n\r\nIn this talk we will describe the natural language grounding technique, which takes world context into account and has achieved impressive results.\r\n\r\nWe will demonstrate how instruction parsing can be done more efficiently with a grounded representation.\r\nWe will also discuss the similarities with pragmatics (in linguistics).", "recording_license": "", "do_not_record": false, "persons": [{"guid": "e4d016a7-f820-5a27-8c8e-018bd02e0947", "id": 310, "code": "PLBCZU", "public_name": "Uri Goren", "avatar": "https://cfp.pycon.org.il/media/uri300.png", "biography": "Natural language expert, founder of argmax.ml", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/USMSXH/", "id": 436, "guid": "03f062f8-6aba-511e-9581-ca88123595d7", "date": "2021-05-03T14:00:00+03:00", "start": "14:00", "logo": null, "duration": "00:25", 
"room": "PyData Track 1", "slug": "conference2021-436-prepare-for-the-unknown-adjust-your-model-to-label-distribution-shifts", "title": "Prepare for the Unknown - Adjust Your Model to Label Distribution Shifts", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "Label distribution shift is a significant \u2018unknown' our models might encounter when facing the real world once they are deployed. In this talk I will provide practical approaches to assist our models to be more robust to such 'unknowns'.", "description": "If someone would have told me a year ago that we'd be wearing masks when walking outside and that my daughter's longest time off kindergarten won't be two weeks at August - I'd never believe it! But that's life - things change rapidly, and previously made assumptions might not remain valid.\r\nMany of us, Data Scientists, find ourselves working hard to train a model, deploy it to live environment and then realize the real world does not behave as we expected it. Our model crashes upon a reality that is much different than what it is familiar with. The root cause for this gap is the unexpected changes that impact our domain's population.\r\nIn this talk I will focus on a specific type of 'unknown' change - a shift in the label distribution. I will not only present how your model can be more agnostic to 'unknown' changes, but also\u00a0provide practical approaches you can apply to your model.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "ebf30c82-c5c3-5a71-8fa3-c6b427b48655", "id": 339, "code": "R9KNWW", "public_name": "Nofar Betzalel", "avatar": "https://cfp.pycon.org.il/media/linkedIn_profile_pic.jpeg", "biography": "Nofar is a Principal Data Scientist at PayPal. She develops fraud detection models that are being used in production to make real-time decisions that affect millions of PayPal users daily. 
She leverages PayPal\u2019s massive amounts of data, in the highly imbalanced fraud domain, to learn user behaviour and make sure PayPal is always ahead of its fraudsters. Nofar also co-hosts PayPal's internal and global Data Science Podcast.\r\nNofar Holds an M.Sc in Information Systems Engineering with a focus on Machine Learning from Ben-Gurion University, where she researched the field of Proactive Recommender Systems.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/A7SVUX/", "id": 468, "guid": "48d6371e-504c-5f05-acf3-df570c523138", "date": "2021-05-03T14:30:00+03:00", "start": "14:30", "logo": null, "duration": "00:25", "room": "PyData Track 1", "slug": "conference2021-468-opening-the-black-box-an-interpretable-neural-network-architecture", "title": "Opening the black box \u2013 an interpretable neural network architecture", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "Neural networks don\u2019t have to be black boxes, if you use creative designs and match the architecture to your specific needs, you can create a network as interpretable as linear regression, but without its linear constraints.", "description": "Many researchers use fully connected neural networks as a simple go-to model, without trying to match the architecture to the problem at hand. However, thanks to high-level open-source libraries such as pytorch, anyone can construct their own neural network architecture, to fit the requirements of a specific dataset. 
By creating a logical architecture, which models the generation process of our data, we achieve two goals:\r\n1.\tBetter accuracy on both train and test \u2013 since the model generalizes better.\r\n2.\tInterpretability \u2013 we can assign coefficients to different parts of the model, in a similar way to linear regression models, while allowing great flexibility in the actual model.\r\nInterpretability is important as it can help us understand the limitations and failings of our model, and engineer a better model, or collect more features, to improve on these areas.\r\nWe will examine a few examples of the limitations of simple fully connected neural networks, as well as other ML algorithms, and see how we can overcome these using architecture concepts anyone can implement in a few minutes using pytorch.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "f4f5da81-b734-50bd-847e-c4a34aeff661", "id": 245, "code": "YUMT7A", "public_name": "Oren Matar", "avatar": null, "biography": "A senior data scientist with interest in Bayesian methods, novel NN architectures and run-time optimization tricks in python. Specializing in time series forecasting particularly in the field of supply chain forecasting.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/KQNXWG/", "id": 421, "guid": "aaf9f4f5-70c4-580d-8383-9c7e5d844946", "date": "2021-05-03T15:30:00+03:00", "start": "15:30", "logo": null, "duration": "00:25", "room": "PyData Track 1", "slug": "conference2021-421-automatic-curation-of-test-sets", "title": "Automatic Curation of Test sets", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "Test sets are often designed to have a specific composition of cases, with constraints applied to each sub-population. 
Treating test-set curation as an optimization problem could save precious time and transition us towards a \"data as code\" paradigm.", "description": "Test set preparation is an essential part of any data science project. It is often the case that the test set is not just a random choice of samples, but rather a carefully designed population, with specific limits on the number of cases from each important sub-group. As the constraints get complicated, it often takes a while to get them all just-right.\u00a0 In this talk I'll show how to treat the test-set curation as a constraint-optimization problem that can be automatically solved using linear programming. I will demonstrate an open-source\u00a0python library, *curation-magic*, which elegantly does this for you, and argue that treating test-sets as an outcome of such optimization is a desired\u00a0transition towards a \"data as code\" paradigm.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "e9269b34-a131-5fb3-8891-eecb2f6f2380", "id": 328, "code": "3CSXQD", "public_name": "Jonathan Laserson", "avatar": null, "biography": "Dr. Jonathan Laserson is a machine learning expert and consultant, and the lead AI strategist of Zebra Medical Vision.  He did his PhD in the AI lab of Stanford University and his undergraduate studies at the Technion.  
He built ML systems for Google and IBM research, and at Zebra Medical led the development of clinical AI products from the idea stage to FDA-approval and production.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/DSB3HM/", "id": 498, "guid": "0898b074-a655-55e8-8145-9790d4dcd08e", "date": "2021-05-03T16:00:00+03:00", "start": "16:00", "logo": null, "duration": "00:45", "room": "PyData Track 1", "slug": "conference2021-498--clippy-for-python-let-s-build-a-real-time-code-companion-by-hooking-over-any-function-", "title": "\"Clippy\" for Python - Let's build a real-time code companion by hooking over any function.", "subtitle": "", "track": "PyData", "type": "Talk (long - limited number of slots)", "language": "en", "abstract": "Python can do so much, including using python to change python behavior. In this talk, we will see how we can hook over any function in order to create an online \u201chelper\u201d in the style of Clippy for the old Office software. This can be useful for refe", "description": "This talk stems from the package I've built [dovpanda](https://github.com/dovpanda-dev/dovpanda). dovpanda is an overlay companion for working with pandas in an analysis environment - it hooks over any pandas method and suggests better ways to code. We use `sys.modules` to replace the original function with a modified version while keeping track of the originals using `contextmanager`s. We then use `inspect` to understand what parameters were sent by the user so we can employ them to the companion's benefit. Using `ast` the companion can also understand information about runtime such as checking whether the function call was used in an assignment or part of a complex statement.\r\nPython is so wonderful, as it lets you control Python itself. This really feels like superpowers. 
In this talk I hope to scratch the surface of a few examples for such superpowers.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "4f63d41f-6d32-5d88-8de0-a18e1ed4b65e", "id": 62, "code": "KATEPN", "public_name": "Dean Langsam", "avatar": "https://cfp.pycon.org.il/media/avatars/Cato_k7br6Xd.jpeg", "biography": "I am a data scientist at SentinelOne, a rapidly-growing cybersecurity AI company. I am interested in data science, machine learning, deep learning, Python scientific programming, data visualizations, and Bayesian modeling. Specifically, I am a pandas enthusiast, and maintain dovpanda - a pandas companion package that helps data scientists with writing better, more concise pandas code.\r\n\r\nCheck out [dovpanda](http://bit.ly/dovpanda)  \r\nCheck out my [other talks](https://deanla.com/pages/my-talks.html)", "answers": []}], "links": [], "attachments": [], "answers": []}], "PyData Track 2": [{"url": "https://cfp.pycon.org.il/conference2021/talk/JQAHJP/", "id": 497, "guid": "3656d963-f847-56a7-b876-b1979e97a06f", "date": "2021-05-03T10:30:00+03:00", "start": "10:30", "logo": null, "duration": "00:25", "room": "PyData Track 2", "slug": "conference2021-497-set-your-eda-on-autopilot", "title": "Set your EDA on Autopilot", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "This session will focus on one of the hottest topics of the past two years in the data science ecosystem - *Automated Exploratory Data Analysis*.", "description": "Recently Andrew Ng held a conference where his main claim was that we should be more data-centric in our research. 
He based his doctrine on various studies and examples that showed significant improvement in model performance once the researchers modified the data.\r\n\r\n\"If 80% of our work is data preparation, then ensuring data quality is the important work of a machine learning team.\"  \r\nAndrew Ng \r\n\r\nTo provide the model with strong foundations, we must explore and process the data professionally and meticulously. It can be a very long and exhausting process. To help you get through this part successfully, the new 'Automated EDA' field has emerged.\r\n\r\nIn the lecture, we will explore the field of automation in ML and how it corresponds with the variability of the projects. We will examine what can be automated in EDA and explore the latest features of two powerful open-source tools - Pandas Profiling and SweetViz.\r\n\r\nThe audience will receive a link to the slides and to a Colab notebook with examples for:\r\n- Exporting EDA report using Pandas profiling and SweetViz.\r\n- Exporting EDA report that compares two data sets.\r\n- Exporting EDA report that compares two categories.\r\n- FAQ", "recording_license": "", "do_not_record": false, "persons": [{"guid": "72f6e4ff-35f8-57c7-8e06-2816cc2c28b2", "id": 376, "code": "FCPNGL", "public_name": "Nir Barazida", "avatar": null, "biography": "Nir Barazida, 30, Tel Aviv, Israel.<br/>\r\nData Scientist and Developer Relations at DAGsHub.<br/>\r\nFormer Data Scientist at Walty specializing in ML.<br/>\r\nPublic speaker for the past 6 years. 
<br/>\r\nSpoke at events organized by Sheldon Adelson, Haim Saban, Ron Dermer, etc.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/BGYVKQ/", "id": 466, "guid": "48c6b573-c0f3-587d-bcce-38abad7e79cc", "date": "2021-05-03T11:30:00+03:00", "start": "11:30", "logo": null, "duration": "00:25", "room": "PyData Track 2", "slug": "conference2021-466-what-s-everyone-talking-about-discovering-topics-with-sentence-bert", "title": "What\u2019s Everyone Talking About? Discovering Topics with Sentence-BERT", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "Topic Modeling\u2019s objective is to understand and extract the hidden topics from large volumes of text. Using a technique based on Sentence-BERT, we were able to perform the extraction of meaningful topics, and present some evaluation approaches.", "description": "Topic modeling is an information retrieval technique for discovering meaningful and interpretable topics in a collection of documents. It allows us to learn something about a set of documents that is too big to read. \r\n\r\nIn this talk we will cover how we leverage Sentence-BERT using the NLP Python framework sentence-transformers. 
It provides an easy method for extracting high quality sentence embeddings in a computationally efficient manner, which lays the basis for our topic modeling algorithm.\r\n\r\nWe will also be addressing the inherent difficulty of evaluating topic models by introducing measuring metrics and visualizations that aid the process of analyzing complex results.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "7b04d22a-e95d-5762-ab26-8bf3e43cb804", "id": 357, "code": "9FL9KJ", "public_name": "Stav Shemesh", "avatar": "https://cfp.pycon.org.il/media/slack_profile.png", "biography": "I hold a\u00a0B.Sc in computer science from the Open University, and am currently working on my\u00a0M.Sc\u00a0in computer science at IDC Herzliya. For the past year and a half I have been working at Amenity Analytics\u00a0as a data scientist, building models to solve a wide range of problems in NLP.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/NT3P8T/", "id": 402, "guid": "8b3ccf65-199b-5389-92d5-a17a1f089948", "date": "2021-05-03T14:00:00+03:00", "start": "14:00", "logo": null, "duration": "00:25", "room": "PyData Track 2", "slug": "conference2021-402-string-comparison-in-real-life-challenges-and-various-ways-to-resolve-them", "title": "String Comparison In Real Life - Challenges and Various Ways to Resolve Them", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "Text analysis in real life can often yield unsatisfactory results due to typos, alternate phrasing, abbreviations and more. In this talk, we'll cover practical and efficient string comparison methods, as well as tackle some commonly faced issues.", "description": "A common problem faced by data analysts, data scientists, and many developers who need to analyze and compare data, is that texts are often similar, but not quite identical to one another. 
\r\nThis can result from the existence of multiple ways to say the same thing, typos and abbreviations, common yet unindicative words (such as \"the\") and punctuation, that can all skew the results.\r\n\r\nDuring this talk, I will walk you through several methods to compare inexact texts, using a few different libraries, cover the usages as well as advantages & disadvantages of each method, and tackle some commonly faced issues.\r\n\r\nBy the end of the talk, you should have a good basis to start comparing texts efficiently and elegantly in your code.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "88a81ae3-7fd7-5ea7-adda-c2fcd3f4f4ae", "id": 319, "code": "VMTA8E", "public_name": "Naomi Kriger", "avatar": "https://cfp.pycon.org.il/media/profile_2.jpg", "biography": "I'm a Software Developer with previous experience in Risk & Data Analysis, working in a FinTech company.\r\nI'm also a tech blogger at naomikriger.medium.com and an 8200 alumna.\r\nI love programming, data, and everything in between. I also love foreign languages and chocolate.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/CS78M8/", "id": 428, "guid": "e30da793-76c7-5e09-91e2-85465e189839", "date": "2021-05-03T15:30:00+03:00", "start": "15:30", "logo": null, "duration": "00:25", "room": "PyData Track 2", "slug": "conference2021-428-enabling-super-fast-ds-research-using-automl", "title": "Enabling Super Fast DS Research using AutoML", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "AutoML is a python driven tool we built in Outbrain Recommendations group. In this talk we'll share motivation for creating this tool, describe the general architecture and do a live short demo.", "description": "Recently Outbrain CTR prediction system was heavily reworked. 
In this talk, we will share our key enabler in this journey, a Python-based AutoML engine which allows data scientists to perform faster offline research iterations. This tool is a robust and highly parallel search engine built solely in Python. In this talk we'll share the motivation for building this tool, go through the general architecture and showcase some of its capabilities in a live demo.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "f2e2b8f8-4b5f-57a9-a425-b1bdc9056b38", "id": 332, "code": "UZ9VHA", "public_name": "Assaf Klein", "avatar": "https://cfp.pycon.org.il/media/assaf_face_KoH8lbd.jpg", "biography": "Software Engineer, Data Scientist, Cyclist. Yield Optimization Manager at Outbrain", "answers": []}, {"guid": "830c91ea-1773-5b02-b380-feb65c2bf4ba", "id": 333, "code": "KGRNKR", "public_name": "Hila Weisman-Zohar", "avatar": "https://cfp.pycon.org.il/media/pic.jpeg", "biography": "For the past decade Hila has been processing, analyzing and generating algorithms. After earning her masters (summa cum laude) at BIU NLP and publishing at elite academic venues such as EMNLP,  she began to research & develop algorithms that analyze call center calls as a senior researcher at NICE. During that time she published 4 US patents and academic posters at various venues. For the past 1.5 years she has been working as a senior data scientist at Outbrain where she works on large-scale super-fast algorithms for the native ads field. 
Hila also loves to teach and share her experience and has talked at various meetups and conferences.", "answers": []}], "links": [], "attachments": [], "answers": []}, {"url": "https://cfp.pycon.org.il/conference2021/talk/C9WXDC/", "id": 522, "guid": "811adbdf-105c-5e2c-98a4-76d908b15135", "date": "2021-05-03T16:00:00+03:00", "start": "16:00", "logo": null, "duration": "00:25", "room": "PyData Track 2", "slug": "conference2021-522-reprogramming-immunity-with-ai-and-single-cell-multiomics", "title": "Reprogramming immunity with AI and single-cell multiomics", "subtitle": "", "track": "PyData", "type": "Talk (regular)", "language": "en", "abstract": "Immunai has built one of the largest centralized immune single-cell data assets in the world and is using AI with it to expand the boundary of our understanding of core immune biology and how it translates to the clinical setting.", "description": "Our ability to interrogate and decipher the immune system has dramatically improved over the last 5 years with major advances in single-cell multiomic technology, both in the wet lab and in silico. Immunai has built one of the largest centralized immune single-cell data assets in the world and is using it to expand the boundary of our understanding of core immune biology and how it translates to the clinical setting. But this massive data asset offers a unique challenge in how to understand individual cell types, patients, diseases and treatments in the context of all the others. 
Immunai tackles this problem with cutting-edge artificial intelligence coupled tightly with our functional genomics platform, which together identify core biological mechanisms that enable us to develop the next generation of immune system therapeutics.", "recording_license": "", "do_not_record": false, "persons": [{"guid": "ad543f77-e456-54eb-80ed-5b5a7db374ec", "id": 397, "code": "EX99VY", "public_name": "Drausin Wulsin", "avatar": "https://cfp.pycon.org.il/media/headshot.DrausinWulsin.2019.jpg", "biography": "Drausin Wulsin is the ML lead at Immunai. He holds a PhD in Bioengineering from the University of Pennsylvania and has spent the last 8 years in industry building data, software, & ML teams in high-growth tech companies.", "answers": []}], "links": [], "attachments": [], "answers": []}]}}]}}}