Language: Hebrew
09-09, 15:00–15:20 (Asia/Jerusalem), Hall 7
This talk will show how to set up a private LLM + RAG system using Python in an air-gapped environment. We’ll cover choosing efficient open-source models, setting up local vector databases, and optimizing retrieval in resource-limited environments.
When our team wanted to use LLMs with RAG, we quickly hit a wall—sending sensitive data to the cloud wasn’t an option. Whether it's business secrets, medical records, or legal documents, some data simply can’t leave a secure network. So, we had to build our own private AI pipeline.
In this talk, I’ll share how we set up a fully private LLM + RAG system using Python. We’ll dive into choosing efficient open-source models, setting up local vector databases, and making retrieval work in a resource-limited environment. Along the way, we’ll discuss trade-offs, optimizations, and how to squeeze the most out of smaller models without sacrificing too much intelligence.
By the end, you’ll have a clear road map for building your own secure AI pipeline—no cloud required!
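To give a flavor of the retrieval step the talk covers, here is a minimal, dependency-free sketch of a local vector store. It uses a toy bag-of-words "embedding" and cosine similarity purely for illustration; a real air-gapped pipeline would swap in a locally hosted embedding model and vector database, and feed the retrieved context to a local LLM.

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real pipeline would use a
    # locally hosted embedding model instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class LocalVectorStore:
    """Minimal in-memory stand-in for a local vector database."""
    def __init__(self):
        self.docs = []

    def add(self, text: str):
        self.docs.append((text, embed(text)))

    def retrieve(self, query: str, k: int = 2):
        # Rank stored documents by similarity to the query embedding.
        q = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(q, d[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

store = LocalVectorStore()
store.add("Patient records must stay inside the hospital network.")
store.add("Our legal contracts are stored on the internal file server.")
store.add("The cafeteria serves lunch at noon.")

# Retrieved context is prepended to the prompt for a local LLM.
context = store.retrieve("Where are medical records kept?")
prompt = "Answer using only this context:\n" + "\n".join(context)
print(prompt)
```

Nothing here leaves the machine: documents, embeddings, and the prompt all stay in local memory, which is the core property the full pipeline preserves at scale.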
Intermediate level
Target audience – Developers, Data Scientists, DevOps
Yaacov is a software engineer at Red Hat. He is a long-time contributor to free software projects, a volunteer at the Nitzanim project, and likes cats.