Language: English
09-09, 11:00–11:20 (Asia/Jerusalem), Hall 7
Learn how to fine-tune small language models efficiently using modern Python tools like Axolotl. A practical, GPU-conscious guide to customizing LLMs with QLoRA, chat templates, dataset chunking, and cloud-friendly workflows.
This 20-minute lightning talk walks through a real-world fine-tuning pipeline built entirely in Python. You'll learn how to structure and run scalable fine-tuning jobs, even on limited hardware like Colab or cloud GPU services like RunPod. Topics include:
• Why full fine-tuning is dead: a quick look at parameter-efficient approaches (like QLoRA)
• How Axolotl simplifies model loading, LoRA injection, and dataset prep
• Managing training across large datasets using chunked fine-tuning
• Moving beyond Colab: when and how to scale to multi-GPU training with DeepSpeed
• Performing inference on your fine-tuned model with minimal setup
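Two of the ideas above, chat templates and chunked fine-tuning, can be sketched in plain Python. The template and function names below are illustrative toys, not Axolotl's actual API: the point is just that each conversation is rendered into one training string, and the dataset is split into fixed-size chunks so each fine-tuning run fits a limited GPU budget.

```python
# Hypothetical sketch: chat-template formatting + dataset chunking.
# The template and helper names are illustrative, not Axolotl's API.

def apply_chat_template(messages):
    """Render a conversation into a single training string (toy template)."""
    parts = [f"<|{m['role']}|>\n{m['content']}" for m in messages]
    return "\n".join(parts) + "\n<|end|>"

def chunk_dataset(examples, chunk_size):
    """Yield fixed-size chunks so each fine-tuning run stays small."""
    for i in range(0, len(examples), chunk_size):
        yield examples[i:i + chunk_size]

# A toy dataset of 10 user/assistant conversations.
dataset = [
    [{"role": "user", "content": f"question {i}"},
     {"role": "assistant", "content": f"answer {i}"}]
    for i in range(10)
]

formatted = [apply_chat_template(conv) for conv in dataset]
chunks = list(chunk_dataset(formatted, chunk_size=4))
print(len(chunks))  # 3 chunks, of sizes 4, 4, and 2
```

In a real pipeline, each chunk would be handed to a separate (Q)LoRA training run, resuming from the previous adapter checkpoint, so a dataset too large for one Colab session can still be covered end to end.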
No prior ML experience needed — just some Python familiarity and curiosity about LLMs.
Level: Basic
Target audience: Developers, Data Scientists
With 20+ years in data, ML, and GenAI, I blend academic research with real-world innovation. After a PhD focused on early GenAI work, I led GenAI initiatives at Datomize and now build tailored Small Language Models at Datawizz. I'm a founder passionate about AI for good.