About us
Welcome to our AI Meetup! We are a passionate community dedicated to building and learning about artificial intelligence. Whether you're an expert or just starting out, join us to share knowledge, collaborate on projects, and explore the fascinating world of AI together.
We'll be getting different events off the ground, both locally (SF) and virtually.
AI book club is going again in 2024, so if you have recommendations for us to read, let us know!
We'll cover AI topics such as Machine Learning (ML), Large Language Models (LLMs), Deep Learning, Data Engineering, MLOps, Python, Computer Vision, Natural Language Processing (NLP), the latest AI developments, and more!
Questions? Reach out to Sage Elliott on LinkedIn: https://www.linkedin.com/in/sageelliott/
Upcoming events
3

LLM fine-tuning with GRPO
Online
Supervised fine-tuning teaches a model to imitate examples. Reinforcement fine-tuning teaches it to optimize an objective you define, which is how recent open models picked up reasoning, reliable tool use, and consistent output formats. GRPO (Group Relative Policy Optimization) is the method behind a lot of that work, and it is simpler and cheaper to run than the PPO setups that came before it.
GRPO drops the separate value model that makes PPO expensive. For each prompt, it samples a group of completions, scores them with a reward function, and uses the group's own spread to estimate which responses were better than average. You train on prompts plus a reward function instead of a large hand-labeled dataset, and verifiable rewards (did the answer match, did the output parse, did it hit the format) get you a long way without preference annotation.
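To make the group-relative idea concrete, here is a minimal sketch in plain Python. It is not the workshop code and does not use TRL. The reward function, the `<answer>` tag format, and the sample completions are all hypothetical, and the mean/standard-deviation normalization is the commonly described form of the GRPO advantage, assumed here for illustration.

```python
import re
import statistics


def answer_reward(completion: str, expected: str) -> float:
    """Hypothetical verifiable reward: checks format first, then correctness.

    1.0 -> answer is inside <answer> tags and matches the expected value
    0.2 -> output follows the format but the answer is wrong
    0.0 -> output does not follow the format at all
    """
    match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == expected else 0.2


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Score each completion relative to its own group.

    No value model is involved: the group's mean is the baseline and the
    group's standard deviation sets the scale.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# One prompt ("What is 2 + 2?") with a group of four sampled completions.
completions = [
    "<answer>4</answer>",
    "<answer>5</answer>",
    "the answer is 4",
    "<answer> 4 </answer>",
]
rewards = [answer_reward(c, "4") for c in completions]
advantages = group_relative_advantages(rewards)

for completion, reward, advantage in zip(completions, rewards, advantages):
    print(f"{reward:4.1f} {advantage:+.2f}  {completion}")
```

Completions that beat the group average get a positive advantage and are reinforced; the rest get a negative one. If every completion in a group earns the same reward, all advantages are zero and that prompt contributes no learning signal, which is one reason prompt and reward design matter.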
In this hands-on workshop, we'll fine-tune an open-weight LLM with GRPO using Hugging Face TRL, write reward functions that actually shape behavior, and deploy the result behind a simple UI. The whole pipeline runs on Flyte 2/Union, so data prep is cached, runs are reproducible and recoverable, and the same code scales from a laptop to a multi-node cluster without rewrites.
By the end, you'll have a working GRPO-trained model and a reusable RL pipeline you can point at your next task.
What we'll cover
- A practical intro to GRPO
- Writing reward functions
- Sandboxes for safe code execution during training
- Fine-tuning an open-weight LLM with Hugging Face TRL's GRPOTrainer
- Orchestrating with Flyte 2: cached data prep, GPU-aware training, and durable, reproducible runs at any scale
- Deploying the model with a UI, with a path to scaled inference
What you'll leave with
- An LLM fine-tuned with GRPO against a reward function you wrote
- A reusable RL training and deployment pipeline you can adapt to your own task
- The knowledge to design reward functions and prompt sets for future GRPO projects
Who it's for
ML engineers and practitioners who want to move past prompt engineering and supervised fine-tuning, and shape model behavior with reinforcement learning. Whether you're prototyping at work, evaluating infrastructure for a production use case, or building a portfolio project, you'll leave with code you can keep extending.
Hosted by Sage Elliott, AI Engineer at Union.ai
41 attendees
Building Code Mode Agents
Online
Most agents use tools by emitting one structured tool call at a time. The model picks a tool, waits for the result, picks the next, and every intermediate result flows back through its context. As the number of tools and steps grows, that gets expensive, slow, and brittle.
Code mode flips this. Instead of calling tools one at a time, the agent writes code that calls them as functions, runs that code in a sandbox, and returns only what matters back to the model. Why this works well in practice: models are trained on far more code than tool-call traces, so they compose operations more reliably in a language they already know. Loops, branching, filtering, and data transforms happen in the sandbox instead of as separate round-trips. Large intermediate results stay out of the context window, and tool definitions load on demand instead of all up front. The result is an agent that does more per step and spends fewer tokens doing it.
The obvious worry is letting a model write and run code. We'll handle that with Flyte's sandboxed orchestrator: the generated code can do pure control flow like loops, branching, and wiring tool calls together, but has no access to the filesystem, network, or OS. The actual tool or MCP calls get dispatched to isolated container tasks. You get the flexibility of letting a model write code without handing it the keys to the machine.
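The code mode pattern can be sketched in a few lines of plain Python. This is an illustration only, not Flyte's sandbox or its API: the tools, the canned data, and the `GENERATED_CODE` string (standing in for what a model might write) are all hypothetical. Restricting `exec` builtins as shown here is also not a real security boundary; it only mimics the idea that generated code gets control flow and tool functions but nothing else.

```python
def search_orders(customer_id: str) -> list[dict]:
    """Hypothetical tool: returns canned order data for any customer."""
    return [
        {"id": "A1", "total": 120.0, "status": "shipped"},
        {"id": "A2", "total": 35.5, "status": "cancelled"},
        {"id": "A3", "total": 80.0, "status": "shipped"},
    ]


def get_exchange_rate(currency: str) -> float:
    """Hypothetical tool: returns a fixed conversion rate."""
    return {"EUR": 0.9}.get(currency, 1.0)


TOOLS = {"search_orders": search_orders, "get_exchange_rate": get_exchange_rate}

# Stand-in for model output: one program that chains tools, filters, and
# aggregates, instead of three separate tool-call round-trips.
GENERATED_CODE = """
orders = search_orders("c-42")
shipped = [o for o in orders if o["status"] == "shipped"]
rate = get_exchange_rate("EUR")
result = {
    "shipped_count": len(shipped),
    "total_eur": round(sum(o["total"] for o in shipped) * rate, 2),
}
"""

# Only a handful of pure helpers are exposed. There is no __import__, open,
# or eval, so "import os" inside the generated code fails.
# NOTE: this is illustrative and NOT a secure sandbox.
ALLOWED_BUILTINS = {"len": len, "sum": sum, "round": round, "min": min, "max": max}


def run_generated_code(code: str, tools: dict) -> object:
    """Run generated code with tools as plain functions; return only `result`."""
    namespace = {"__builtins__": ALLOWED_BUILTINS, **tools}
    exec(code, namespace)
    return namespace.get("result")


summary = run_generated_code(GENERATED_CODE, TOOLS)
print(summary)
```

Only the small `summary` dictionary would go back into the model's context; the full order list never does. In a real deployment the execution step would run in a proper sandbox and each tool call would be dispatched to an isolated task, as described above.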
In this hands-on workshop, we'll build a code mode agent that writes and executes code to call tools, then deploy it behind a simple UI. The whole pipeline runs on Flyte 2/Union, so runs are durable and reproducible, steps are cached, and the same code scales from a laptop to a multi-node cluster without rewrites.
By the end, you'll have a working code mode agent and a reusable pattern you can point at your own tools.
What we'll cover
- Why direct tool-calling breaks down at scale, and what code mode changes
- Running model-generated code safely in Flyte's sandboxed orchestrator, with heavy work dispatched to isolated tasks
- Orchestrating with Flyte 2: cached steps, durable runs, and execution that scales
- Deploying the agent with a UI, with a path to production
What you'll leave with
- A working code mode agent that writes and runs code to use tools
- A reusable pattern you can adapt to your own tools and MCP servers
- A clear sense of when code mode beats direct tool-calling, and when it does not
Who it's for
ML and software engineers building agents who want a more efficient, more capable alternative to chaining tool calls. Whether you're prototyping at work, evaluating agent infrastructure, or building a portfolio project, you'll leave with code you can keep extending.
Hosted by Sage Elliott, AI Engineer at Union.ai
23 attendees
AI Book Club: RAG with Python Cookbook
Online
July's book is "RAG with Python Cookbook"!
This is a casual event, not a structured presentation. Sometimes the discussion drifts away from the chapters, so feel free to grab the mic and help steer it back.
Feel free to join the discussion even if you have not read the book chapters! :)
Want to discuss the contents during the reading week? Join the Flyte MLOps Slack group https://slack.flyte.org/
-------------------------------------------------
About the book:
Title: RAG with Python Cookbook
Author: Dominik Polzer
Published: May 2026
O'Reilly platform: https://learning.oreilly.com/library/view/rag-with-python/9798341600553/
Chapters:
- 1. Getting Started with RAG
- 2. Foundation Models
- 3. Loading Data
- 4. Data Preparation
- 5. Embeddings
- 6. Vector Databases and Similarity Searches
- 7. Retrieval
- 8. Agentic RAG
- 9. Graph RAG
- 10. Evaluating RAG Systems
- 11. RAG Web Apps
Book Description
As businesses race to unlock the full potential of large language models (LLMs), a critical challenge has emerged: How do you connect these tools to real-time, external data to solve real-world problems? Retrieval-augmented generation (RAG) is the answer. By combining LLMs with information retrieval, RAG empowers you to build everything from intelligent chatbots to autonomous, task-solving agents.
Packed with over 70 practical recipes, this go-to guide tackles a wide range of GenAI applications through structured hands-on learning. Author Dominik Polzer provides the tools you need to design, implement, and optimize RAG systems for your unique use cases. Whether you're working with simple data retrieval or designing cutting-edge autonomous agents, this cookbook will help you stay ahead of the curve.
- Learn core RAG components including embedding, retrieval, and generation techniques
- Understand advanced workflows like semantic-aware chunking and multi-query prompting
- Build custom solutions such as chatbots and autonomous agents for specific data challenges
- Continuously evaluate and optimize systems for accuracy, relevance, and performance
25 attendees
Past events
59


