As AI agents evolve from stateless prompt-response tools into stateful, long-running systems, context, not just compute, becomes the true bottleneck. Yet most architectures today treat context retrieval as an afterthought, bolting vector stores onto LLMs and hoping for the best. The result: brittle pipelines, runaway costs, and hallucinations born from memory mismanagement.
In this talk, we’ll explore a new approach: stream-native context engineering, powered by Apache Kafka and Apache Flink. By treating context as data in motion, continuously enriched, windowed, compacted, and served with low latency, we can build memory layers that scale with our agents and evolve with their understanding. We’ll dive into how stream processing primitives (state backends, RocksDB tuning, checkpoint strategies) can be repurposed for AI memory orchestration, and how to design architectures that separate ephemeral context from durable knowledge.
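To make the ephemeral-versus-durable split concrete, here is a minimal, hypothetical sketch in plain Python: a sliding-window buffer holds recent agent context, and overflow is compacted into a durable key-value store that keeps only the latest value per key, loosely analogous to Kafka log compaction. The class name, thresholds, and methods are illustrative assumptions, not an actual Kafka or Flink API.

```python
import time
from collections import deque


class ContextWindow:
    """Illustrative sketch (not a real Kafka/Flink API): ephemeral context
    lives in a bounded sliding window; overflow is compacted into a
    durable store that retains the latest value per key."""

    def __init__(self, max_age_s=300, max_items=50):
        self.max_age_s = max_age_s
        self.max_items = max_items
        self.buffer = deque()   # ephemeral: (timestamp, key, value)
        self.durable = {}       # durable: compacted key -> latest value

    def append(self, key, value, ts=None):
        ts = time.time() if ts is None else ts
        self.buffer.append((ts, key, value))
        self._evict(now=ts)

    def _evict(self, now):
        # Time-based eviction mirrors a sliding event-time window:
        # entries older than max_age_s simply fall out of context.
        while self.buffer and now - self.buffer[0][0] > self.max_age_s:
            self.buffer.popleft()
        # Size-based eviction compacts overflow into the durable store,
        # keeping only the newest value per key (log-compaction style).
        while len(self.buffer) > self.max_items:
            _, key, value = self.buffer.popleft()
            self.durable[key] = value

    def retrieve(self, key):
        # Prefer fresh ephemeral context; fall back to durable knowledge.
        for _, k, v in reversed(self.buffer):
            if k == key:
                return v
        return self.durable.get(key)
```

In a real deployment the window would be Flink keyed state backed by RocksDB and the durable store a compacted Kafka topic, but the retrieval order (fresh context first, compacted knowledge as fallback) is the design point this sketch is meant to show.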
You’ll walk away with a practical blueprint for building context-aware AI systems, from ingestion to retrieval, and see why the next frontier of agentic intelligence won’t be decided in the model weights, but in the context pipelines that feed them.
Speaker
Adi Polak
Director, Advocacy and Developer Experience Engineering @Confluent, Author of "Scaling Machine Learning with Spark" and "High Performance Spark, 2nd Edition"
Adi is an experienced software engineer and people manager who has worked with data and machine learning for operations and analytics for over a decade. As a data practitioner, she has developed machine learning algorithms to solve real-world problems, drawing on her expertise in large-scale distributed systems to build machine learning and data streaming pipelines. As a manager, Adi builds high-performance teams focused on trust, excellence, and ownership.
Adi has taught thousands of students how to scale machine learning systems and is the author of Scaling Machine Learning with Spark and High Performance Spark, 2nd Edition.