Episodes
-
Dylan Patel, founder of SemiAnalysis, argues the biggest gains in AI come not from faster chips but from software-hardware co-design. Optimizing the model, the kernels, and the silicon together turns a 2x here and a 2x there into 100x. He explains why DeepSeek's experts were shaped for Nvidia's Hopper (and why TPUs struggle to run it), why OpenAI's sparser models and Anthropic's denser ones pull them toward different hardware, and why the so-called CUDA moat was never really about CUDA. Dylan breaks down InferenceX, his living benchmark that runs the latest models on over $50M of donated hardware daily, tracking a roughly 60x annual drop in cost per unit of quality. He makes the case that inference will be a bigger market than oil and that the compute crunch persists because models expand the value of useful work faster than compute grows, and he explains why Jensen Huang is bankrolling neoclouds to engineer a multipolar world.
Hosted by Shaun Maguire and Sonya Huang, Sequoia Capital -
Dan Biderman and Jessy Lin, co-founders of Engram, are building a neolab around memory and continual learning, which they call two sides of the same coin. Their contrarian premise: instead of stuffing ever-larger prompts into the context window or bolting on RAG, bake a team's knowledge directly into the model's weights, so it knows your company the way an employee of several years does.
The payoff: matching or beating frontier models while consuming up to 100x fewer tokens. Working with partners like Microsoft, Notion, and Harvey, the team draws on roots in computational neuroscience and state-space architectures to attack what they see as the real bottleneck in AI — not raw intelligence, but memory and continual learning. In contrast to the frontier labs' race toward one ever-bigger model and AGI, Dan and Jessy imagine a world where everyone has their own model — privately trained, always learning, and good at the things you actually care about. The real ChatGPT moment for memory, they argue, is the day your model feels like an intern that genuinely got smarter overnight.
Hosted by Sonya Huang and Shaun Maguire, Sequoia Capital -
The race to build superintelligence is producing models that keep getting better at objective problems, but not at behaving like actual people. Joon Sung Park, founder and CEO of Simile and creator of Stanford's "Smallville" generative agents study, argues that simulating human society requires a fundamentally different kind of model. He frames today's frontier models as the "CPU of intelligence"—rational, superhuman at problems with right answers—and Simile as creating the "GPU of intelligence," built to encode the diversity of people's values, preferences, and tastes. It simulated 1,000 Americans and predicted their behavior 85% as accurately as people reproduce their own answers. CVS uses it for concept testing; some customers simulate their own earnings calls. Joon's larger bet: a "CERN of human society" that could one day model bank runs, climate cooperation, or the early signals of a collapsing democracy.
Hosted by Sonya Huang, Sequoia Capital -
The entire startup ecosystem is racing to build agent harnesses. Logan Kilpatrick, who leads Google AI Studio and the Gemini API, argues that scramble has a roughly 12-month shelf life. Models will absorb the scaffolding and run it natively, so the edge moves elsewhere. Google's own bet runs in parallel: a single agent harness, born from the Windsurf team and now called Antigravity, has become the connective tissue across search, the Gemini app, Cloud, and AI Studio — the role Gemini-the-model used to play. Logan makes the case that coding already feels like narrow superintelligence, and that "jagged" vertical superintelligence (in math, finance, and science) will arrive well before AGI. He argues Google's real goal is maximizing outcomes for users, not eyeball time. He unpacks Omni, the single model built to replace multiple separate systems Google once trained for text, audio, music, image, and video. His throughline: AI is an accelerant for human ambition, not a substitute for it.
Hosted by Sonya Huang, Sequoia Capital -
Jensen Huang, founder and CEO of NVIDIA, makes the case that computing is undergoing its biggest shift in 60 years: from retrieval, where data centers store files we look up, to generation, where every word, image, and video is produced in real time and customized for whoever is asking. He explains why NVIDIA's AI factories are the dynamos of this era: machines that take in electrons and send out tokens of intelligence, just as Siemens' dynamo once turned motion into electricity. Jensen frames intelligence as the third force to "cocoon" the planet after electricity and the internet. He describes the five-layer cake of AI investment—energy, chips, infrastructure, models, applications—and dismantles the fear that AI will erase jobs, using radiology and software engineering to show how automation raised labor demand instead of killing it. His bottom line: you won't lose your job to AI, but you might lose it to someone who uses AI.
Hosted by Konstantine Buhler, Sequoia Capital -
Alfred Wahlforss, co-founder and CEO of Listen Labs, is building an AI agent that interviews your customers at a scale no focus group ever could: thousands of voice conversations at once, drawn from an audience of 30 million people. A year after launch, Listen serves hundreds of companies, from Fortune 100s to startups, including Microsoft, Google, NBC Universal, P&G, Anthropic, Cursor, and Cognition. Alfred explains the counterintuitive finding underneath it all: people are often more honest with an AI than with a human interviewer, opening up to a non-judgmental entity that costs less and never makes them feel rushed. He walks through why interview transcripts, not credit card data or behavioral logs, turn out to be the richest fuel for predicting how customers will behave, how Listen back-tests its simulations to know which questions it can and can't answer, and why 80% of the company's engineering goes into building the right audience. As AGI makes building trivial, Alfred argues the scarce resource becomes knowing what to build. That's the loop Listen wants to own.
-
Cursor's Federico Cassano and Fireworks' Dmytro Dzhulgakov explain how they collaborated to build Composer as a specialized foundation model. The core insight: models have finite capacity in their weights, and allocating all those bits to the singular task of software engineering in Cursor frees the model to be both better at the task and far more efficient at inference. Rather than start from pre-training and work up, they took an unconventional top-down approach — mid-training and RL on top of an open-source base to get a useful model into users' hands fast, then specializing the model around real Cursor usage. With Fireworks providing distributed infrastructure, Composer delivers frontier-class coding performance with the speed of a much smaller model.
Hosted by Sonya Huang, Sequoia Capital -
Jake Stauch, founder and CEO of Serval, is building a ServiceNow for the AI era. His most contrarian bet is that the product should look like boring old enterprise software, but with unlimited intelligence. Serval's architecture splits work between two agents: an admin agent that uses code generation to spin up workflows from natural language, and a help desk agent that can only act through the tools admins explicitly approve. Jake explains why his team uses OpenAI models for end-user interaction and Anthropic models for code generation, why new model releases sometimes have to be rolled back when prompt tuning breaks, and why he's not worried the foundation labs will come downmarket. He also makes the case for "fewer, better" hiring as the only durable moat in a world where products may need to be rebuilt every six months.
Hosted by Pat Grady, Sequoia Capital -
Most music platforms assume you're a listener. On Suno, 90% of daily users make something. Founder and CEO Mikey Shulman explains why that flips the model: the act of creating IS the entertainment, with closer parallels to gaming and Claude Code than to Spotify. He breaks down the technical bets that got them here — modeling raw sound waves instead of encoding music theory, choosing autoregression over diffusion to prioritize full songs over crisp clips, and why music isn't a scale problem the way LLMs are. He also shares why partnering with Warner matters more than disrupting the record labels, what a truly interactive Coachella might look like, and why he thinks the digital music experience is finally due for its first real change in 25 years.
Hosted by Sonya Huang, Sequoia Capital -
Mati Staniszewski, co-founder and CEO of ElevenLabs, joins Sequoia partner Andrew Reed at AI Ascent 2026 to talk about how a four-year-old company built a frontier audio AI business with just over 400 people and over $400M in revenue. He explains why audio was overlooked in 2022 when the rest of AI was chasing text and images, why ElevenLabs chose to monetize from day one rather than raise indefinitely, and why he believes voice will be the primary interface for agents, robots, and the next generation of computing. Also: why emotional intelligence is the next frontier in voice, and what happens when one voice agent realizes it's talking to another.
-
Boris Cherny, creator of Claude Code at Anthropic, joins Sequoia partner Lauren Reeder at AI Ascent 2026 to talk about where coding goes from here. He explains why he hasn't written a line of code in 2026, why he now ships dozens of PRs a day from his phone, and why he believes coding is effectively solved — at least for the code he writes. Also: why loops are the future, why he thinks Claude Code itself may be 100 lines of code a year from now, and why the invention of the printing press is the right analogy for what's about to happen to software.
-
Dmitri Dolgov, co-CEO of Waymo, joins Sequoia partner Konstantine Buhler at AI Ascent 2026 to talk about the 20-year arc from the DARPA Grand Challenge to fully autonomous service in eleven cities and counting. He explains how Waymo persisted through every AV hype cycle by treating safety as the non-negotiable foundation, why exponential scaling is finally here (10 of Waymo's 20 million autonomous rides have happened in the last seven months), and how the Waymo Foundation Model — a multimodal world action model that powers the driver, the simulator, and the critic — actually works under the hood. Also: why Waymo is now 13x safer than human drivers, and the moment a Waymo detected a pedestrian behind a city bus by reading the LiDAR returns of their feet.
-
Greg Brockman, co-founder and president of OpenAI, joins Sequoia partner Alfred Lin at AI Ascent 2026 for a conversation that spans the full OpenAI stack. He explains why the company will never have enough compute, why he believes we're 80% of the way to AGI, and why the agentic coding tools that wrote 20% of your code last December are now writing 80% of it. Also: why human attention is becoming the scarcest resource in AI-augmented work, and what it might be like to one day run an organization of 100,000 agents.
-
Demis Hassabis, co-founder and CEO of Google DeepMind and 2024 Nobel laureate in chemistry for AlphaFold, joins Sequoia partner Konstantine Buhler at AI Ascent 2026 for a wide-ranging conversation about the path to AGI and what comes after. He explains why he believes AGI is achievable by 2030, why drug discovery could collapse from ten years to days, and why we should think of information, not matter or energy, as the most fundamental substance in the universe. Also: what Einstein would tell us about the limits of today's models, and why the next year or two will be critical for humanity.
-
Andrej Karpathy (co-founder of OpenAI, former head of AI at Tesla, and now founder of Eureka Labs) talks with Sequoia partner Stephanie Zhan at AI Ascent 2026 about what's changed in the year since he coined "vibe coding." He explains why he's never felt more behind as a programmer, why agentic engineering is the more serious discipline taking shape on top of vibe coding, and why we should think of LLMs not as animals but as ghosts: jagged, statistical, summoned entities that require a new kind of taste and judgment to direct. He also touches on Software 3.0, the limits of verifiability, and why you can outsource your thinking but never your understanding.
-
James Cadwallader, co-founder and CEO of Profound, makes the case that we are living through the biggest platform shift in marketing history. The front door of the internet hasn't changed, but the visitor walking through it has. Where consumers once clicked blue links, AI agents now crawl the web on their behalf, synthesizing answers and steering purchase decisions at scale.
James explains why Gemini, ChatGPT, and Claude all recommend brands differently, why mapping AI visibility onto traditional SEO is the wrong instinct, and why the real imperative is to equip a superintelligent agent with original insight it couldn't find anywhere else. He also digs into the dead internet theory – the possibility that human browsing could largely cease within three years – how AI advertising may become the most effective form the world has ever seen, and why agent-led marketing isn't just automation of the old work, but an entirely new capability.
Hosted by Sonya Huang, Sequoia Capital -
Jason Kelly founded Ginkgo Bioworks in 2008 with a simple but radical idea: DNA is code, and cells are programmable. Sixteen years later, AI is finally making that vision real in ways that could reshape science itself. Jason describes a landmark collaboration with OpenAI in which a reasoning model with access to a robotic lab beat the state of the art in biochemistry by 40% - not by being smarter than scientists, but by running experiments 24 hours a day and sharing data across a hundred parallel hypotheses simultaneously. He argues that the biggest inefficiency in science isn't intelligence, it's manual labor. Once AI helps scale research, the cost of discovery collapses and breakthroughs follow, with profound implications for biopharma, national competitiveness, and human health.
Hosted by Sonya Huang and Pat Grady, Sequoia Capital -
Philip Johnston, founder and CEO of Starcloud, explains why space will become the primary location for AI compute infrastructure within the next decade. After witnessing SpaceX's massive manufacturing scale at Starbase, Philip realized that declining launch costs would make space-based data centers cheaper than terrestrial ones. He breaks down the physics of heat dissipation in vacuum, the economics of solar power without atmosphere, and why the marginal cost of space infrastructure decreases while Earth-based costs increase. Philip previews a future where close to a trillion dollars per year in CapEx flows to space compute. And, yes, we get his take on aliens.
Hosted by Sonya Huang and Pat Grady, Sequoia Capital -
Nominal’s cofounders (Cameron McCord, Jason Hoch and Bryce Strauss) realized that the new age of reindustrialization requires a new approach to hardware engineering and testing that’s closer to how software is developed. They founded Nominal with the insight that while SpaceX, Tesla, and Anduril built proprietary internal platforms for hardware testing, the thousands of new hardware entrants can't afford to replicate that work.
Nominal serves as the system of record for hardware testing, helping companies move from PDF-based workflows to modern data infrastructure that catalogs telemetry from sensors producing millions of data points per second.
The platform enables engineers to author validation logic that follows hardware systems from initial testing through manufacturing and field deployment. We discuss their belief that all hardware companies will become physical AI companies, and why they think Nominal's role as the verification layer will be critical - because unlike a video game, physical products require rigorous validation before they enter the real world.
Hosted by Alfred Lin and Sonya Huang, Sequoia Capital -
Will Brown and Johannes Hagemann of Prime Intellect discuss the shift from static prompting to "environment-based" AI development, and their Environments Hub, a platform designed to democratize frontier-level training.
The conversation highlights a major shift: AI progress is moving toward Recursive Language Models that manage their own context and agentic RL that scales through trial and error. Will and Johannes describe their vision for the future in which every company will become an AI research lab. By leveraging institutional knowledge as training data, businesses can build models with decades of experience that far outperform generic, off-the-shelf systems.
Hosted by Sonya Huang, Sequoia Capital