Anthropic: Our AI just created a tool that can ‘automate all white collar work’, Me:

Episodes

Claude Fable 5 - Full 319 page Breakdown
10 June · AI Explained Official Podcast
Fable 5 is out - and it’s good, very good. But beyond the splashy demos, I want to bring you the 20+ nuggets from the 319 page system card, which I read in full, all day, plus benchmarks you may not have noticed.

https://assemblyai.com/aiexplained

Plus two worrying trends inside the ‘mind’ of Claude, how OpenAI counter, and the transformer inventor’s warning.

Check out my fast-growing (!) app, free to use, and code INSIDER15 for paid tiers: https://lmcouncil.ai

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
01:06 - Blocks + Better Models
02:42 - Fable 5 Upgrade over Mythos Preview
04:49 - ML Acceleration Bombshell
07:11 - No RSI yet
07:41 - Bio-capable
14:51 - Creative Writing … no
17:23 - Does need bug-checks
18:57 - OpenAI Response
19:23 - Benchmark Bonanza
28:06 - Chain of Thought worrying trend

Fable 5 Release: https://www.anthropic.com/news/claude-fable-5-mythos-5

System Card: https://www-cdn.anthropic.com/d00db56fa754a1b115b6dd7cb2e3c342ee809620.pdf

Intelligence Explosion: https://www.patreon.com/posts/anthropic-charts-160231656

Annotated: https://x.com/Miles_Brundage/status/2064500190523113816/photo/1

OpenAI Counter: https://x.com/thsottiaux/status/2064572118264913923
https://x.com/thsottiaux/status/2043177597434306699

Double Lifespan: https://darioamodei.com/essay/machines-of-loving-grace

AutomationBench: https://zapier.com/benchmarks
Vending Bench: https://x.com/andonlabs/status/2064429817530085804
CritPt: https://critpt.com/
Riemann Bench: https://surgehq.ai/leaderboards/riemann-bench
GDPVal: https://artificialanalysis.ai/evaluations/gdpval-aa
BluePrint Bench 2: https://andonlabs.com/evals/blueprint-bench-2
MCP Atlas: https://labs.scale.com/leaderboard/mcp_atlas
FutureSim: https://x.com/nikhilchandak29/status/2064676801440358774

Roon Stun Lock: https://x.com/tszzl/status/2064454617568874669

Noam Brown Inference Ceiling: https://x.com/polynoamial/status/2064210146558136827

Isochronic Chart: https://isochronic-passage-chart.netlify.app/#nyc
Rose Tavern: https://claude.ai/public/artifacts/2295bebe-77e6-43e2-ae94-0fe49e9a776b
Redwall Game: https://redwall-mossflower.surge.sh/

Risk Report: https://www-cdn.anthropic.com/097c63b5fe7dd8b14866e1f15bb1910ec713658a.pdf

Transformer Inventor Warning: https://x.com/tszzl/status/2064563986914554125

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/
New Claude - 244 page breakdown
29 May · AI Explained Official Podcast
The ‘best’ generally available AI model just dropped, but there is plenty I bet you missed about what it is, how it performs, and what the release tells us. 15 highlights from the 244 page system card, plus private testing, leader interview and more.

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
00:49 - Mythos in Weeks
01:49 - Adaptive not necessary
02:26 - Honesty?
04:37 - Flagging Uncertainty
04:57 - Benchmarks
08:54 - Mythos will be even better
10:30 - Business skillz
11:15 - Model Welfare
12:16 - Cyber Comparable
13:10 - Misalignment Concerns
16:22 - Meta Inabilities
17:58 - Code flagging
18:34 - Go to sleep
18:50 - Fast Mode
20:21 - Dynamic Workflows

Opus 4.8 Paper: https://cdn.sanity.io/files/4zrzovbb/website/c886650a2e96fc0925c805a1a7ca77314ccbf4a6.pdf

Release: https://www.anthropic.com/news/claude-opus-4-8

Chips: https://www.theinformation.com/articles/anthropic-talks-use-microsofts-ai-chips?rc=sy0ihq
https://www.anthropic.com/news/expanding-our-use-of-google-cloud-tpus-and-services
https://www.anthropic.com/news/higher-limits-spacex

Patreon Vid: https://www.patreon.com/posts/re-up-anthropics-159289449

GDPVal: https://artificialanalysis.ai/evaluations/omniscience
https://arxiv.org/abs/2510.04374

Amodei Technical Debt: https://www.youtube.com/watch?v=7xco5Qd2Oo8

Dynamic Workflows: https://x.com/ClaudeDevs/status/2060044853279617150
https://x.com/_catwu/status/2060054180379689074/photo/1
https://claude.com/blog/introducing-dynamic-workflows-in-claude-code

https://simple-bench.com/

Check out my fast-growing (!) app, free to use, and code INSIDER15 for paid tiers: https://lmcouncil.ai

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/
Two Rival Bets on AGI: Google I/O Highlights
20 May · AI Explained Official Podcast
The biggest Google AI push of the year, but what is the bigger story? Why is Google pursuing a different fork in the road than OpenAI or Anthropic? What does Gemini 3.5 Flash mean for the near-term future of AI?

https://assemblyai.com/aiexplained

Plus the highlights from a provocative new paper on AI, 8 key moments you may have missed, and the signal from 5+ hours of AI lab interviews.

Check out my free to use app, code INSIDER15 for paid tiers: https://lmcouncil.ai

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
00:38 - Vibes and Google Goal
02:18 - Omni, again?
06:57 - Taking the same road
07:44 - Gemini 3 Flash
12:37 - Pitching on Cost?
13:55 - Agentic Task Search
14:30 - 1-shot OS but jagged, negation paper
20:02 - The Karpathy Moonshot

Mostafa Dehghani Interview: https://www.youtube.com/watch?v=Bo19sXssYXI

Negation Neglect Paper: https://arxiv.org/pdf/2605.13829

Gemini 3.5 Flash Headline Scores: https://deepmind.google/models/model-cards/gemini-3-5-flash/

Sora original AGI Path: https://www.theguardian.com/commentisfree/2024/feb/24/openai-video-generation-tool-sora-babies-ai-artificial-intelligence

Hassabis Helped Set-up Anthropic: https://archive.fo/20260519070857/https://www.ft.com/content/8f2a529e-7a1b-4d8e-95be-338d0c4c98f5

Intelligence to Output Speed: https://artificialanalysis.ai/models?intelligence-comparison=intelligence-vs-output-speed#intelligence

VibeCodeBench + Finance Agent: https://www.vals.ai/home

OpenAI Needs Ads: https://archive.ph/20260409123153/https://www.reuters.com/business/media-telecom/openai-projects-25-billion-ad-revenue-this-year-100-billion-by-2030-axios-2026-04-09/

Anthropic Core Views: https://www.anthropic.com/news/core-views-on-ai-safety

Karpathy Move: https://x.com/karpathy/status/2056753169888334312
https://www.axios.com/2026/05/19/anthropic-openai-karpathy-andrej-claude

Recursive Self-Improvement: https://www.patreon.com/posts/ineffably-smart-156866417

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/
GPT 5.5 Arrives, DeepSeek V4 Drops, and the Compute War Intensifies
24 April · AI Explained Official Podcast
GPT 5.5 full analysis, plus DeepSeek V4 paper highlights, comparisons with Mythos, a vibe-coded game w/ GPT Image 2, and 50 data-points you wouldn’t get from just reading the headlines.
Chapters:
01:11 - GPT 5.5 Comparison
06:04 - Mythos Marketing
11:50 - Recursive Self-Improvement?
14:11 - Deepseek V4
18:03 - VibeCode Experiment Extravaganza
21:44 - The Scarce Compute Era

https://80000hours.org/aiexplained

OpenAI Benchmarks: https://openai.com/index/introducing-gpt-5-5/
5.5 System Card: https://deploymentsafety.openai.com/gpt-5-5/gpt-5-5.pdf
Direct Comparison: https://pbs.twimg.com/media/HGnNm5GWEAAJ1Ob?format=jpg&name=4096x4096
DeepSeek Paper: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro
SWE Bench Pro - benchmark of choice? https://x.com/ChowdhuryNeil/status/2047416077622395025

AA Omniscience: https://artificialanalysis.ai/evaluations/omniscience

Vending Bench: https://x.com/andonlabs/status/2047377260412649967
Opus 4.7 System Card: https://cdn.sanity.io/files/4zrzovbb/website/037f06850df7fbe871e206dad004c3db5fd50340.pdf
Sam Altman Drunk Phase: https://x.com/sama/with_replies
Noam Brown: https://x.com/polynoamial/status/2047387675762802998
DeepSeek Compute Crunch: https://www.bloomberg.com/news/articles/2026-04-24/deepseek-unveils-newest-flagship-a-year-after-ai-breakthrough?srnd=phx-ai
Spreadsheet Bench: https://x.com/nicochristie/status/2047476237464211721
Pattern Recognition: https://arcprize.org/leaderboard
Leader Interviews:
Core Memory: https://www.youtube.com/watch?v=NCKQL0op30E
Knowledge Podcast: https://www.youtube.com/watch?v=6JoUcQ1qmAc
Big Tech Round 1: https://www.youtube.com/watch?v=J6vYvk7R190&t=1116s
Big Tech Round 2: https://www.youtube.com/watch?v=YnoQ8RJbALw&t=8s
Claude Code Limitations: https://x.com/TheAmolAvasare/status/2046724659039932830
ChatGPT 5.4 for Clinicians: https://openai.com/index/making-chatgpt-better-for-clinicians/
Image Arena: https://x.com/arena/status/2046670703311884548
VibeCode Bench: https://www.vals.ai/benchmarks/vibe-code
5.5-made Game +Seedance 2.0: https://rosemere-quest.pages.dev/
Claude Opus 4.7 - A New Frontier, in Performance … and Drama
17 April · AI Explained Official Podcast
Claude Opus 4.7 just dropped, but behind every headline lies a deeper story. From a bonanza of benchmarks, to seeing the fruits of one of the biggest mega-projects in US history, to sneaky Mythos disclaimers, to Anthropic admitting compute constraints and forcing lower capability of Opus 4.7. Where the new model falls behind Gemini but ahead of GPT 5.4, plus why some users are furious at Anthropic. Ending with a 9-year animus that still affects AI today…

https://assemblyai.com/aiexplained

Check out my fast-growing (!) app, free to use, and code INSIDER15 for paid tiers: https://lmcouncil.ai

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
00:58 - Benchmarks
05:21 - Market Share + Compute Problems
08:12 - Mythos Exclusives
12:56 - User Frustration + Claude Code Updates
14:03 - Brockman Amodei Rivalry
17:40 - OpenAI vs Anthropic Approach to Code

Claude 4.7 Opus Release Notes: https://www.anthropic.com/news/claude-opus-4-7
vs Mythos: https://pbs.twimg.com/media/HGCGugrXUAAKcHp?format=jpg&name=medium

232-page System Card: https://cdn.sanity.io/files/4zrzovbb/website/037f06850df7fbe871e206dad004c3db5fd50340.pdf

ARC-AGI 2: https://x.com/arcprize/status/2044834615417053305/photo/1

ParseBench: https://x.com/jerryjliu0/status/2044902620746363016/photo/1

GDPVal: https://artificialanalysis.ai/evaluations/gdpval-aa

Vidoc Security Replication: https://blog.vidocsecurity.com/blog/we-reproduced-anthropics-mythos-findings-with-public-models

Boris Cherny Settings: https://x.com/Hesamation/status/2043016923961577516/photo/2

User Frustration: https://x.com/RileyRalmuto/status/2044836116189069660

VibeCode Bench: https://x.com/ValsAI/status/2044791415524471099/photo/1

Verge Memo: https://www.theverge.com/ai-artificial-intelligence/911118/openai-memo-cro-ai-competition-anthropic

5.4 Cyber: https://openai.com/index/scaling-trusted-access-for-cyber-defense/

Data Centers in Absolute $: https://x.com/finmoorhouse/status/2044933442236776794/photo/1

…in % of GDP: https://pbs.twimg.com/media/HGEN8FGWQAAN7Np?format=jpg&name=4096x4096

WSJ Exclusive: https://www.wsj.com/tech/ai/the-decadelong-feud-shaping-the-future-of-ai-7075acde

Brockman Interview: https://www.youtube.com/watch?v=J6vYvk7R190

$1T Valuation: https://x.com/StefanFSchubert/status/2045039686997967082

Emotions: https://www.patreon.com/c/aiexplained/posts

https://lmcouncil.ai/benchmarks

Non-hype Newsletter: https://signaltonoise.beehiiv.com/
Claude Mythos: Highlights from 244-page Release
8 April · AI Explained Official Podcast
The model, the mythos, the legend. We have a new best AI model, but not all of us. How good is it, and what do its new offensive capabilities mean? Why does its 244-page report card remind me of Her, and why did the creator of Claude Code call it ‘terrifying’? 30+ highlights sourced by reading the paper in full, old-school, no AI summary.

https://80000hours.org/aiexplained

Check out my fast-growing (!) app, free to use, and code INSIDER15 for paid tiers: https://lmcouncil.ai

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
00:56 - Internal Release + Availability
02:37 - General Capabilities
05:12 - Self-improvement?
06:15 - ‘Terrifying’ Landscape
11:07 - Safety Decision
13:22 - Coding
14:49 - Alignment, Awareness
19:52 - GUI for Agents/Claws + Hallucinations
21:34 - …Emotions?
25:29 - Her connection

244-page System Card: https://www-cdn.anthropic.com/8b8380204f74670be75e81c820ca8dda846ab289.pdf

Project Glasswing: https://www.anthropic.com/glasswing
Zero-Day Details: https://red.anthropic.com/2026/mythos-preview/

Mythos ‘terrifying’: https://x.com/bcherny/status/2041605852382351666

New Yorker Altman/Amodei: https://archive.fo/20260406100412/https://www.newyorker.com/magazine/2026/04/13/sam-altman-may-control-our-future-can-he-be-trusted

Alignment Risk Update: https://www-cdn.anthropic.com/79c2d46d997783b9d2fb3241de43218158e5f25c.pdf

In a Park: https://x.com/sleepinyourhat/status/2041584808514744742

“Uhm” - https://x.com/thsottiaux/status/2041749947385815109

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/
OpenAI Spud, a Claude Model set to ‘stir governments’, Beast Mode ARC-AGI-3
26 March · AI Explained Official Podcast
First look at exclusive reports about OpenAI's new Spud model, and the model Anthropic think will stir governments to urgency, all in the context of the newly-launched ARC-AGI-3. What do the extreme difficulty of that benchmark, and its quirky scoring metrics, mean for AI in 2026?

https://assemblyai.com/aiexplained

Check out my fast-growing (!) app, free to use, and code INSIDER15 for paid tiers: https://lmcouncil.ai

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
00:55 - OpenAI Side Quests
01:58 - Claude New Model Coming + Universal Equity?
03:13 - ARC-AGI 3
05:00 - Intentional or Unintentional Gaming?
07:11 - But is it AGI Harbinger? No Harness
09:41 - Not the First
12:32 - Automated Researcher
15:00 - Claw Caveat

Spud: https://www.theinformation.com/articles/openai-ceo-shifts-responsibilities-preps-spud-ai-model?utm_campaign=Editorial&utm_content=Article&utm_medium=organic_social&utm_source=bluesky%2Cfacebook%2Clinkedin%2Cthreads%2Ctwitter&rc=sy0ihq

FT: OpenAI Special Model: https://www.ft.com/content/de9bf0af-b241-424f-8229-5870b1c0d93d?syn-25a6b1a6=1

Jensen Huang: https://www.forbes.com/sites/antoniopequenoiv/2026/03/23/nvidias-jensen-huang-says-he-thinks-weve-achieved-agi/

Axios Article: https://archive.fo/20260326100140/https://www.axios.com/2026/03/26/anthropic-pentagon-ai-deal#selection-827.0-829.257

https://arcprize.org/arc-agi/3

ARC AGI 3 Paper: https://arcprize.org/media/ARC_AGI_3_Technical_Report.pdf

NetHack Leaderboard: https://balrogai.com/
Paper: https://ai.meta.com/research/publications/the-nethack-learning-environment/
https://x.com/_rockt/status/2036864121585438995

Claw Shells: https://x.com/DrJimFan/status/2036494601750716711

OpenAI Automated Researcher: https://www.technologyreview.com/2026/03/20/1134438/openai-is-throwing-everything-into-building-a-fully-automated-researcher/

Patreon Post: https://www.patreon.com/c/aiexplained/posts

Eng Jobs: https://x.com/lennysan/status/2036535460726767793

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/
What the New ChatGPT 5.4 Means for the World
6 March · AI Explained Official Podcast
Just 48 hours after releasing GPT 5.3 Instant, OpenAI have released GPT 5.4 Thinking, so either there is an imminent singularity or perhaps we are being distracted from other news. This video will give 9 crucial bits of context, not just on the GPT 5.4 drop but on the background to the meltdown between the Pentagon and Anthropic. What does this say about the state of AI progress, your job, and what is next?

Check out my fast-growing (!) app, free to use, and code INSIDER15 for 15% off paid tiers: https://lmcouncil.ai

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
01:06 - GPT 5.4 Breakdown
05:06 - Closing the Loop
06:35 - Spiky Performance
10:31 - Advice
11:32 - Less Encouraging Developments - Fired Like Dogs
17:45 - But Used in Iran

GPT 5.4: https://openai.com/index/introducing-gpt-5-4/

Hallucinations: https://artificialanalysis.ai/evaluations/omniscience
Investment Banking Bench: https://x.com/bradlightcap/status/2029684672343728452
Move 37: https://x.com/nasqret/status/2029628846518010099
System Card: https://deploymentsafety.openai.com/gpt-5-4-thinking/gpt-5-4-thinking.pdf

Prediction Market Scandal: https://www.wired.com/story/openai-fires-employee-insider-trading-polymarket-kalshi/

GPT 5.3 Instant: https://openai.com/index/gpt-5-3-instant/

GDPVal: https://openai.com/index/gdpval/

Claude in Iran: https://www.washingtonpost.com/technology/2026/03/04/anthropic-ai-iran-campaign

‘Like Dogs’: https://x.com/AndrewCurran_/status/2029605783311470679

Altman leak: https://www.cnbc.com/2026/03/03/sam-altman-tells-openai-staff-operational-decisions-up-to-government.html

Original 2024 Switch: https://archive.fo/20240116172526/https://www.bloomberg.com/news/articles/2024-01-16/openai-working-with-us-military-on-cybersecurity-tools-for-veterans#selection-6173.83-6173.226

Amodei Original Memo: https://www.theinformation.com/articles/read-anthropic-ceos-memo-attacking-openais-mendacious-pentagon-announcement?rc=sy0ihq
Anthropic Apology: https://www.anthropic.com/news/where-stand-department-war
OpenAI Employee Reaction: https://x.com/tszzl/status/2029334980481212820

DoD Supplier Risk: https://www.cnbc.com/amp/2026/03/05/anthropic-pentagon-ai-claude-iran.html
Atlantic Exclusive: https://archive.fo/20260301152646/https://www.theatlantic.com/technology/2026/03/inside-anthropics-killer-robot-dispute-with-the-pentagon/686200/#selection-941.61-941.212
No Negotiation: https://x.com/USWREMichael/status/2029754965778907493

$20B Doubling: https://archive.ph/20260304111124/https://www.bloomberg.com/news/articles/2026-03-03/anthropic-nears-20-billion-revenue-run-rate-amid-pentagon-feud

March 2022 Interview: https://www.youtube.com/watch?v=uAA6PZkek4A

https://lmcouncil.ai/

Non-hype Newsletter: https://signaltonoise.beehiiv.com/
Deadline Day for Autonomous AI Weapons & Mass Surveillance
27 Feb. · AI Explained Official Podcast
Will Anthropic be forced to make a version of Claude for war? And does a new paper expose the risks of Claude agents, in both OpenClaw and the field of war? Plus, 5 more twists in the story of the Pentagon versus Anthropic + some AI lab employees, and a petition that could change everything, or nothing...

Check out my fast-growing (!) app, free to use, and code INSIDER15 for paid tiers: https://lmcouncil.ai

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
00:44 - Deadline Day + Petition
02:42 - Twist 1: Existing Deal
03:26 - Twist 2: Existing Policy
04:21 - Twist 3: Twin Threats
05:54 - Twist 4: Interesting Objections
11:32 - Twist 5: Anthropic’s Dropped Policy

Dario Statement: https://www.anthropic.com/news/statement-department-of-war

Google/OpenAI Petition: https://notdivided.org/

Axios on Amodei Rejection: https://www.axios.com/2026/02/26/anthropic-rejects-pentagon-ai-terms

FT on US Threat: https://www.ft.com/content/11d27612-d6c5-4cf7-94dd-f65603549b7f

Politico on Latest: https://archive.ph/20260227013117/https://www.politico.com/news/2026/02/26/incoherent-hegseths-anthropic-ultimatum-confounds-ai-policymakers-00800135

The Verge on Current Deal: https://www.theverge.com/ai-artificial-intelligence/883456/anthropic-pentagon-department-of-defense-negotiations

Anthropic RSP change: https://www.anthropic.com/news/responsible-scaling-policy-v3

Time Magazine on RSP: https://time.com/7380854/exclusive-anthropic-drops-flagship-safety-pledge/

Agent of Chaos Paper: https://x.com/NatalieShapira/status/2026062499599319526

AI Agent Reliability Paper: https://arxiv.org/pdf/2602.16666

My Patreon Video: https://www.patreon.com/posts/real-mystery-ai-151647211

Patreon Documentary: https://www.patreon.com/posts/our-new-age-of-133960279

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/
Gemini 3.1 Pro and the Downfall of Benchmarks: Welcome to the Vibe Era of AI
20 Feb. · AI Explained Official Podcast
Do we have a new best AI model, or do we have the downfall of benchmarks in general, as a way of capturing machine intelligence? Full breakdown of Gemini 3.1 Pro, guest-starring the new Sonnet 4.6, plus analysis from 7 papers/posts that will give you much needed context. Oh, and a new record on Simple Bench!

https://epoch.ai/ai-explained-datacenters

Check out my fast-growing (!) app, free to use, and code INSIDER15 for Pro: https://lmcouncil.ai

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
00:30 - Post-training Dominance
04:00 - ARC-AGI 2 Caveat
05:54 - Simple Bench Record
08:22 - Hallucination Caveat
10:05 - Model Card
11:12 - Exponential Coming
12:20 - Amodei on Generalizing
15:10 - One True Benchmark?
17:02 - Other Metrics…

Gemini 3.1 Model Card: https://storage.googleapis.com/deepmind-media/Model-Cards/Gemini-3-1-Pro-Model-Card.pdf

Release: https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-pro/

Where are Agents deployed?: https://www.anthropic.com/research/measuring-agent-autonomy

Newsletter Post: https://signaltonoise.beehiiv.com/p/4-ai-numbers-that-surprised-me-this-week

Hallucination AA: https://artificialanalysis.ai/evaluations/omniscience

Melanie Mitchell: https://x.com/MelMitchell1/status/2022738363548340526
ARC-AGI-2: https://x.com/arcprize/status/2024522812728496470/photo/1

Chollet on Agentic Coding and ML: https://x.com/fchollet/status/2024519439140737442

METR Caveat: https://metr.org/notes/2026-01-22-time-horizon-limitations/

Taalas Fast: https://chatjimmy.ai/

Amodei Interview Continual learning: https://www.dwarkesh.com/p/dario-amodei-2?open=false#%C2%A7002942-is-continual-learning-necessary-how-will-it-be-solved

Metaculus FutureEval: https://www.metaculus.com/futureeval/

Next Vid to Watch: https://www.patreon.com/posts/what-you-need-to-150647292

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/
The Two Best AI Models/Enemies Just Got Released Simultaneously
6 Feb. · AI Explained Official Podcast
The two models that you will hear discussed for at least the next two months - Claude Opus 4.6 and GPT 5.3 Codex - just got released within 26 mins of each other. The full breakdown of around 250 pages of reports, with just the most interesting moments, from the battle of which is best, Claude personhood, the surprising misbehaviour of Opus 4.6, and much more.

https://assemblyai.com/aiexplained

Check out my fast-growing (!) app, free to use, and code INSIDER15 for Pro: https://lmcouncil.ai

AI Insiders ($9): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
00:54 - Self-improvement?
02:44 - Knowledge Work
05:30 - Overly agentic behaviour
09:12 - Who Shouldn’t Use Claude Opus
11:39 - Step-change?
15:09 - Claude’s ‘Personhood’

Hassabis Roadmap: https://www.patreon.com/posts/hassabis-roadmap-149750869

Release of Opus 4.6: https://www.anthropic.com/news/claude-opus-4-6
212 Page System Card: https://www-cdn.anthropic.com/0dd865075ad3132672ee0ab40b05a53f14cf5288.pdf
Claude Code Tip: https://x.com/bcherny/status/2019475897691124107

GPT Codex 5.3: https://openai.com/index/introducing-gpt-5-3-codex/
System Card: https://openai.com/index/gpt-5-3-codex-system-card/

Browse Comp: https://arxiv.org/pdf/2504.12516v1
Finance Agent: https://www.vals.ai/benchmarks/finance_agent
Terminal Bench 2: https://arxiv.org/pdf/2601.11868
Vending Bench: https://andonlabs.com/blog/opus-4-6-vending-bench

My X post: https://x.com/AIExplainedYT/status/2016851303436095647

Anthropic Apology: https://x.com/ch402/status/2014066134194995256/photo/1

Altman rebuttal: https://x.com/sama/status/2019139174339928189
https://x.com/sama/status/2019140276246442089

4% of GitHub: https://x.com/dylan522p/status/2019490550911766763

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/
Claude AI Co-founder Publishes 4 Big Claims about Near Future: Breakdown
28 Jan. · AI Explained Official Podcast
Anthropic's CEO, who has consistently predicted transformative AI will arrive before 2030, recently published a nearly 20,000-word essay outlining his vision of where AI is heading. The video gives you the highlights. The essay argues that scaling and recursion will advance AI from coding automation to full engineering automation, while warning of economic displacement within 1-2 years and China's trajectory toward AI-enabled totalitarianism. Additionally, Dario Amodei predicts that AI models will increasingly be understood as collections of distinct personas rather than monolithic systems.

80,000 Hours: https://www.youtube.com/watch?v=B54EQiuO1UU

Check out my fast-growing (!) app, free to use, and code INSIDER15 for Pro: https://lmcouncil.ai

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
01:10 - Scaling to software engineers
06:11 - Permanent Underclass
10:18 - Totalitarian Nightmares
16:38 - Collection of Personas

Essay: https://www.darioamodei.com/essay/the-adolescence-of-technology

Physics Prediction: https://www.quantamagazine.org/is-particle-physics-dead-dying-or-just-hard-20260126/

Axios: https://www.axios.com/2025/05/28/ai-jobs-white-collar-unemployment-anthropic

World GDP: https://data.worldbank.org/indicator/NY.GDP.MKTP.KD.ZG?end=2024&start=1961&view=chart

Demis Hassabis Counter: https://www.youtube.com/watch?v=q6fq4_uP7aM

Karpathy 80%: https://x.com/karpathy/status/2015883857489522876

Machines of Loving Grace: https://www.darioamodei.com/essay/machines-of-loving-grace

Anthropic LessWrong: https://www.lesswrong.com/posts/5aKRshJzhojqfbRyo/unless-its-governance-changes-anthropic-is-untrustworthy#1__In_private__Dario_frequently_said_he_won_t_push_the_frontier_of_AI_capabilities__later__Anthropic_pushed_the_frontier

Original Constitution: https://www.anthropic.com/news/claudes-constitution

New Constitution: https://www.anthropic.com/constitution

Kimi K2.5: https://x.com/Kimi_Moonshot/status/2016024049869324599

Societies of Thought, Google DeepMind Paper: https://arxiv.org/pdf/2601.10825

https://lmcouncil.ai/benchmarks

https://www.patreon.com/posts/our-new-age-of-133960279

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/
Anthropic: Our AI just created a tool that can ‘automate all white collar work’, Me:
14 Jan. · AI Explained Official Podcast
A new tool, with code written by an AI model, has gone omega-viral: Claude Cowork. But is the hype justified? What do the stats say on productivity? Where is the truth in a sea of noise? What is truth? Can we handle the truth? Where's Nemo?

https://matsprogram.org/s26-aie

Check out my new app! https://lmcouncil.ai

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
01:12 - Claude Cowork
06:48 - Productivity Speed-up + jobs
09:33 - Comparing Models
12:00 - Brittle AI Paper

Cowork Intro: https://x.com/claudeai/thread/2010805682434666759

'All of it': https://x.com/bcherny/status/2010813886052581538

'AGI' Claims: https://x.com/deepfates/status/2004994698335879383

Douglas Interview: https://www.youtube.com/watch?v=TOsNrV3bXtQ&t=2313s

Job Stats: https://www.oxfordeconomics.com/wp-content/uploads/2026/01/Evidence-of-an-AI-driven-shakeup-of-job-markets-is-patchy.pdf
Amodei Prediction: https://fortune.com/2025/05/28/anthropic-ceo-warning-ai-job-loss/

GenAI Traffic: https://x.com/demishassabis/status/2009075877347512545

Illusion of Insight: https://arxiv.org/pdf/2601.00514
Entropy Exploration: https://arxiv.org/pdf/2506.14758
ProRL: https://arxiv.org/pdf/2505.24864

Genesis Mission: https://www.whitehouse.gov/presidential-actions/2025/11/launching-the-genesis-mission/
https://deepmind.google/blog/how-were-supporting-better-tropical-cyclone-prediction-with-ai/

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/
What the Freakiness of 2025 in AI Tells Us About 2026
23 Dec. 2025 · AI Explained Official Podcast
It’s probably not possible to satisfactorily condense 12 months’ worth of weird progress in AI, as well as predictions for the year to come, into one video. But I’m gonna try anyway because it has been a very strange time.

http://matsprogram.org/s26-aie

My new app! https://lmcouncil.ai

Patreon Interview: https://www.patreon.com/posts/robot-in-your-27-146376094

Chapters:
00:00 - Introduction
00:34 - Reasoning Models … and limits
02:54 - A playable world
03:36 - Realism
03:50 - AI Slop gone mainstream
05:03 - DolphinGemma
05:39 - Public Mood
07:34 - AI Enlisted
08:30 - GPT-5
11:05 - Open Weight not out
13:00 - METR Breakout
17:30 - VASA-1
18:28 - Lateral Productivity
20:15 - 1 or 1000 benchmarks needed?
24:54 - Continual Learning + Altman on Superintelligence
28:08 - Automated Information Discovery ft AlphaEvolve

Hassabis on Generality: https://x.com/demishassabis/status/2003097405026193809
https://www.youtube.com/watch?v=PqVbypvxDto

Gemini 3: https://storage.googleapis.com/gweb-uniblog-publish-prod/original_images/gemini_3_table_final_HLE_Tools_on.gif
Reasoning Trade-offs: https://arxiv.org/pdf/2504.13837

DolphinGemma: https://blog.google/technology/ai/dolphingemma/?s=09

Genie 3: https://deepmind.google/blog/genie-3-a-new-frontier-for-world-models/

METR Time Horizon: https://arxiv.org/pdf/2503.14499
https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/
Flaws: https://x.com/ShashwatGoel7/status/2002369517499105443
https://shash42.substack.com/p/how-to-game-the-metr-plot
https://x.com/METR_Evals/status/2002203627377574113

GPT-5 - Altman PhD in everything: https://edition.cnn.com/2025/08/14/business/chatgpt-rollout-problems

https://simple-bench.com/

AI Slop: https://www.youtube.com/watch?v=I_3vxoJDD9k
https://www.theguardian.com/technology/2025/dec/16/boost-for-artists-in-ai-copyright-battle-as-only-3-per-cent-back-uk-active-opt-out-plan

Survey: https://x.com/SearchlightInst/status/2001057144842387920/photo/1

Nvidia Nemotron: https://x.com/percyliang/status/2000608134205985169

OpenAI Compute Flywheel: https://x.com/OpenAI/status/2001363007209914399/photo/1
Altman Interview: https://www.youtube.com/watch?v=2P27Ef-LLuQ

AI in Govt: https://x.com/jdcmedlock/status/1939814516503847259

Benchmark Gaming: https://techcrunch.com/2025/04/07/meta-exec-denies-the-company-artificially-boosted-llama-4s-benchmark-scores/

AlphaEvolve: https://deepmind.google/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/
https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/AlphaEvolve.pdf?utm_source=deepmind.google&utm_medium=referral&utm_campaign=gdm&utm_content=
Continual Learning: https://abehrouz.github.io/files/NL.pdf

Job Risk: https://archive.ph/20250708204527/https://www.axios.com/2025/05/28/ai-jobs-white-collar-unemployment-anthropic

GPT4o: https://x.com/AISafetyMemes/status/1916889492172013989

VASA-1: https://www.microsoft.com/en-us/research/project/vasa-1/

Three Views: https://www.lesswrong.com/posts/K2D45BNxnZjdpSX2j/ai-timelines
Turing Test: https://x.com/tunguz/status/1907185471211422147

Karpathy Year in Review: https://karpathy.bearblog.dev/year-in-review-2025/

LLM Brainrot: https://arxiv.org/pdf/2510.13928

Lateral Productivity: https://www.aisi.gov.uk/frontier-ai-trends-report

Emotional Quotient: https://arxiv.org/pdf/2511.08394

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/

AI Insiders ($9!): https://www.patreon.com/AIExplained
Gemini Exponential, Demis Hassabis' ‘Proto-AGI’ coming, but …
19 Dec 2025 · AI Explained Official Podcast
The condensed highlights of hours of AI lab leader interviews, model releases, Gemini 3 Flash insights (plus its hidden flaw), Hassabis’ ‘proto-AGI’ and much more…

https://matsprogram.org/apply?utm_source=ai-explained&utm_medium=youtube&utm_campaign=s26

Also, do check out my new app: https://lmcouncil.ai

Chapters:
00:00 - Introduction
00:50 - Results
02:44 - But… the Flaw
04:49 - So Benchmarks are fake? No
07:37 - Spatial Reasoning + Hassabis
10:06 - Proto-AGI
12:07 - Minimal AGI
15:07 - Compute Slowdown
17:56 - New Data Paradigm

Gemini 3 Flash: https://deepmind.google/models/gemini/flash/

Hassabis Interview: https://www.youtube.com/watch?v=PqVbypvxDto
Legg Interview: https://www.youtube.com/watch?v=l3u_FAv33G0
Pre-training Lead Interview: https://www.youtube.com/watch?v=cNGDAqFXvew
Altman Interview: https://www.youtube.com/watch?v=2P27Ef-LLuQ
Brockman Video: https://x.com/OpenAI/status/2001336514786017417
Post-Training Reveal: https://x.com/OfficialLoganK/status/2001742530472534442

Hallucinations Paper: https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4aaa5/why-language-models-hallucinate.pdf
Patreon Hallucinations Vid: https://www.patreon.com/posts/blockers-to-and-139264812
AA-Omniscience Benchmark: https://artificialanalysis.ai/evaluations/omniscience
https://arxiv.org/pdf/2511.13029

lmcouncil.ai/benchmarks
https://simple-bench.com/
https://x.com/scaling01/status/1999620587744813205

5.2 Codex Drop: https://cdn.openai.com/pdf/ac7c37ae-7f4c-4442-b741-2eabdeaf77e0/oai_5_2_Codex.pdf

OpenAI Compute Trend: https://www.theinformation.com/articles/openais-350-billion-computing-cost-problem?rc=sy0ihq

Cramer Tweet/Response: https://x.com/BorisMPower/status/2001440650210976018

OpenAI Valuation: https://www.theinformation.com/articles/openai-discussed-raising-tens-billions-valuation-around-750-billion?rc=sy0ihq

Indian Data: https://www.reuters.com/world/india/with-freebies-openai-google-vie-indian-users-training-data-2025-12-17/

TheInformation Data: https://x.com/theinformation/status/2001421225751351778

Genie 3: https://deepmind.google/blog/genie-3-a-new-frontier-for-world-models/
Sima 2: https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/
Veo 3.1: https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/

METR: https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/

AI Insiders ($9!): https://www.patreon.com/AIExplained

Non-hype Newsletter: https://signaltonoise.beehiiv.com/
GPT 5.2: OpenAI Strikes Back
12 Dec 2025 · AI Explained Official Podcast
Full GPT-5.2 breakdown - did OpenAI reclaim the crown? A story of tokens, time and cost, plus 9 details you wouldn’t get just from reading the headlines.

https://www.youtube.com/@eightythousandhours

AI Insiders ($9!): https://www.patreon.com/AIExplained
https://lmcouncil.ai

Chapters:
00:00 - Introduction
00:55 - Better than Human @ Professional Tasks?
04:42 - Test time Compute
07:05 - Benchmark Selection
09:32 - Simple Results + council comparison
13:01 - Long Context
13:52 - Self-Improvement
15:00 - 10 Years + New Models

Release Page: https://openai.com/index/introducing-gpt-5-2/

GPT 5.2 Benchmark Comparison: https://www.reddit.com/r/singularity/comments/1pka1y9/gpt52_all_20_benchmarks_rankings_and_pricing/
https://storage.googleapis.com/gweb-uniblog-publish-prod/original_images/gemini_3_table_final_HLE_Tools_on.gif
https://lmcouncil.ai/benchmarks

Charxiv: https://charxiv.github.io/#leaderboard

GDPval: https://arxiv.org/pdf/2510.04374
My vid: https://www.youtube.com/watch?v=oK5LxMaROSA

Kilpatrick: https://x.com/OfficialLoganK/status/1999270402712023158/photo/1

Noam Brown: https://x.com/polynoamial/status/1999189845164667132

New Model in New Year: https://www.theinformation.com/articles/openai-developing-garlic-model-counter-googles-recent-gains?rc=sy0ihq

10 Years of OpenAI: https://openai.com/index/ten-years/

GPQA: https://x.com/idavidrein/status/1841265634170278063

ARC-AGI 1-2: https://arcprize.org/arc-agi/2/

Sunday Robotics: https://x.com/tonyzzhao/status/1991204839578300813

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

https://lmcouncil.ai
You Are Being Told Contradictory Things About AI: 8 examples
5 Dec 2025 · AI Explained Official Podcast
With headlines of an imminent job apocalypse, code red for ChatGPT and recursive self-improvement, at the same time as Anthropic's CEO yesterday saying we know how to scale to AGI, and Gemini 3 DeepThink out today, it is easy to get lost among the narratives and counter-narratives. So here are both, plus the facts behind them, for you to decide.

https://epoch.ai/data/data-centers

Epoch AI is the sponsor of today’s video, and my views, and those expressed in this video, do not necessarily reflect Epoch AI’s views in any way.

Chapters:
00:00 - Introduction
00:42 - Job Apocalypse?
01:45 - Scaling to AGI
04:15 - Recursive Self-Improvement Needed, or Not
09:57 - OpenAI Code Red vs Gemini 3 DeepThink vs Claude Opus 4.5
13:27 - DeepSeek Speciale vs Mistral Large v3
16:45 - Claude Soul Document

https://lmcouncil.ai/

AI Insiders ($9!): https://www.patreon.com/AIExplained

Guardian Interview: https://www.theguardian.com/technology/ng-interactive/2025/dec/02/jared-kaplan-artificial-intelligence-train-itself

MIT Study on Jobs/Tasks: https://iceberg.mit.edu/report.pdf
vs https://www.cnbc.com/2025/11/26/mit-study-finds-ai-can-already-replace-11point7percent-of-us-workforce.html

Amodei on Scaling: https://www.youtube.com/watch?v=FEj7wAjwQIk
Claude Soul Document: https://www.lesswrong.com/posts/vpNG99GhbBoLov9og/claude-4-5-opus-soul-document

Capabilities Original Stance: https://www.anthropic.com/news/core-views-on-ai-safety

Ilya Interview: https://www.dwarkesh.com/p/ilya-sutskever-2

Ricursive Intelligence: https://x.com/RicursiveAI/status/1995932204703346946

Economist Worker Usage of GenAI: https://www.economist.com/finance-and-economics/2025/11/26/investors-expect-ai-use-to-soar-thats-not-happening#selection-1409.94-1413.42

Mistral v3 Large: https://docs.mistral.ai/models/mistral-large-3-25-12

Compute Slowdown Paper: https://joel-becker.com/images/publications/forecasting_time_horizon_under_compute_slowdown.pdf
https://x.com/joel_bkr/status/1993023436541903155

METR Chart: https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/

https://www.theinformation.com/articles/openais-350-billion-computing-cost-problem?rc=sy0ihq

OpenAI Code Red: https://www.anthropic.com/news/core-views-on-ai-safety
Rocket Company: https://www.independent.co.uk/news/world/americas/sam-altman-rocket-elon-musk-spacex-b2878351.html

DeepSeek Paper: https://arxiv.org/html/2512.02556v1

DeepSeek Crowdstrike CCP: https://www.crowdstrike.com/en-us/blog/crowdstrike-researchers-identify-hidden-vulnerabilities-ai-coded-software/

https://simple-bench.com/

Patreon Post: https://www.patreon.com/c/aiexplained/posts

Robot: https://x.com/jloganolson/status/1985850115379351799
Gemini 3 is Here: 11 Details You Might Have Missed
19 Nov 2025 · AI Explained Official Podcast
Gemini 3 Pro is out, and records fell like snowflakes in Svalbard.

No long description, chapters or links today, huge technical difficulties, including with audio, so just want to publish asap.

https://app.grayswan.ai/ai-explained

https://lmcouncil.ai
AI Insiders ($9!): https://www.patreon.com/AIExplained

Non-hype Newsletter: https://signaltonoise.beehiiv.com/
Podcast: https://aiexplainedopodcast.buzzsprout.com/
Is GPT-5.1 Really an Upgrade? But Models Can Auto-Hack Govts, so … there’s that
14 Nov 2025 · AI Explained Official Podcast
A lot just got released in the last 36 hours, and it will all affect hundreds of millions of people. 10 details you would miss if you just read the headlines, from GPT 5.1 regressions, to how Claude hacked Govt Agencies, to SIMA 2, and Musical Turing Tests.
https://assemblyai.com/aiexplained
Chapters:
00:00 - Introduction
00:56 - GPT 5.1 Smarter?
01:47 - Some Regressions
03:22 - Sycophancy?
05:22 - Claude Auto-Hacking
06:16 - Jailbreaking through Granularity
08:22 - This Will be Re-used
09:30 - Hallucinating Hacker
09:57 - Surprisingly Neutral Tone
12:18 - SIMA 2
14:10 - Alpha Parallels
17:24 - AI Music

GPT 5.1 Announcement: https://openai.com/index/gpt-5-1/
System Card: https://cdn.openai.com/pdf/4173ec8d-1229-47db-96de-06d87147e07e/5_1_system_card.pdf
Benchmarks: https://openai.com/index/gpt-5-1-for-developers/
Simple Bench: https://lmcouncil.ai/benchmarks
Auto-Hacking: https://x.com/AnthropicAI/status/1989033793190277618
https://www.anthropic.com/news/disrupting-AI-espionage
Report: https://assets.anthropic.com/m/ec212e6566a0d47/original/Disrupting-the-first-reported-AI-orchestrated-cyber-espionage-campaign.pdf

Sima 2 Announcement: https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/
https://x.com/amoufarek/status/1988986075331858693
Scepticism: https://www.technologyreview.com/2025/11/13/1127921/google-deepmind-is-using-gemini-to-train-agents-inside-goat-simulator-3/
Voyager: https://voyager.minedojo.org/
Reuters Music: https://www.reuters.com/legal/litigation/are-you-listening-bots-survey-shows-ai-music-is-virtually-undetectable-2025-11-12/
Bubble or No Bubble, AI Keeps Progressing (ft. Relentless Learning + Introspection)
10 Nov 2025 · AI Explained Official Podcast
Don’t let headlines about bubbles distract you from the real avenues of progress being explored in AI every week, including what had been thought to be a long-term blocker - continual learning (learning on the fly).

https://app.grayswan.ai/ai-explained

This, plus models introspecting (hesitate before you berate), Nano Banana 2 possibly spotted, Chinese imagen and more.

AI Insiders ($9!): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
01:26 - Continual Learning (Nested Learning / HOPE)
07:00 - Introspection
10:54 - Image-Gen Progress

Nested Learning Post: https://research.google/blog/introducing-nested-learning-a-new-ml-paradigm-for-continual-learning/

Nested Learning Paper: https://abehrouz.github.io/files/NL.pdf

Original Titans Paper: https://arxiv.org/pdf/2501.00663

Siri News: https://www.bloomberg.com/news/articles/2025-11-05/apple-plans-to-use-1-2-trillion-parameter-google-gemini-model-to-power-new-siri

Introspection: https://www.anthropic.com/research/introspection

Full Paper: https://transformer-circuits.pub/2025/introspection/index.html#mechanisms

Earlier Work: https://www.anthropic.com/research/mapping-mind-language-model
https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html

Release Post: https://x.com/AnthropicAI/status/1983584136972677319

https://lmcouncil.ai

Non-hype Newsletter: https://signaltonoise.beehiiv.com/

Podcast: https://aiexplainedopodcast.buzzsprout.com/

Episodes

Claude Fable 5 - Full 319 page Breakdown

New Claude - 244 page breakdown

Two Rival Bets on AGI: Google I/O Highlights

GPT 5.5 Arrives, DeepSeek V4 Drops, and the Compute War Intensifies

Claude Opus 4.7 - A New Frontier, in Performance … and Drama

Claude Mythos: Highlights from 244-page Release

OpenAI Spud, a Claude Model set to ‘stir governments’, Beast Mode ARC-AGI-3

What the New ChatGPT 5.4 Means for the World

Deadline Day for Autonomous AI Weapons & Mass Surveillance

Gemini 3.1 Pro and the Downfall of Benchmarks: Welcome to the Vibe Era of AI

The Two Best AI Models/Enemies Just Got Released Simultaneously

Claude AI Co-founder Publishes 4 Big Claims about Near Future: Breakdown

What the Freakiness of 2025 in AI Tells Us About 2026

Gemini Exponential, Demis Hassabis' ‘Proto-AGI’ coming, but …

GPT 5.2: OpenAI Strikes Back

You Are Being Told Contradictory Things About AI: 8 examples

Gemini 3 is Here: 11 Details You Might Have Missed

Is GPT-5.1 Really an Upgrade? But Models Can Auto-Hack Govts, so … there’s that

Bubble or No Bubble, AI Keeps Progressing (ft. Relentless Learning + Introspection)