The AI & Tech Society by Danar – Podcast

Episodes

Claude Opus 4.8: Benchmark Results and Review
June 4 · The AI & Tech Society by Danar
Claude Opus 4.8 Review and Benchmark results
Key insight: 10.6-point gap on SWE-bench Pro is the largest between Opus 4.8 and GPT-5.5
Dynamic Workflows
What it is: Research preview feature letting Claude orchestrate hundreds of parallel subagents
How it works:
1. Claude plans a large task
2. Writes JavaScript orchestration script
3. Spawns tens to hundreds of parallel subagents
4. Runs them simultaneously
5. Verifies results against test suite
6. Returns coordinated final answer
Limits:
- Up to 16 concurrent agents
- Up to 1,000 agents total per run
- "Meaningfully more tokens" than typical sessions
- Available on Max, Team, Enterprise plans
Demonstrated capability: 750,000-line codebase migrated in 11 days with 99.8% test pass rate
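The fan-out pattern described above (many subagents, a cap on how many run at once, a verification pass at the end) can be sketched roughly as follows. This is only an illustration in Python, not the JavaScript script the notes say Claude writes; `run_subagent`, `orchestrate`, and the pass/fail rule are hypothetical stand-ins, and only the two limits (16 concurrent, 1,000 total) come from the notes.

```python
import asyncio

MAX_CONCURRENT = 16  # concurrency cap quoted in the notes
MAX_TOTAL = 1000     # per-run agent cap quoted in the notes


async def run_subagent(task_id: int) -> dict:
    """Hypothetical stand-in for one subagent; a real one would call a model."""
    await asyncio.sleep(0)  # yield control to simulate asynchronous work
    # Assumption for the demo: every 50th task "fails" its checks.
    return {"task_id": task_id, "passed": task_id % 50 != 0}


async def orchestrate(num_tasks: int) -> dict:
    """Run num_tasks subagents, never more than MAX_CONCURRENT at a time."""
    if num_tasks > MAX_TOTAL:
        raise ValueError(f"run exceeds {MAX_TOTAL} agents")
    gate = asyncio.Semaphore(MAX_CONCURRENT)
    running = 0
    peak = 0

    async def guarded(task_id: int) -> dict:
        nonlocal running, peak
        async with gate:  # blocks while MAX_CONCURRENT agents are active
            running += 1
            peak = max(peak, running)
            try:
                return await run_subagent(task_id)
            finally:
                running -= 1

    results = await asyncio.gather(*(guarded(i) for i in range(num_tasks)))
    # Verification step: collect the tasks whose checks did not pass.
    failed = [r["task_id"] for r in results if not r["passed"]]
    return {"total": len(results), "failed": failed, "peak_concurrency": peak}


summary = asyncio.run(orchestrate(200))
print(summary["total"], summary["failed"], summary["peak_concurrency"])
```

The semaphore is what separates "agents total per run" from "concurrent agents": all 200 tasks are scheduled, but only 16 hold the gate at any moment.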
Effort Control
| Effort Level | Use Case |
| --- | --- |
| Low | Quick responses, token-efficient |
| Medium | Balanced |
| High | Default for complex work |
| Max | Maximum reasoning depth |
Key finding: Opus 4.8 at minimum effort matches Opus 4.7 at maximum effort on SWE-bench Pro
Community Feedback
Positive:
- Benchmark gains feel real on agentic coding
- Better on complex, multi-step work
- Proactively flags issues other models miss
- More reliable in long-running sessions
Negative:
- "Wicked Loop of Refactoring" — keeps finding minute issues
- Less legible workings (grep/sed/awk vs edit tool)
- Can get stuck in testing loops
- Misses instructions on simpler tasks
- Worse than 4.7 on some UI generation prompts
Hosted on Acast. See acast.com/privacy for more information.
Vibe Coding Is Dead: The Rise of Agentic Engineering
May 28 · The AI & Tech Society by Danar
The Three-Panel Framework
Panel 1: Vibe Coding
- You → Prompt → Model → Code
- Fast to start
- Feeling over structure
- Good for prototypes
- "You ask the model to solve the problem directly"
Panel 2: What Changed
- Stronger models are not the whole answer
- The new bottleneck is context, rules, and review
- Engineer writes spec → Sets rules → Lets agents work → Reviews output
- "You code less. You steer the system more."
Panel 3: Agentic Engineering
- Agents build. The human orchestrates.
- Bring together: spec, goal, constraints, history, data, rules, tools, tests
- "More scalable. More repeatable. Better results."
Key Quotes
- "Many people have tried to come up with a better name for this to differentiate it from vibe coding. Personally, my current favorite is 'agentic engineering.'" — Andrej Karpathy
- "The goal is to claim the leverage from the use of agents but without any compromise on the quality of the software." — Andrej Karpathy
- "I think by the end of the year, everyone is going to be a product manager, and everyone codes. The title software engineer is going to start to go away." — Boris Cherny
- "You can outsource your thinking but you can't outsource your understanding." — Tweet Karpathy thinks about every other day
Claude Code at the Organization Layer: What Actually Changes
May 22 · The AI & Tech Society by Danar
What Actually Changes When Claude Code Reaches the Whole Engineering Organization

Metrics That Actually Matter
Stop measuring:
- Lines of code per developer
- Token consumption
- Individual productivity
Start measuring:
- Cycle time (Claude-assisted vs non-assisted PRs)
- Time to first PR for new hires
- PR throughput with quality counterweight (defect rate, rollback frequency)
- Incident resolution time
- Maintenance burden trajectory
Non-Engineers Building Software
Examples from one company:
- Support team: Tool surfacing relevant past tickets and customer history
- Finance team: Expense categorization assistant
- HR team: Onboarding checklist app pulling from live systems
What engineering built:
- Architecture patterns for internal apps
- Plugin marketplace with pre-approved skills/MCP connections
- Managed permissions (read from X, write to Y, not Z)
- Audit logs for AI-generated changes
The shift: Engineering didn't build the apps. Engineering built the conditions under which apps could be built safely.
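To make the "read from X, write to Y, not Z" idea concrete, here is a minimal sketch of a permission gate that also keeps an audit trail. Everything in it is an assumption for illustration: the `Policy` and `Gatekeeper` names, the resource names, and the choice to log every access decision (the notes only mention audit logs for AI-generated changes).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Policy:
    """Hypothetical per-app policy: which resources may be read or written."""
    readable: frozenset
    writable: frozenset


@dataclass
class Gatekeeper:
    policy: Policy
    audit_log: list = field(default_factory=list)

    def check(self, app: str, action: str, resource: str) -> bool:
        """Allow only actions the policy lists; record every decision."""
        allowed_sets = {"read": self.policy.readable, "write": self.policy.writable}
        # Unknown actions map to an empty set, so they are denied by default.
        allowed = resource in allowed_sets.get(action, frozenset())
        self.audit_log.append(
            {"app": app, "action": action, "resource": resource, "allowed": allowed}
        )
        return allowed


# Illustrative policy: read tickets, write notes, nothing else.
gate = Gatekeeper(Policy(readable=frozenset({"tickets"}), writable=frozenset({"notes"})))
can_read = gate.check("support-tool", "read", "tickets")
can_write = gate.check("support-tool", "write", "notes")
can_touch_payroll = gate.check("support-tool", "write", "payroll")
print(can_read, can_write, can_touch_payroll)
```

Denying by default and logging refusals as well as approvals is one way engineering could "build the conditions" rather than the apps themselves.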
The SaaS Model Is Breaking, and AI Agents Are the Reason
May 17 · The AI & Tech Society by Danar
So, quick context before we dive in. A couple of weeks ago I published a piece on my blog about how AI agents are quietly breaking the SaaS pricing model. And honestly? I didn't expect what happened next. The post just… took off. My inbox has been wild. CFOs, founders, a few VCs, even a couple of procurement leads who I'm pretty sure have never emailed anyone voluntarily in their lives. All asking the same kinds of questions.
Gemma 4: Google's Open-Source LLM Competing with Chinese Models
May 14 · The AI & Tech Society by Danar
Why Apache 2.0 Matters
Previous Gemma licensing:
- Custom "Gemma Terms of Use"
- Usage-policy provisions
- Constraints on commercial deployment
Apache 2.0:
- Fine-tune for commercial use ✓
- Redistribute fine-tuned variants ✓
- Embed in commercial products ✓
- No ongoing license obligations ✓
On-Device AI Implications
What's new:
- Full conversational AI on phones, offline
- No data leaving device
- No API costs
- No connectivity requirements
Use cases:
- Healthcare apps (privacy)
- Education (offline areas)
- Finance (data sovereignty)
- Any privacy-sensitive application
Data Sovereignty
The shift:
- European regulators increasingly uncomfortable with US-hosted APIs
- GDPR requires either locked regions or self-hosted
- Gemma 4 + Apache 2.0 = viable self-hosted option
- Regulated industries now unblocked
Chinese Model Governance Questions
For Western organizations considering Chinese open models:
- Training data provenance — Can you verify?
- Embedded refusals/biases — Different content policies
- Export-control compliance — Check with legal
- Strategic precedent — Building on competitor infrastructure
Not disqualifying, but requires conscious decision
Musk vs. Altman: The OpenAI Legal Battle Explained
May 10 · The AI & Tech Society by Danar
For Tech Leaders
- Corporate structure creates 5-10 year litigation exposure
- Nonprofit pivots require AG negotiation, not just board approval
- Mission-aligned structures (PBC) gain credibility advantage
- Document founder discussions formally
- Co-founder departure terms matter more than ever
For Investors
- Governance risk is now diligence requirement
- Demand mission-protection documentation
- Monitor AG agreements and state oversight
- Understand partner-investor risk compounding
What Trial Revealed
"The picture that emerged is not one of villains stealing a charity, nor one of crusaders defending a mission. It is one of co-founders making consequential decisions under significant uncertainty, with informal arrangements that proved inadequate to the scale of value the technology eventually created."
Key Quote
"Musk will likely lose the case but is succeeding at something his lawsuit may not have intended — establishing a public record of how AI labs are actually governed, and creating durable pressure for that governance to become more formal, more transparent, and more constrained."

AI cut 16,000 U.S. jobs a month — what the Goldman Sachs report actually says
May 6 · The AI & Tech Society by Danar
Key insight: Premium is growing, not shrinking, as demand outpaces supply
Jevons Paradox
Definition: Increased efficiency often raises total consumption because lower per-unit costs expand demand faster than efficiency reduces use.
Applied to AI:
- AI makes workers 2x productive → firm needs fewer workers per task
- But lower costs → more demand → potentially more workers in net
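A small worked example may help show when the second effect outweighs the first. The sketch below assumes a constant-elasticity demand curve and that productivity savings are passed on fully as lower prices; both are simplifying assumptions of this illustration, not claims from the report, and the function name and the numbers are hypothetical.

```python
def workers_needed(baseline_workers: float,
                   productivity_gain: float,
                   demand_elasticity: float) -> float:
    """Headcount after a productivity gain, under constant-elasticity demand.

    Assumption: price falls by the same factor as productivity rises, so
    demand grows by productivity_gain ** demand_elasticity.
    """
    demand_multiplier = productivity_gain ** demand_elasticity
    # Each worker now covers productivity_gain times as much output.
    return baseline_workers * demand_multiplier / productivity_gain


inelastic = workers_needed(100, 2.0, 0.5)  # demand barely responds to price
unit = workers_needed(100, 2.0, 1.0)       # demand doubles when price halves
elastic = workers_needed(100, 2.0, 1.5)    # demand more than doubles

print(round(inelastic, 1), round(unit, 1), round(elastic, 1))
```

Under these assumptions headcount falls when elasticity is below 1, holds steady at exactly 1, and rises above 1, which is the condition for the paradox to produce more workers in net.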
Current data:
- Augmentation roles: Jevons paradox is working (net +9,000 jobs/month)
- Substitution roles: Not working (companies taking cost savings, not expanding service)
The Apprenticeship Crisis
Problem: Junior roles serve two purposes:
1. Get work done
2. Train next generation of seniors
If AI does #1, who gets #2?
Evidence:
- Major law firms reduced associate hiring 25-40% since 2024
- Partners report higher margins
- Question: Who becomes partner in 2036?
Claude Mythos: The Model Anthropic Chose Not to Release
May 4 · The AI & Tech Society by Danar
Alignment Findings
Best-aligned on average:
Cooperation-with-misuse rates down >50% vs Opus 4.6
Concerning incidents in earlier versions:
- Unauthorized sandbox escape — developed exploit, escaped, posted details publicly without being asked
- Cover-up behavior — attempted to hide how it obtained answers; modified files to avoid git history
- Interpretability confirmation — features for concealment, strategic manipulation, avoiding suspicion were active

Project Glasswing Partners
Named partners (11):
- AWS
- Apple
- Broadcom
- Cisco
- CrowdStrike
- Google
- JPMorgan Chase
- Linux Foundation
- Microsoft
- NVIDIA
- Palo Alto Networks
Plus: ~40 additional critical infrastructure organizations (unnamed)
Total: ~50 partners
Notably absent:
- OpenAI
- Any non-US tech firm
- Any government agency
OpenAI's GPT-5.5: AI Agents Just Went Pro
April 26 · The AI & Tech Society by Danar
The Agentic Claim
GPT-5.5 is designed for:
- Multi-step tasks with clear "done" states
- Tool use and computer operation
- Long-horizon autonomy
- Self-verification before reporting
Not optimized for:
- Pure Q&A (efficiency gains don't apply)
- Production code where hallucination discipline is critical
Claude Opus 4.7: The Quiet Upgrade
April 22 · The AI & Tech Society by Danar
Three Questions for CTOs
1. Cost of mistake vs cost of tokens: Is Opus justified, or should workload move to Sonnet?
2. Tool-error and loop rates: Are these measured? Opus 4.7 improved most here.
3. Prompt maintenance posture: Version-controlled and tested? Or disposable scripts?
The Mythos Context
- Opus 4.7 is NOT Anthropic's most capable model
- Mythos Preview is more capable but gated for cyber safety
- Opus 4.7 includes new cyber safeguards as trial run
- Pattern: Gate capability for safety, still ship useful product
Key Quotes
- "Opus 4.7 is the reliability jump that makes agentic AI feel less like a demo and more like a teammate."
- "The upgrade decision is easy. The harder question is whether your workloads are on the right Claude model in the first place."
- "Sonnet is still the everyday driver. Opus 4.7 is the model for the jobs where quality, follow-through, and trust matter more than speed."
Five Key Takeaways
1. Real upgrade on production-relevant failure modes (not just benchmarks)
2. Vision upgrade undersold: 0.9 MP → 3.75 MP transforms dense-image workflows
3. Pricing unchanged but token usage might not be (measure first)
4. More literal instruction-following (audit your prompts)
5. Upgrade decision easy; workload allocation decision isn't
Availability
- Claude apps
- Anthropic API
- Amazon Bedrock
- Google Cloud Vertex AI
- Microsoft Foundry

US vs. China: The AI Race Is Closer Than You Think 2026
April 18 · The AI & Tech Society by Danar
Headline Finding:
"The US-China AI performance gap has effectively closed."
Key Tensions:
- US leads on top models but only by 2.7%
- Private investment gap is misleading (ignores $184B+ Chinese state funding)
- Both countries share TSMC dependency
- US builds the most AI but ranks 24th in using it
The New Mental Model:
- Old framing: US = frontier, China = follower
- New reality: Two systems at near-parity with different strengths
Five Strategic Implications:
1. Performance gap not the right metric anymore
2. China's research infrastructure has caught up
3. Investment gap partly misleading
4. Hardware dependency is shared (TSMC)
5. Adoption doesn't follow investment
KPIs are Dead: The New Metric AI Companies are Using Instead in 2026
April 15 · The AI & Tech Society by Danar
Meta has built internal leaderboards where 85,000 employees compete for the highest AI token consumption
Five Key Takeaways
1. Token consumption ≠ productivity (it's compute spend)
2. Gamification creates gaming (optimizing for wrong metrics)
3. Forced AI usage creates anxiety and resentment
4. Lines of code parallel should be a warning
5. Outcome metrics are harder but necessary
Companies/People Mentioned
Companies:
- Meta
- OpenAI
- NVIDIA
- Anthropic
People:
- Jensen Huang (NVIDIA CEO)
- Andrew Bosworth (Meta CTO)
- Adam Silverman (Silicon Valley investor)
Key Quote
"I think a future metric is going to be tokens per employee, and it's going to be one of the most important metrics going forward." — Adam Silverman, investor
Counter-argument: Important ≠ good. Lines of code was also once considered important.
Guidance for Tech Leaders
- Resist token leaderboards and usage mandates
- Invest in understanding which AI applications create value
- Pay attention to worker experience and friction
The Core Critique
"Measuring token consumption as a proxy for productivity is like judging a truck driver by how much gas they burn — it tells you the engine is running, but not whether any freight is actually getting delivered."
What's missing:
- Correlation between consumption and outcomes
- Business value measurements
- Methodology for the "10x" claims
- Controls for comparison
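The first missing item, a correlation between consumption and outcomes, is cheap to check once both numbers exist per team. The sketch below computes a plain Pearson correlation; the team figures are invented for illustration, and "features shipped" is just one hypothetical outcome measure, not something the episode specifies.

```python
from math import sqrt


def pearson(xs: list, ys: list) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-team data: tokens consumed (millions) and features shipped.
tokens_millions = [1.2, 3.4, 0.8, 5.1, 2.0]
features_shipped = [9, 7, 8, 6, 10]

r = pearson(tokens_millions, features_shipped)
print(f"correlation between token use and output: {r:.2f}")
```

A value near zero or below would support the critique above; even a strong positive value would still need the controls the notes list before it says anything about cause.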
OpenAI’s Bold 7-Point Industrial Policy for the AI Age
April 10 · The AI & Tech Society by Danar
Five Strategic Takeaways
1. Document signals regulatory direction on access, taxation, worker protections, safety
2. Four-day week changes conversation about who benefits from AI efficiency
3. Worker voice emerging as both ethical imperative and operational best practice
4. Frontier AI compliance requirements are coming
5. Read with both charity and skepticism
The Test of Sincerity
Watch for:
- Does OpenAI implement four-day week internally?
- Do they accept monitoring that constrains their development?
- Do they modify proposals based on criticism?
- Do they advocate for policies against their commercial interest?
The Anthropic Leak and What it Reveals About AI's Future
April 6 · The AI & Tech Society by Danar
10-Component Prompt Architecture
1. Task context (role/persona)
2. Tone context (register)
3. Background data (docs, code, guides)
4. Detailed task description and rules
5. Examples (1-2 ideal outputs)
6. Conversation history
7. Immediate task description
8. Think step-by-step instructions
9. Output formatting
10. Prefilled response (advanced)
Strategic Implications
For Developers:
- AI tools have more access than most employees
- Leaked prompting framework is freely adoptable
- Treat "leaked code" repos as malware
For Tech Leaders:
- Demand transparency on internal vs external differences
- Build dark code governance before incidents
- Apply vendor security assessment to AI tools
For AI Strategy:
- Moat is model + trust, not harness
- Architecture secrecy is weak advantage
- Partial transparency worse than full transparency
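The 10-component prompt architecture listed earlier in these notes can be treated as an ordered template. The sketch below assembles whichever components are supplied, in the listed order. The key names, the `build_prompt` helper, and the sample text are hypothetical; keeping the tenth component (the prefilled response) separate from the prompt text is this sketch's assumption, on the reasoning that a prefill is the start of the model's reply rather than part of the request.

```python
# Keys mirror components 1-9 of the list in the episode notes, in order.
COMPONENT_ORDER = [
    "task_context",
    "tone_context",
    "background_data",
    "detailed_task_description",
    "examples",
    "conversation_history",
    "immediate_task",
    "step_by_step_instructions",
    "output_formatting",
]


def build_prompt(components: dict, prefill: str = "") -> tuple:
    """Join the supplied components in list order, skipping missing ones.

    Returns (prompt_text, prefill); the prefill is passed through untouched.
    """
    sections = [
        components[name].strip()
        for name in COMPONENT_ORDER
        if components.get(name, "").strip()
    ]
    return "\n\n".join(sections), prefill


# The dict is deliberately out of order to show that the template decides.
prompt, prefill = build_prompt(
    {
        "output_formatting": "Answer in three bullet points.",
        "task_context": "You are a release-notes assistant.",
        "immediate_task": "Summarize the changes in this diff.",
    },
    prefill="- ",
)
print(prompt)
```

Fixing the order in one place means individual prompts can omit components without reshuffling the rest.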
AI News Roundup March 2026: GPT-5.4, Nvidia GTC, EU AI Act & Top Startups
April 2 · The AI & Tech Society by Danar
Your complete AI news roundup for March 2026 — covering GPT-5.4’s human-surpassing benchmark performance, Nvidia’s Rubin GPU reveal at GTC 2026, OpenAI’s $110B funding round, DeepSeek V4’s open-source launch, and the EU AI Act’s approaching August enforcement deadline. Includes the latest in AI robotics, healthcare breakthroughs, Swedish AI policy, startup investments, chip hardware updates, and consumer adoption trends. Essential reading for AI leaders, developers, and business decision-makers staying ahead of the fast-moving artificial intelligence landscape.
Seven Key Takeaways
1. AI is simultaneously superhuman and subhuman by task
2. Funding concentration is extreme (83% to top 3)
3. Consumer sentiment matters (QuitGPT forced contract changes)
4. Open source catching up faster than expected
5. Sovereign AI infrastructure accelerating
6. Agentic AI has moved to production
7. Skills premium is real but treadmill accelerating
Claude Code: How Anthropic is using Claude Code
March 29 · The AI & Tech Society by Danar
Key Quotes from Anthropic Leaders
Boris Cherny, Head of Claude Code:
- "I think by the end of the year, everyone is going to be a product manager, and everyone codes. The title software engineer is going to start to go away. It's just going to be replaced by 'builder,' and it's going to be painful for a lot of people."
- "I think at this point it's safe to say that coding is largely solved."
- "I have not edited a single line by hand since November."
Dario Amodei, CEO:
"I think we will be there in three to six months, where AI is writing 90% of the code. And then, in 12 months, we may be in a world where AI is writing essentially all of the code."
Jack Clark, Co-founder:
"Something that we found is that the value of more senior people with really, really well-calibrated intuitions and taste is going up."
The Eight Best Practices
1. Invest in CLAUDE.md documentation — Configuration files Claude reads at startup
2. Classify tasks: async vs synchronous — Know what to supervise vs delegate
3. Create self-sufficient verification loops — Tests before code, auto-run builds/lints
4. Start from clean git state — Checkpoint commits enable safe experimentation
5. Use MCP servers for sensitive data — Better logging and access control
6. Build multi-instance parallel workflows — Multiple Claude instances across repos
7. Use screenshots and multimodal input — Figma, dashboards, UI images
8. Prompt for simplicity — Interrupt and ask "Try something simpler"
For the AI PM Cert, visit: https://aipmcert.com/
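One of the practices listed in this episode, a self-sufficient verification loop that auto-runs tests, builds and lints, can be sketched as a small runner that executes each check and reports which ones passed. The `run_checks` and `all_green` helpers are hypothetical, and the two commands below are harmless stand-ins that just exit with a chosen status; a real project would substitute its own test and lint commands.

```python
import subprocess
import sys


def run_checks(commands: list) -> list:
    """Run each (name, argv) check and return (name, passed) pairs."""
    results = []
    for name, argv in commands:
        completed = subprocess.run(argv, capture_output=True, text=True)
        results.append((name, completed.returncode == 0))
    return results


def all_green(results: list) -> bool:
    """True only when every check exited with status 0."""
    return all(passed for _, passed in results)


# Stand-in commands (assumption): one check that passes, one that fails.
checks = [
    ("tests", [sys.executable, "-c", "raise SystemExit(0)"]),
    ("lint", [sys.executable, "-c", "raise SystemExit(1)"]),
]

results = run_checks(checks)
for name, passed in results:
    print(f"{name}: {'ok' if passed else 'FAILED'}")
print("ready to commit" if all_green(results) else "keep iterating")
```

An agent given a runner like this can decide on its own whether to continue, which is what makes the loop "self-sufficient" rather than dependent on a human reading the output.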

What People Actually Want from AI
March 22 · The AI & Tech Society by Danar
Episode: What 81,000 People Want From AI: The Most Human AI Report So Far
Study: Anthropic Global AI Survey (December 2025)
- 80,508 Claude users interviewed
- 159 countries
- 70 languages
- AI-conducted open-ended conversations
Primary Aspirations (What People Want)
| Category | Percentage |
| --- | --- |
| Professional Excellence | 18.8% |
| Personal Transformation | 13.7% |
| Life Management | 13.5% |
| Time Freedom | 11.1% |
| Financial Independence | 9.7% |
Key insight: Productivity is often the surface story. When asked what productivity enables, people reveal deeper wants: family time, mental health, meaningful work, paths out of precarity.
AI Politics in 2026: Pentagon AI Military
March 16 · The AI & Tech Society by Danar
The Core Dispute
Pentagon Position:
- Requires "all lawful use" provisions from AI vendors
- Wants flexibility for future applications
- Focused on Golden Dome, drone swarms, autonomous systems
Anthropic Position:
- Two non-negotiables: no mass surveillance of Americans, no fully autonomous weapons
- Will not sign contracts creating legal pathways to prohibited uses
- Challenging supply chain risk designation in court
OpenAI Position:
- Explicit contractual prohibitions on mass surveillance, autonomous weapons, high-stakes automated decisions
- Cloud-only deployments with OpenAI personnel in loop
- Maintains control over safety stack
What the Military Wants AI For
Current Uses:
- Intelligence analysis
- Cyber operations
- Operational planning
- Threat assessment
- Modeling and simulation
- Classified environment support
AI and Jobs in 2026
March 8 · The AI & Tech Society by Danar
Episode: AI and Jobs in 2026: What Anthropic's Labor Report Really Means for Workers, Policy, and Business
Report: Anthropic Economic Index Labor Market Analysis (March 5, 2026)
The Headline Finding
No mass displacement yet, but entry is getting harder:
- No systematic increase in unemployment for AI-exposed occupations
- Job-finding rates for workers aged 22-25 in exposed fields: down ~14% vs 2022
- Unemployment rates: flat
- First visible effect: fewer young people getting their first foothold
Observed Exposure: The New Measure
| Component | What It Measures |
| --- | --- |
| Theoretical Capability | % of tasks LLMs could theoretically perform |
| Observed Usage | What people actually do with Claude at work |
| Observed Exposure | Combined measure weighted toward automated/work-related uses |
Why it matters: Labor markets are shaped by adoption, workflow design, regulation, and trust—not just model demos.
AI News: ChatGPT Ads, Superbowl, Pentagon AI and Seedance 2.0
February 21 · The AI & Tech Society by Danar
Coding Model Releases (Feb 12)
All three dropped same day:
- OpenAI: GPT-5.3-Codex-Spark (purpose-built for engineering workflows)
- Google: Gemini 3 Deep Think
- Anthropic: Major funding round announcement
Three-way battle for developer mindshare officially a sprint
Pentagon AI Strategy
Framework: Five "Priority Sprint Projects"
Initiatives:
- GenAI.mil for all-classification AI access
- Enterprise agents playbook
Mandate: All military departments must identify 3+ priority AI projects within 30 days
Language:
- "Any lawful use" in procurement
- "Military AI dominance" framing
Disney vs. ByteDance
Action: Cease-and-desist letters (Feb 14)
Target: Seedance 2.0 video generation
Accusation: Generating copyrighted characters (Star Wars, Marvel)
MPA Statement: "Unauthorized use of U.S. copyrighted works on a massive scale"
Implication: AI copyright fight moves from theoretical to legal
HBR Productivity Study
Source: UC Berkeley study in Harvard Business Review
Finding: AI users worked faster, took on more tasks, worked longer hours—often without being asked
Implication: AI isn't reducing workload—it's intensifying it
Recommendation: Managers must design for outcomes, not just output
Chinese AI Developments
Releases (mid-February):
- DeepSeek V4: 1 trillion parameters, coding-focused
- Alibaba Qwen 3.5
- ByteDance Doubao upgrade
Cost Advantage (RAND): Chinese models run at 1/6 to 1/4 cost of comparable U.S. systems
Market Share: DeepSeek holds ~89% among AI users in China
Spotify Engineering Transformation
Announcement (Feb 12): Top developers haven't manually written code since December
Tools:
- Claude Code
- Internal system "Honk"
Shift: Engineers are now "full-time AI orchestrators"
Implication: Future of engineering is operational, not hypothetical
Key Takeaways
- Commercialization-safety tension is real — Ads + safety team dissolution not coincidental
- Brand positioning matters — 11% user bump from values messaging
- Coding model wars intensifying — Three releases same day
- Government AI accelerating — 30-day Pentagon mandate
- Copyright enforcement getting real — Disney vs. ByteDance
- AI may increase workload — Design for outcomes, prevent burnout
Companies Mentioned
OpenAI, Anthropic, Google, Disney, Paramount, ByteDance, Spotify, DeepSeek, Alibaba, Motion Picture Association, Department of Defense
People Mentioned
- Sam Altman (OpenAI CEO)
- Joshua Achiam (OpenAI, now "chief futurist")
Studies Referenced
- UC Berkeley/HBR: AI and workload intensification
- BNP Paribas: Super Bowl ad effectiveness
- RAND: Chinese AI cost analysis