Episodes
-
We just redefined 'fast'.
In less than one revolution of the Earth, boffins announced a triple upgrade to humanity’s operating system:
AI that writes DNA like Shakespearean sonnets.
A quantum chip stable enough to crack encryption in less time than it took you to make a brew.
A digital lab partner rediscovering decade-old scientific truths in 48 hours.
This isn’t evolution – it’s a full-system reboot.
From curing cancers we’ve not yet named to materials that laugh in the face of physics, we’re witnessing the birth of what historians will call The Second Enlightenment.
But mind the ethical quicksand – with great power comes even greater “blimey, did we just do that?” moments.
Sources (of which there are dozens) include:
https://github.com/ArcInstitute/evo2
https://arcinstitute.org/news/blog/evo2
https://research.google/blog/accelerating-scientific-breakthroughs-with-an-ai-co-scientist/
https://www.imperial.ac.uk/news/261293/googles-ai-co-scientist-could-enhance-research/
https://news.microsoft.com/source/features/ai/microsofts-majorana-1-chip-carves-new-path-for-quantum-computing/
https://www.reddit.com/r/singularity/comments/1it9hkv/majorana_1_microsofts_quantum_breakthrough_to/
Fancy sponsoring the show - or talking about your work in AI?
Drop me a line at [email protected]
-
Oh, Barbie Girl. Anyone remember Aqua? And in one fell swoop, the AI Today podcast lost 49 subscribers...
Ready for a masterclass in competitive advantage?
Picture this: AI’s next battleground isn’t just chatbots—it’s code, chemistry, and control.
We’ll dissect the guerrilla war brewing in consumer AI, where Grok-3’s “Colossus” supercomputers clash with OpenAI’s Deep Research and Meta’s open-source playbook. How will YOU choose your side when every tool becomes a weapon?
We’ll crack open the secret workflow turning solo devs into coding warlords. A refreshing change from the doom and gloom about AI's coding ineptitude!
Listen for the wildcard: Mira Murati’s Thinking Machines Lab, staffed by OpenAI’s exiles, is betting that customizable, collaborative AI will outflank ChatGPT itself.
But the real bombshell? AI is now designing enzymes that eat plastic—a breakthrough rewriting the rules of synthetic biology.
Join us to decode the future for businesses ready to outthink, outbuild, and outpace.
Articles covered in today's show:
https://spyglass.org/consumer-ai-services-apps/
https://harper.blog/2025/02/16/my-llm-codegen-workflow-atm/
https://arstechnica.com/science/2025/02/using-ai-to-design-proteins-is-now-easy-making-enzymes-remains-hard/
https://techcrunch.com/2025/02/18/thinking-machines-lab-is-ex-openai-cto-mira-muratis-new-startup/
-
Imagine a world where software engineers are replaced by other software engineers that are entirely digital.
No coffee breaks, no office politics, just pure, unadulterated code. It sounds like science fiction, doesn't it?
But the question is: how far off is it, really?
That's the question a team of researchers sought to answer with the SWE-Lancer benchmark.
They didn't just want to test an AI's ability to write snippets of code.
They wanted to see if a large language model, an LLM, could actually earn a living as a freelance software engineer - the ultimate test of practical AI coding ability.
Think about it. Freelancing is the ultimate test. You're judged solely on your output. There's no hiding behind a team. You have to deliver, or you don't get paid. So, the researchers took real-world freelance jobs from Upwork, a popular platform for freelancers, and fed them to some of the most advanced LLMs available.
These weren't simple tasks. They involved understanding complex requirements, navigating existing codebases, and often, making engineering management decisions.
The kind of decisions that usually require years of experience.
The results? Well, they were… sobering.
GPT-4 successfully completed only 10.2% of the coding tasks. Claude 3.5 fared slightly worse, at 8.7%.
And when it came to those crucial management decisions? GPT-4's accuracy was a mere 21.4%.
These numbers highlight the significant gap between theoretical AI prowess and real-world problem-solving.
Let those numbers sink in. Even the best AI models (of the time, but including what's considered by many coders as the best of the bunch) struggled to complete even a fifth of the tasks a human freelancer would routinely handle.
This isn't to say AI is useless in software engineering. Far from it. But it highlights a crucial gap – the gap between theoretical capability and practical application.
AI models were tested on the entire workflow a freelancer might face, including tasks that go far beyond just writing code.
The study revealed several key weaknesses. Many errors stemmed from the LLMs misunderstanding the requirements. Others came from incorrectly handling API calls or failing to adapt to the existing codebase.
These are all areas where human engineers, with their years of experience and contextual understanding, excel.
However, it's crucial to note that the field of AI is rapidly evolving, and performance on specific benchmarks can change quickly as models are updated and refined.
But the story doesn't end there.
Researchers also identified specific areas where AI did show promise. LLMs were relatively good at writing new code from scratch - but struggled with modifying existing code, which often requires a deep understanding of the original programmer's intent.
This suggests that AI might be best suited, for now, to tasks that involve generating new content, rather than those requiring complex reasoning and adaptation.
Think of AI as a junior developer, capable of handling well-defined tasks, but needing guidance and oversight from a more experienced (human) engineer.
This also highlights the need for improved training data and techniques that allow LLMs to better understand and reason about existing codebases.
-
This morning we woke to a new dawn for AI.
Grok-3 is a cutting-edge large language model (LLM) from Elon Musk's xAI, designed to understand and generate human-like text, outperforming competitors in several tasks.
It's trained using innovative techniques like synthetic data sets, self-correction mechanisms, and human feedback loops, ensuring accuracy and relevance.
Grok-3 demonstrates impressive capabilities across various fields, including:
Accelerating drug discovery and protein ligand binding predictions.
Modernizing legacy software and reducing bug resolution times.
Predicting legal case outcomes and summarising patent filings.
Personalising education for autism spectrum learners and improving physics comprehension in rural schools.
Collaborating on creative projects like sci-fi novellas and robot choreography.
But how? Why? And what's next?
Let's break it down with your ever-faithful podcast superstar hosts - Bert and Sandra!
-
There are so many examples of ballsy businesses finding gems in their mountains of documents and datasets, thanks to the superpowered excavator that's AI, that cynics and sceptics are heading for the hills.
On today's show we sniff out a smattering of these tech-infused titans to find out how they're riding roughshod over the competition.
AI's real power lies in identifying patterns humans might miss, such as subtle trends in customer reviews or unexpected customer preferences.
And it's getting cheaper all the time. Which means your business can't afford to miss out. Listen in to this episode of AI Today and prepare to get involved in the ever-more exciting world of AI and data!
-
I've pulled together some really simple ways to make AI your best friend - or at least, frenemy - during the 9 to 5.
This episode is dedicated to business leaders and everyone doing the real work.
It's time to embrace AI and make it work for you. So here are some brilliant ways, tried and tested, to grow yourself, and your business, using AI...
Shownotes at: https://www.wordandmouth.com/how-to-work-with-ai
Want to be on the show? Drop me a line at [email protected]
-
What just happened?
January 2025: a month of unprecedented change in generative AI.
The open-source community is shaking up the industry, challenging the dominance of tech giants.
This episode of AI Today dives into the key developments, from powerful AI assistants like Convergence's Proxy and Moonshot's Kimi3, to revolutionary image and video generation models like DeepSeek's Janus-Pro and Pika 2.15.
Discover how these innovations can automate tasks, analyse data, and create content, offering businesses new opportunities.
This is a must-listen for every business leader aiming to leverage AI for a competitive edge, and stay ahead of the curve in this rapidly evolving landscape.
-
This is the most personal episode I have ever recorded.
It's my near half-century journey to building an app. And how AI made it possible.
I'm a pre-beginner coder.
Yet eventually - and I mean eventually - I somehow composed a working prototype of an application.
And I hope that I might inspire someone to give AI coding a shot. And learn the basics of coding - to get the most from this new, and hugely exciting, way to make things we love.
AI tools and models I've used for my working prototype:
Google's Gemini 1.5 Pro
Various AI coding extensions for VS Code (Cline, Roo Code, Aider, and Continue)
GitHub Copilot Pro using Anthropic's legendary Sonnet 3.5 model
Perplexity Pro (with OpenAI o1) for research
DeepSeek R1 (try it free) for mindblowing progress
Lovable for visual inspiration based on my initial PRD.
I almost gave up. Many times. I was literally and successfully prompting AI code editors to death.
But I don't enjoy giving up.
Find out what turned everything around.
And why you, too, should stay the course...
-
Warning: After listening to this podcast, you won't be able to sit still.
The possibilities revealed by AI will have you leaping into action, driving innovation, and revolutionising your business.
Are you ready to unlock the secrets to unprecedented business growth?
Get ready to discover how AI can transform your organisation, skyrocket your efficiency, and catapult you to new heights of success.
Welcome to another crazy episode in the life of AI Today!
-
If you're familiar with the Eisenhower matrix, you'll recognise the problem with businesses and customer research - they simply don't know what they don't know! But thanks to two crucial AI research studies, we're learning things about people that can help us all understand ourselves better.
For more about AI Today, or to guest on the show, email [email protected]
-
The past few weeks in AI have shattered my brain into a billion fragments of wonder. We've even found a new way to do AI, beyond transformers - that could change even what's been the most changeful week in the history of modern artificial intelligence. But for now, since it's Christmas, let's switch the light on - not the bulb, but the lightness rather than all the stuff that's gonna melt our minds. I want our friends to share with you a glimpse into what I'm building next year.
Happy Christmas, everyone!
-
You've heard the pandemonium all about Google launching its fastest and smartest frontier model yet. But what does it mean for your business? And what about Devin - the grown-up AI copilot for your engineering teams, which also launched this week to rapturous applause from anyone seeking speedier shipping and warp-speed progress through product roadmaps...
Your boy's joining our smart silicon sand as a third wheel on today's show. I hope you love the aesthetic...
-
Finally - an overdue appearance from AI Today creator, Dave Thackeray!
What a year it's been. And it's just the beginning.
Join me in taking a look at 2024 and the indisputable delights and miracles coming our way from January...
-
The last barrier to enterprise adoption of AI was memory. Having to bake into every prompt what the algorithm needed to know was enough to send business leaders scurrying for the Luddite hills. But now Google (with Gemini) and Microsoft (whose AI division is led by DeepMind co-founder Mustafa Suleyman) are promising AI with memory. This will change so much about how we use AI - and make things so much more effective, and efficient. Let's get stuck in!
-
Understanding human behaviour is critical to business success.
Behavioural science informs every growth stage and product decision - yet so few businesses pay any attention to human behaviour and psychology.
This new development makes a hugely insightful and practical corpus of psychological and behavioural data useful for everyone.
-
Tired of research and development (R&D) bottlenecks? Today's episode of AI Today explores how AI can supercharge product development by rapidly uncovering game-changing insights from mountains of data and even suggesting testable solutions, accelerating the journey from idea to market.
Discover how AI tools are democratising access to powerful insights, potentially levelling the playing field for smaller companies and fuelling a surge in innovation across all sectors.
-
I've tested 20 AI coding editors. My tech skills are basic, at best. None turned my ideas into apps.
That's when I found Databutton.
And now I'm an app developer.
Listen in to find out how Databutton has given the world's 8 billion inventors a chance to bring their ideas to life...
-
Here at AI Today, we know how to listen.
We spent hours analysing Lenny Rachitsky - host of Lenny's Podcast - interviewing pro prompt engineer Mike Taylor to bring you this deep dive into all the techniques, tools, and tactics to rock your business.
Enjoy this special edition of a very special show...
-
Botto's 15,000 curators are celebrating a big win this week after six of their carefully-chosen, pixel-pushed masterpieces sold for more than $350,000 at a Sotheby's auction in New York. It's a story that belongs in a museum. Just when we thought it was safe to come out after NFTs' baffling popularity flare-up of the very early 2020s, we're here again. At least Beeple created his own art...
-
Imagine if you had massive balls - crystal ones - to accurately forecast future business needs.
That's one of the thousands of ways autonomous agents - popularised in organisations of all sizes through Microsoft Copilot Studio - can build better businesses.
These agents can send reports to your senior leadership team identifying inefficiencies or opportunities across the organisation. Then the HR squad can decide whether to upskill colleagues or hire in new ones. All before shit hits the fan!
Autonomous agents will change everything. Taste the future on today's episode of AI Today!