Episodes
-
Claude 3.5 Sonnet, Anthropic’s newest model, is making waves in the AI community. This mid-size model outshines the larger Claude 3 Opus in tasks like code generation, content creation, and document summarization, and it’s twice as fast. In this episode of The Super Data Science Podcast, Jon Krohn discusses its top-notch performance across benchmarks like MMLU, GPQA, and HumanEval, along with its improved machine vision capabilities. Plus, learn about the new Artifacts UI feature, which makes managing generated content easier by displaying outputs side-by-side with inputs. Tune in to find out why Claude 3.5 Sonnet is setting new standards in AI.
Additional materials: www.superdatascience.com/798
Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. -
Dr. Rosanne Liu, Research Scientist at Google DeepMind and co-founder of the ML Collective, shares her journey and the mission to democratize AI research. She explains her pioneering work on intrinsic dimensions in deep learning and the advantages of curiosity-driven research. Jon and Dr. Liu also explore the complexities of understanding powerful AI models, the specifics of character-aware text encoding, and the significant impact of diversity, equity, and inclusion in the ML community. With publications in NeurIPS, ICLR, ICML, and Science, Dr. Liu offers her expertise and vision for the future of machine learning.
Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.
In this episode you will learn:
• How the ML Collective came about [03:31]
• The concept of a failure CV [16:12]
• ML Collective research topics [19:03]
• How Dr. Liu's work on the “intrinsic dimension” of deep learning models inspired the now-standard LoRA approach to fine-tuning LLMs [21:28]
• The pros and cons of curiosity-driven vs. goal-driven ML research [29:08]
• Discussion on Dr. Liu's research and papers [33:17]
• Character-aware vs. character-blind text encoding [54:59]
• The positive impacts of diversity, equity, and inclusion in the ML community [57:51]
Additional materials: www.superdatascience.com/797 -
Want to feel optimistic about your day? In this Friday episode, Simon Kuestenmacher talks to Jon Krohn about demography: What it is, why it’s so important, and why its forecasts should give us reason to hope for a better future. In an increasingly globalized world, and with an aging population in countries with the biggest GDPs, demography is more valuable than ever.
Additional materials: www.superdatascience.com/796
Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. -
Gina Guillaume-Joseph talks to Jon Krohn about the data and regulatory frameworks set to transform the AI industry and why that’s important to anyone working with data. This episode offers a solid path to understanding AI regulation’s past, present and future. Gina walks listeners through the AI Bill of Rights, the NIST AI Risk Management Framework and the MITRE ATLAS threat model.
This episode is brought to you by AWS Inferentia (go.aws/3zWS0au) and AWS Trainium (go.aws/3ycV6K0), by Crawlbase (crawlbase.com), the ultimate data crawling platform, and by Babbel (https://www.babbel.com/superdata), the science-backed language-learning platform. Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.
In this episode you will learn:
• What “responsible AI” means [08:14]
• Why the federal government should be behind AI regulation [12:22]
• The US vs EU on AI regulation [18:46]
• About the AI Bill of Rights [26:14]
• About MITRE and the MITRE ATLAS [37:19]
• What a systems engineer does [54:11]
Additional materials: www.superdatascience.com/795 -
Trends in open-source AI: Join Jon Krohn and a panel of data science icons as they discuss the most exciting and concerning developments in open-source AI. Hear insights from Drew Conway, Jared Lander, Emily Zabor, and JD Long on the transformative potential of AI and its future impact.
Additional materials: www.superdatascience.com/794
Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information. -
Bayesian methods take the spotlight in this episode with Alex Andorra, co-founder of PyMC Labs, and Jon Krohn. Learn how Bayesian techniques handle tough problems, make the most of prior knowledge, and work wonders with limited data. Alex and Jon break down essentials like PyMC, PyStan, and NumPyro libraries, show how to boost model efficiency with PyTensor, and talk about using ArviZ for top-notch diagnostics and visualizations. Plus, get into advanced modeling with Gaussian Processes.
This episode is brought to you by Crawlbase (https://crawlbase.com), the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.
In this episode you will learn:
• Practical introduction to Bayesian statistics [04:54]
• Definition and significance of epistemology [17:52]
• Explanation of PyMC and Monte Carlo methods [27:57]
• How to get started with Bayesian modeling and PyMC [34:26]
• PyMC Labs and its consulting services [50:50]
• ArviZ for post-modeling diagnostics and visualization [01:02:23]
• Gaussian processes and their applications [01:09:02]
Additional materials: www.superdatascience.com/793 -
Jon Krohn shares his favorite clips from May. Hear how Navdeep Martin is spearheading a company to tackle the climate crisis, why Sol Rashidi and Demetrios Brinkmann find nailing job titles so necessary in the fast-paced industries of tech and AI, and get the latest on embeddings with Luis Serrano.
Additional materials: www.superdatascience.com/792
Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. -
Reinforcement learning from human feedback (RLHF) has come a long way. In this episode, research scientist Nathan Lambert talks to Jon Krohn about the technique’s origins. He also walks through other ways to fine-tune LLMs, and how he believes generative AI might democratize education.
This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and AWS Trainium (https://go.aws/3ycV6K0), and Crawlbase (https://crawlbase.com), the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.
In this episode you will learn:
• Why it is important that AI is open [03:13]
• The efficacy and scalability of direct preference optimization [07:32]
• Robotics and LLMs [14:32]
• The challenges to aligning reward models with human preferences [23:00]
• How to make sure AI’s decision making on preferences reflects desirable behavior [28:52]
• Why Nathan believes AI is closer to alchemy than science [37:38]
Additional materials: www.superdatascience.com/791 -
Experts reveal their top open-source R libraries live from the New York R Conference! This Super Data Science Podcast episode features an exclusive panel with data science trailblazers Drew Conway, Jared Lander, Emily Zabor, and JD Long. They share their favorite R libraries and valuable insights to enhance your data science practice.
Additional materials: www.superdatascience.com/790
Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. -
Machine Learning for Wind Energy is front and center in this episode as Jon Krohn is joined by Dr. Jason Yosinski, CEO of Windscape AI. Dr. Yosinski brings to light the latest ML advancements sparking significant changes in renewable energy. Tune in for a comprehensive review of these cutting-edge technologies and their expansive impact on the industry and the environment's well-being.
This episode is brought to you by Crawlbase (https://crawlbase.com), the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.
In this episode you will learn:
• Enhancing predictability in wind energy with ML [04:52]
• Data utilization from wind turbines by energy providers [11:41]
• Jason's journey into wind energy [17:55]
• Landing the right startup idea [22:47]
• Visualizing neural networks with the Deep Vis Toolbox [31:29]
• Extreme event forecasting at Uber vs. nowcasting at Windscape AI [45:13]
• Discoveries from Loss Change Allocation research [47:48]
• Engaging with Jason's ML Collective [59:46]
• Traits of successful AI entrepreneurs [1:10:26]
Additional materials: www.superdatascience.com/789 -
Multi-agent systems could mark a significant turning point in generative AI. From mastering increasingly complex tasks to getting LLMs to collaborate, in this Five-Minute Friday, Jon Krohn discusses the systems that are working to bridge the remaining gaps left by the latest large language models (LLMs).
Additional materials: www.superdatascience.com/788
Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. -
MLOps, how to build an online community, and tools for scaling LLMs: In this episode, Demetrios Brinkmann speaks to Jon Krohn about the similarities and differences between LLMOps, MLOps and DevOps, and why this should matter to companies looking to hire such engineers. You will also hear how to get involved in the MLOps community wherever you are in the world, and how you can start developing great products with the available tools.
This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and AWS Trainium (https://go.aws/3ycV6K0). Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.
In this episode you will learn:
• What MLOps is [03:51]
• About LLMOps [12:06]
• About LlamaIndex and Ollama [18:29]
• Insights from Demetrios’ MLOps survey [20:49]
• Guidance for using third-party APIs [40:18]
• Recommendations for building an online community in tech and AI [47:07]
Additional materials: www.superdatascience.com/787 -
Learn about the six keys to data science success as host Jon Krohn welcomes back Kirill Eremenko, the mastermind behind SuperDataScience. Kirill shares his top insights on data science careers, from building strong portfolios to leveraging mentors and hands-on labs. With over 2.7 million students, his advice is a must-hear for aspiring and experienced data scientists alike.
Additional materials: www.superdatascience.com/786
Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. -
Dr. Luis Serrano from the Serrano Academy reveals how to make math and Quantum ML accessible, tackles the challenges of teaching AI to beginners, and explores the power of embeddings in enterprise applications. Explore the future of Quantum Machine Learning and the latest trends in AI, including multimodality and autonomous systems.
This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and AWS Trainium (https://go.aws/3ycV6K0). Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.
In this episode you will learn:
• How math and AI can be made easy to understand [05:21]
• The three major categories of learners [16:21]
• Why embeddings are the most important component of LLMs [26:19]
• How semantic search differs from a traditional keyword search [29:57]
• The most exciting emerging application areas for AI [42:41]
• The promising application areas for Quantum Machine Learning [49:18]
Additional materials: www.superdatascience.com/785 -
Aligning LLMs: How can we teach pre-trained LLMs to hold a conversation and learn new information from each other? This was where Sinan Ozdemir began his investigation into aligning LLMs. In this episode, he talks to Jon Krohn about the limitations of definitions for LLMs, training LLMs, and whether it is possible to train an LLM without alignment.
Additional materials: www.superdatascience.com/784
Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. -
Recent advances in GenAI, how to tackle the climate crisis with advanced technology, and addressing the knowledge gap in understanding AI: Jon Krohn speaks to Flypower co-founder and CEO Navdeep Martin about the advances made in GenAI, from products to applications, and how we might use AI to tackle climate change.
This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and AWS Trainium (https://go.aws/3ycV6K0). Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.
In this episode you will learn:
• How the Washington Post’s recommendation systems work [03:29]
• Why product leaders make great CEOs [10:36]
• How Flypower uses GenAI to tackle climate change [22:13]
• How Flypower identifies its customers’ most pertinent questions [30:03]
• How AI might come to tackle climate change [36:52]
• How to mitigate hallucination in AI models [41:04]
Additional materials: www.superdatascience.com/783 -
Hear Jon Krohn’s favorite five clips from his April interviews. Chief Scientist at Posit PBC Hadley Wickham explains the subtle differences between Python and R. Professor of Business Analytics Barrett Thomas walks through the variables that companies should consider when using drones or any other tech to improve their business operations and bottom line. Aleksa Gordić, Founder of Runa AI, believes an overhaul of the current educational system is long overdue. Bernard Marr discusses the future of GenAI and its impact on the world of work. And SuperDataScience founder Kirill Eremenko gives a lively workshop on gradient boosting.
Additional materials: www.superdatascience.com/782
Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. -
Sol Rashidi, a distinguished data executive who has served in C-suite roles at Fortune 100 companies, joins Jon Krohn to delve into successful enterprise AI strategies and the reasons behind the high turnover among Chief Data Officers. This episode provides an in-depth look at selecting AI projects that succeed and understanding the strategic value of patents in various industries. Benefit from Sol’s extensive experience and practical advice on navigating complex corporate challenges.
This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and AWS Trainium (https://go.aws/3ycV6K0). Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.
In this episode you will learn:
• Why CDOs and related roles have such high turnover [09:40]
• The importance of building relationships in AI projects [17:01]
• How Sol's book "The AI Survival Guide" came about [20:44]
• How high-criticality, low-complexity AI projects are the ones with the highest probability of success [27:11]
• How enterprise data security issues can be resolved with technologies like Protopia’s stained-glass data-masking solution [36:10]
• Why having great data engineers is essential [47:57]
• The value of patents [51:45]
Additional materials: www.superdatascience.com/781 -
Want to become a data scientist? Jon and Adam discuss the key steps to becoming a data scientist, with a focus on developing portfolio projects. Hear about the 10 project ideas Adam recommends in his book to help you stand out in the data science community.
Additional materials: www.superdatascience.com/780
Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. -
Tidyverse, ggplot2, and the secret to a tech company’s longevity: Hadley Wickham talks to Jon Krohn about Posit’s rebrand, Tidyverse and why it needs to be in every data scientist’s toolkit, and why getting your hands dirty with open-source projects can be so lucrative for your career.
This episode is brought to you by Intel and HPE Ezmeral Software (https://bit.ly/hpeintel). Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.
In this episode you will learn:
• All about the Tidyverse [04:46]
• Hadley’s favorite R libraries [17:10]
• The goal of Posit [30:29]
• On bringing multiple programming languages together [36:02]
• The principles for a long-lasting tech company [52:10]
• How Hadley developed ggplot2 [55:24]
• How to contribute to the open-source community [1:05:43]
Additional materials: www.superdatascience.com/779