Episodes
-
Hey everyone! Thank you so much for watching the 109th episode of the Weaviate Podcast with Erika Cardenas! Erika, in collaboration with Leonie Monigatti, has recently published "What is Agentic RAG". This blog post was even covered in VentureBeat with additional quotes from Weaviate Co-Founder and CEO Bob van Luijt! This podcast continues the discussion on all things Agentic RAG, covering the basics of Agents, how Agentic RAG changes the game compared to Vanilla RAG systems, Multi-Agent Systems and CrewAI / OpenAI Swarm, Letta, DSPy, and many more! The podcast is also anchored by a discussion of Agentic Generative Feedback Loops and how we are using Agents to improve the quality and expand the capabilities of Generative Feedback Loops!
-
JSON mode has been one of the biggest enablers for working with Large Language Models! JSON mode is even expanding into Multimodal Foundation models! But how exactly is JSON mode achieved?
There are generally 3 paths to JSON mode: (1) constrained generation (such as Outlines), (2) begging the model for a JSON response in the prompt, and (3) a two-stage process of generate-then-format.
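To make paths (2) and (3) concrete, here is a minimal sketch in plain Python. It assumes a hypothetical `llm` callable that takes a prompt string and returns the model's text; the helper names and the `{"answer": ...}` shape are illustrative choices, not something from the paper or from any particular library. Path (1), constrained generation, happens inside the decoder and is not shown here.

```python
import json
from typing import Callable, Optional


def prompt_for_json(question: str) -> str:
    """Path (2): ask for JSON directly in the prompt."""
    return (
        f"{question}\n"
        'Respond only with a JSON object of the form {"answer": "<string>"}.'
    )


def parse_json_or_none(text: str) -> Optional[dict]:
    """Return the parsed object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


def generate_then_format(question: str, llm: Callable[[str], str]) -> Optional[dict]:
    """Path (3): let the model answer freely, then reformat in a second call.

    `llm` is a hypothetical stand-in for whatever client you use.
    """
    free_text = llm(question)
    formatted = llm(
        "Rewrite the following answer as a JSON object "
        'of the form {"answer": "<string>"}:\n' + free_text
    )
    return parse_json_or_none(formatted)
```

Because neither path guarantees valid JSON, both end in a parse step that can fail, which is the reliability gap constrained generation closes.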
I am BEYOND EXCITED to publish the 108th Weaviate Podcast with Zhi Rui Tam, the lead author of "Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models"!
As the title of the paper suggests, although constrained generation is awesome because of its reliability, we may be sacrificing the performance of the LLM by producing our JSON with this method.
The podcast dives into how the paper's experiments reveal this tradeoff, along with the potential and implementation details of Structured Outputs. I particularly love the discussion of incredibly complex Structured Outputs, such as generating 10 values in a single inference.
I hope you enjoy the podcast! As always please reach out if you would like to discuss any of these ideas further!
-
Hey everyone! Thank you so much for watching the 107th episode of the Weaviate Podcast! This one dives into SWE-bench, SWE-agent, and most recently SWE-bench Multimodal with John Yang from Stanford University and Carlos E. Jimenez from Princeton University! One of the most impactful applications of AI we have seen so far is in programming and software engineering! John, Carlos, and team are at the cutting-edge of developing and benchmarking these systems! I learned so much from the conversation and I really hope you find it interesting and useful as well!
-
Hey everyone! I am SUPER excited to publish the 106th episode of the Weaviate Podcast featuring Rose E. Wang!! Rose is a Ph.D. student at Stanford University where she has led incredible research at the cutting-edge of AI applications in Education. The podcast heavily discusses her recent work on Tutor CoPilot! Tutor CoPilot is one of the world's largest randomized controlled trials on the impact AI is having on education, testing 900 tutors and 1,800 students in grades K-12. I think this is such an inspiring study and it is interesting to see the data coming in quantifying the impact AI is having on education. I was amazed by the depth of how Rose thinks about education and learning strategies and how well she integrates cutting-edge topics in AI! I hope you find the podcast interesting and useful!
-
Hey everyone! Thank you so much for tuning into the 105th episode of the Weaviate Podcast! This one features Philip Kiely diving into all sorts of aspects related to Compound AI Systems! We are now seeing far better results with AI models by breaking up tasks into multiple stages and inferences. Philip explains the work they are doing at Baseten to optimize and scale deployments of these emerging systems, covering everything from Structured Generation to how these systems differ from Agents! I hope you find it useful!
-
AI Researchers have overfit to maximizing state-of-the-art accuracy at the expense of the cost to run these AI systems! We need to account for cost during optimization. Even if a chatbot can produce an amazing answer, it isn't that valuable if it costs, say, $5 per response!
I am beyond excited to present the 104th Weaviate Podcast with Sayash Kapoor and Benedikt Stroebl from Princeton Language and Intelligence! Sayash and Benedikt are co-first authors of "AI Agents That Matter"! This is one of my favorite papers I've studied recently, which introduces Pareto Optimal optimization to DSPy and really tames the chaos of Agent benchmarking!
This was such a fun conversation! I am beyond grateful to have met them both and to feature their research on the Weaviate Podcast! I hope you find it interesting and useful!
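As a rough illustration of what accounting for cost alongside accuracy can look like, here is a minimal sketch that picks out the Pareto-optimal systems from a set of (cost, accuracy) results. The `Result` type and `pareto_frontier` function are hypothetical names for this example, and this is not the procedure from "AI Agents That Matter", just the basic dominance check behind the idea.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    name: str
    cost: float      # dollars per response
    accuracy: float  # fraction of tasks solved


def pareto_frontier(results: List[Result]) -> List[Result]:
    """Keep every result that no other result dominates.

    A result is dominated when another one is at least as cheap and at
    least as accurate, and strictly better on one of the two.
    """
    frontier = []
    for r in results:
        dominated = any(
            o.cost <= r.cost
            and o.accuracy >= r.accuracy
            and (o.cost < r.cost or o.accuracy > r.accuracy)
            for o in results
        )
        if not dominated:
            frontier.append(r)
    # Sort cheapest first so the cost / accuracy tradeoff reads left to right.
    return sorted(frontier, key=lambda r: r.cost)
```

Any system that is off this frontier is both more expensive and no more accurate than some alternative, so it is hard to justify regardless of its headline accuracy.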
-
I am beyond excited to publish our interview with Krista Opsahl-Ong from Stanford University! Krista is the lead author of MIPRO, short for Multi-prompt Instruction Proposal Optimizer, and one of the leading developers and scientists behind DSPy!
This was such a fun discussion beginning with the motivation of Automated Prompt Engineering, Multi-Layer Language Programs (also commonly referred to as Compound AI Systems), and their intersection. We then dove into the details of how MIPRO achieves this and miscellaneous topics in AI from Structured Outputs to Agents, DSPy for Code Generation, and more!
I really hope you enjoy the podcast! As always, more than happy to answer any questions or discuss any ideas about the content in the podcast!
-
AI is completely transforming how we build software! But how exactly? What does it mean for a software application to be AI-Native versus AI-Enabled? How many other aspects of software development and creativity are impacted by AI?
I am super excited to publish our 102nd Weaviate Podcast with Guy Podjarny and Bob van Luijt on AI-Native Development!
Guy Podjarny is a co-founder of Snyk, a remarkably successful Cybersecurity company. He is now back on the founder journey, diving into AI-Native Development with Tessl!
Guy and Bob both have so much expertise in how software is developed and shipped to the world. There are so many interesting nuggets in this from defining AI-Native to Stateful AI, AI-assisted coding, subjectivity in software specification, personalized content, and much more!
I hope you enjoy the podcast, this was a really fun and interesting one!
-
Hey everyone! Thank you so much for watching the 101st episode of the Weaviate Podcast with Devin Petersohn! Devin is the creator of Modin, one of the world's most advanced systems for scaling Pandas! Devin then went on to co-found Ponder, which was acquired by Snowflake in early 2023. This was one of my favorite podcasts of all time! I learned so much about the internals of Data Systems and I hope you do as well!
-
What is an AI-native application? This has been one of the questions we are most interested in answering at Weaviate! This podcast explores this question with Weaviate Co-founder Bob van Luijt and Lucas Negritto. Formerly at OpenAI, Lucas is now building Odapt, a remarkable example of such an application where we no longer use front-end code, instead rendering the UI entirely within the generative model!! There are many interesting topics covered, starting with how this works and how you build these systems, as well as native multimodality, subjective feedback, and more! I hope you find the podcast interesting and useful!
-
Liana Patel is a Ph.D. student at Stanford University who is the lead author of ACORN, a breakthrough in Approximate Nearest Neighbor Search with Filters! Also joining the podcast is Abdel Rodriguez, a Vector Index Researcher and Engineer at Weaviate. This podcast dives into all sorts of details behind ACORN, starting with how Liana developed her interest in Approximate Nearest Neighbor Search algorithms and then transitioning into how ACORN differs from previous approaches, the Two-Hop Neighborhood Heuristic, Predicate Subgraphs, Experimental Details, and many more topics! Major thank you to Liana and Abdel for joining the podcast! This was such a fun conversation packed with insights about Proximity Graph algorithms for Vector Search with Filtering!
-
Josh Engels is a Ph.D. student at MIT who has published several works advancing the state of the art in Vector Search. Josh has recently developed the Window Search Tree (WST), a new algorithm targeted at improving Filtered Vector Search. More specifically, the WST algorithm targets Filtered Search with continuous-valued filters such as "price" or "date", also known as range filters. This is a huge application for Vector Databases and it was incredible getting to pick Josh's brain on how this works and the state of Approximate Nearest Neighbor Search!
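To pin down what a range-filtered query is asking for, here is a minimal brute-force sketch: keep only the items whose "price" falls inside the range, then rank the survivors by cosine similarity. This is an exact baseline, not the Window Search Tree algorithm, and the item layout (dicts with `id`, `vector`, and `price` keys) is an assumption made for the example.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def range_filtered_search(
    query: Sequence[float],
    items: List[Dict],
    low: float,
    high: float,
    k: int = 2,
) -> List[Dict]:
    """Exact search: apply the range filter first, then rank by similarity."""
    candidates = [item for item in items if low <= item["price"] <= high]
    candidates.sort(key=lambda item: cosine(query, item["vector"]), reverse=True)
    return candidates[:k]
```

This scans every item on every query, which is exactly the cost that approximate, filter-aware indexes try to avoid.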
-
Hey everyone! I am SUPER excited to publish our 97th Weaviate Podcast on the state of AI-powered Search technology featuring Nils Reimers and Erika Cardenas! Erika and I have been super excited about Cohere's latest works to advance RAG and Search and it was amazing getting to pick Nils' brain about all these topics!
We began with the development of Compass! Nils explains the current problem with embeddings as a soup!! For example, imagine embedding this video description: the first part is about the launch of a podcast, whereas this part is about an embedding algorithm. How do we form representations of multi-aspect chunks of text?
We dove into all the details of this, from how multi-aspect embeddings differ from LLM or "smart" chunkers, ColBERT, and "Embed Small, Retrieve Big", to many other topics such as Cross Encoder Re-rankers, Data Cleaning with Generative Feedback Loops, RAG Evaluation, Vector Quantization, and more!
It was such an educational experience for Erika and me, and we really hope you enjoy the podcast as well!
-
Hey everyone! Thank you so much for watching the 96th episode of the Weaviate Podcast featuring Letitia Parcalabescu! While completing her Ph.D. studies at the University of Heidelberg, Letitia started her YouTube channel: AI Coffee Break with Letitia! Her videos break down complex concepts in AI with a creative mix of technical expertise and visualizations unlike anyone else in the space!
We began the podcast by discussing our shared background in creating content on YouTube, from getting started to plans for the future and everything in between!
We then discussed the evolution of Deep Learning over the last few years, from neural network architectures to datasets, tasks, learning algorithms, and more! I think we are at a particularly interesting time for the future of learning algorithms! We discussed DSPy and new ways of thinking about instruction tuning, example production, gradient descent, and the future of SFT vs. DPO-style techniques!
-
Hey everyone, thank you so much for watching the 95th Weaviate Podcast! We are beyond honored to feature Dai Vu from Google on this one, alongside Weaviate Co-Founder Bob van Luijt! This podcast dives into all things Google Cloud Marketplace and the state of AI, beginning with the proliferation of Open-Source models and how Dai sees the evolving landscape with respect to things like Gemini 1.5 Pro, Gemini Nano, and Gemma, as well as the integration of third-party models such as Llama 3 on Google Cloud platforms such as Vertex AI. Bob and Dai continue to unpack the next move for open-source infrastructure providers and perspectives around "AI-Native" applications, trends in data gravity, perspectives on benchmarking, and Dai's "aha" moment in AI!
-
As you are graduating from ideas to engineering, one of the key concepts to be aware of is Parallel Computing and Concurrency. I am SUPER excited to share our 94th Weaviate podcast with Magdalen Dobson Manohar! Magdalen is one of the most impressive scientists I have ever met, having completed her undergraduate studies at MIT before joining Carnegie Mellon University to study Approximate Nearest Neighbor Search and develop ParlayANN. ParlayANN is one of the most enlightening works I have come across that studies how to build ANN indexes in parallel without the use of locking.
In my opinion, this is the most insightful podcast we have ever produced on Vector Search, the core technology behind Vector Databases. The podcast begins with Magdalen’s journey into ANN science and the issue of Lock Contention in HNSW, then moves on to HNSW vs. DiskANN vs. HCNNG and pyNNDescent, ParlayIVF, how Parallel Index Construction is achieved, conclusions from experimentation, Filtered Vector Search, Out of Distribution Vector Search, and exciting directions for the future!
I also want to give a huge thanks to Etienne Dilocker, John Trengrove, Abdel Rodriguez, Asdine El Hrychy, and Zain Hasan. There is no way I would be able to keep up with conversations like this without their leadership and collaboration.
I hope you find the podcast interesting and useful!
-
Hey everyone! I am SUPER excited to publish our newest Weaviate podcast with Kyle Davis, the creator of RAGKit! At a high level, the podcast covers our understanding of RAG systems through 4 key areas: (1) Ingest / ETL, (2) Search, (3) Generate / Agents, and (4) Evaluation. Discussing these led to all sorts of topics from Knowledge Graph RAG, to Function Calling and Tool Selection, Re-ranking, Quantization, and many more!
This discussion forced me to re-think many of my previously held beliefs about the current RAG stack, particularly the definition of “Agents”. I came in believing that the best way of viewing “Agents” is as an abstraction on top of multiple pipelines, such as an “Email Agent”, but Kyle presented the idea of looking at “Agents” as scoping the tools each LLM call is connected to, such as `read_email` or `calculator`. Would love to know what people think about this one, as I think getting a consensus definition of “Agents” can clarify a lot of the current confusion for people building with LLMs / Generative AI.
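One way to picture the "Agents as tool scopes" view is the small sketch below, where an agent is nothing more than a name plus the set of tools its LLM calls are allowed to reach. The `Agent` class and the stubbed `read_email` and `calculator` bodies are hypothetical illustrations for this note, not RAGKit's API or Kyle's implementation.

```python
import operator
from dataclasses import dataclass, field
from typing import Callable, Dict

# Arithmetic supported by the toy calculator tool.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def read_email(query: str) -> str:
    """Stub tool: a real version would query a mailbox."""
    return f"(stub) no emails matching {query!r}"


def calculator(expression: str) -> str:
    """Toy tool: evaluates 'a <op> b', e.g. '2 * 21'."""
    left, op, right = expression.split()
    return str(OPS[op](float(left), float(right)))


@dataclass
class Agent:
    """An agent defined purely by which tools it can call."""
    name: str
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def call_tool(self, tool_name: str, argument: str) -> str:
        # Refuse anything outside this agent's scope.
        if tool_name not in self.tools:
            raise PermissionError(f"{self.name} has no access to {tool_name}")
        return self.tools[tool_name](argument)


email_agent = Agent("email_agent", {"read_email": read_email})
math_agent = Agent("math_agent", {"calculator": calculator})
```

Under this framing, an “Email Agent” is not a separate pipeline; it is the same LLM loop with a different tool scope.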
-
I've seen a lot of interest around RAG for X application domain: Legal, Accounting, Healthcare, ... David and Kevin are maybe the best example of this I have seen so far, pivoting from Neum AI to VetRec!
We begin the podcast by discussing the decision to switch gears, the advice given by Y Combinator, and David's experience in learning a new application domain.
We then continue to discuss technical opportunities around RAG for Veterinarians, such as SOAP notes and Differential Diagnosis!
We conclude with David's thoughts on the ETL space, companies like Unstructured and LlamaIndex's LlamaParse, advice for specific focus in ETL, and general discussions of ETL for Vector DBs / KGs / SQL.
David and Kevin have been two of my favorite entrepreneurs I've met during my time at Weaviate! They do an amazing job of writing content that helps you live vicariously through them as they take on this opportunity to apply RAG and AI technologies to help Veterinarians!
I really hope you enjoy the podcast!
-
Voyage AI is the newest giant in the embedding, reranking, and search model game!
I am SUPER excited to publish our latest Weaviate podcast with Tengyu Ma, Co-Founder of Voyage AI and Assistant Professor at Stanford University!
We began the podcast with a deep dive into everything embedding model training and contrastive learning theory. Tengyu delivered a masterclass in everything from scaling laws to multi-vector representations, neural architectures, representation collapse, data augmentation, semantic similarity, and more! I am beyond impressed with Tengyu's extensive knowledge and explanations of all these topics.
The next chapter dives into a case study Voyage AI did fine-tuning an embedding model for the LangChain documentation. This is an absolutely fascinating example of the role of continual fine-tuning with very new concepts (for example, very few people were talking about chaining together LLM calls 2 years ago), as well as the data efficiency advances in fine-tuning.
We concluded by discussing ML systems challenges in serving an embeddings API, particularly the challenge of detecting whether a request is for batch or query inference and the optimizations that go into either achieving, say, ~100ms latency for a query embedding or maximizing throughput for batch embeddings.
-
One of the core values of DSPy is the ability to add “reasoning modules” such as Chain-of-Thought to your LLM programs!
For example, Chain-of-Thought describes prompting the LLM with “Let’s think step by step …”. Interestingly, this meta-prompt around asking the LLM to think this way dramatically improves performance in tasks like question answering or document summarization.
Self-Discover is a meta-prompting technique that searches for the optimal thinking primitives to integrate into your program! For example, you could add “Let’s think out of the box to arrive at a creative solution” or “Please explain your answer in 4 levels of abstraction: as if you are talking to a five year old, a high school student, a college student studying Computer Science, and a software engineer with years of experience in the topic”.
I am SUPER excited to be publishing our 90th Weaviate Podcast with Chris Dossman! Chris has implemented Self-Discover in DSPy, one of the most fascinating examples so far of what the DSPy framework is capable of!
Chris is also one of the most talented entrepreneurs I have met during my time at Weaviate thanks to introductions from Bob van Luijt and Byron Voorbach. Chris built one of the earliest RAG systems for government information and is now working on LLM opportunities in marketing with his new startup, Dicer.ai!
I hope you enjoy the podcast, it was such a fun one and I learned so much!