Episodes

  • AI researchers have overfit to maximizing state-of-the-art accuracy at the expense of the cost to run these AI systems! We need to account for cost during optimization. Even if a chatbot can produce an amazing answer, it isn't that valuable if it costs, say, $5 per response!

    I am beyond excited to present the 104th Weaviate Podcast with Sayash Kapoor and Benedikt Stroebl from Princeton Language and Intelligence! Sayash and Benedikt are co-first authors of "AI Agents That Matter"! This is one of my favorite papers I've studied recently; it introduces Pareto-optimal optimization to DSPy and really tames the chaos of agent benchmarking!

    This was such a fun conversation! I am beyond grateful to have met them both and to feature their research on the Weaviate Podcast! I hope you find it interesting and useful!
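
    To make the cost-accuracy idea concrete, here is a minimal sketch of a Pareto-frontier filter over agent configurations (all names and numbers are hypothetical illustrations, not from the paper):

    ```python
    # Minimal sketch of cost-aware evaluation (all names hypothetical):
    # a config is Pareto-optimal if no other config is at least as cheap
    # AND at least as accurate, with one of the two strictly better.
    def pareto_frontier(configs):
        """configs: list of dicts with 'name', 'accuracy', 'cost' keys."""
        frontier = []
        for c in configs:
            dominated = any(
                o["cost"] <= c["cost"] and o["accuracy"] >= c["accuracy"]
                and (o["cost"] < c["cost"] or o["accuracy"] > c["accuracy"])
                for o in configs
            )
            if not dominated:
                frontier.append(c)
        return sorted(frontier, key=lambda c: c["cost"])

    configs = [
        {"name": "frontier-model-agent", "accuracy": 0.92, "cost": 4.80},
        {"name": "small-model-agent", "accuracy": 0.88, "cost": 0.35},
        {"name": "retry-loop-agent", "accuracy": 0.87, "cost": 2.10},
    ]
    print(pareto_frontier(configs))  # retry-loop-agent is dominated
    ```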

  • I am beyond excited to publish our interview with Krista Opsahl-Ong from Stanford University! Krista is the lead author of MIPRO, short for Multi-prompt Instruction Proposal Optimizer, and one of the leading developers and scientists behind DSPy!

    This was such a fun discussion beginning with the motivation of Automated Prompt Engineering, Multi-Layer Language Programs (also commonly referred to as Compound AI Systems), and their intersection. We then dove into the details of how MIPRO achieves this and miscellaneous topics in AI from Structured Outputs to Agents, DSPy for Code Generation, and more!

    I really hope you enjoy the podcast! As always, more than happy to answer any questions or discuss any ideas about the content in the podcast!

  • AI is completely transforming how we build software! But how exactly? What does it mean for a software application to be AI-Native versus AI-Enabled? How many other aspects of software development and creativity are impacted by AI?

    I am super excited to publish our 102nd Weaviate Podcast with Guy Podjarny and Bob van Luijt on AI-Native Development!

    Guy Podjarny is a co-founder of Snyk, a remarkably successful Cybersecurity company. He is now back on the founder journey, diving into AI-Native Development with Tessl!

    Guy and Bob both have so much expertise in how software is developed and shipped to the world. There are so many interesting nuggets in this one, from defining AI-Native to Stateful AI, AI-assisted coding, subjectivity in software specification, personalized content, and much more!

    I hope you enjoy the podcast; this was a really fun and interesting one!

  • Hey everyone! Thank you so much for watching the 101st episode of the Weaviate Podcast with Devin Petersohn! Devin is the creator of Modin, one of the world's most advanced systems for scaling Pandas! Devin then went on to co-found Ponder, which was acquired by Snowflake in early 2023. This was one of my favorite podcasts of all time; I learned so much about the internals of data systems and I hope you do as well!
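
    Modin's headline trick is scaling Pandas by swapping a single import. A minimal example following its documented usage pattern (the file name is a placeholder):

    ```python
    # Modin parallelizes pandas operations across cores by changing one import.
    # Requires an execution engine such as Ray or Dask to be installed.
    import modin.pandas as pd  # instead of: import pandas as pd

    df = pd.read_csv("large_file.csv")       # the read is partitioned across cores
    result = df.groupby("category").mean()   # same pandas API, parallel execution
    ```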

  • What is an AI-native application? This has been one of the questions we are most interested in answering at Weaviate! This podcast explores this question with Weaviate Co-founder Bob van Luijt and Lucas Negritto. Formerly at OpenAI, Lucas is now building Odapt, a remarkable example of such an application in which there is no hand-written front-end code; instead, the UI is rendered entirely within the generative model! The podcast covers many interesting topics, starting of course with how this works and how you build these systems, as well as native multimodality, subjective feedback, and more! I hope you find the podcast interesting and useful!
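
    As a toy sketch of the general pattern (my illustration, not Odapt's actual implementation; the model name is a placeholder), the "front end" becomes a model call that renders HTML from application state:

    ```python
    # Toy sketch of a "no front-end code" UI (my illustration, not Odapt's
    # actual implementation; "gpt-4o" is a placeholder model name): the
    # model renders HTML directly from application state on each interaction.
    from openai import OpenAI

    client = OpenAI()
    app_state = {"user": "Ada", "unread_messages": 3, "theme": "dark"}

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": f"Render an HTML dashboard for this app state: {app_state}. "
                       "Return only HTML, no explanation.",
        }],
    )
    html = response.choices[0].message.content  # served directly as the UI
    ```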

  • Liana Patel is a Ph.D. student at Stanford University who is the lead author of ACORN, a breakthrough in Approximate Nearest Neighbor Search with Filters! Also joining the podcast is Abdel Rodriguez, a Vector Index Researcher and Engineer at Weaviate. This podcast dives into all sorts of details behind ACORN, starting with how Liana developed her interest in Approximate Nearest Neighbor Search algorithms and then transitioning into how ACORN differs from previous approaches, the Two-Hop Neighborhood Heuristic, Predicate Subgraphs, Experimental Details, and many more topics! Major thank you to Liana and Abdel for joining the podcast; this was such a fun conversation packed with insights about Proximity Graph algorithms for Vector Search with Filtering!
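
    A simplified sketch of the core traversal idea as I understand it (helper names hypothetical, not Liana's implementation): neighbors that fail the filter are not returned, but their own neighbors are expanded, approximating search over the predicate subgraph without ever materializing it:

    ```python
    import heapq

    # Simplified sketch of ACORN-style predicate-aware graph search
    # (hypothetical helpers). When a neighbor fails the filter, we expand
    # one hop further so traversal stays connected on the predicate subgraph.
    def filtered_greedy_search(graph, dist, query, entry, predicate, k):
        visited = {entry}
        candidates = [(dist(query, entry), entry)]
        results = []
        while candidates:
            d, node = heapq.heappop(candidates)
            if predicate(node):
                results.append((d, node))
            one_hop = graph[node]
            # Two-hop expansion: replace filtered-out neighbors with their neighbors.
            frontier = [n for n in one_hop if predicate(n)]
            for n in one_hop:
                if not predicate(n):
                    frontier.extend(graph[n])
            for n in frontier:
                if n not in visited:
                    visited.add(n)
                    heapq.heappush(candidates, (dist(query, n), n))
            if len(results) >= k:
                break
        return sorted(results)[:k]
    ```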

  • Josh Engels is a Ph.D. student at MIT who has published several works advancing the state of the art in Vector Search. Josh has recently developed the Window Search Tree, a new algorithm particularly targeted at improving Filtered Vector Search. More specifically, the WST algorithm targets Filtered Search with continuous-valued filters such as "price" or "date", also known as range filters. This is a huge application for Vector Databases, and it was incredible getting to pick Josh's brain on how this works and the state of Approximate Nearest Neighbor Search!
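
    A rough sketch of the data-structure idea as I understand it (my paraphrase with hypothetical helpers, not the actual WST implementation): build a balanced tree over points sorted by the filter attribute, give each node an ANN index over its window, and answer a range-filtered query via the O(log n) nodes whose windows tile the query range:

    ```python
    # Rough sketch of a segment-tree-style structure for range-filtered
    # vector search (my paraphrase, not the actual WST implementation).
    # Assumes `points` is sorted by the filter attribute (e.g. price or
    # date) and index.search returns (distance, id) tuples.
    class WindowNode:
        def __init__(self, points, build_index):
            self.lo, self.hi = points[0].attr, points[-1].attr
            self.index = build_index(points)          # ANN index over this window
            if len(points) > 1:
                mid = len(points) // 2
                self.left = WindowNode(points[:mid], build_index)
                self.right = WindowNode(points[mid:], build_index)
            else:
                self.left = self.right = None

        def query(self, vec, lo, hi, k):
            if hi < self.lo or lo > self.hi:
                return []                              # window disjoint from range
            if lo <= self.lo and self.hi <= hi:
                return self.index.search(vec, k)       # window fully covered
            hits = []
            for child in (self.left, self.right):
                if child:
                    hits.extend(child.query(vec, lo, hi, k))
            return sorted(hits)[:k]                    # merge children's results
    ```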

  • Hey everyone! I am SUPER excited to publish our 97th Weaviate Podcast on the state of AI-powered Search technology featuring Nils Reimers and Erika Cardenas! Erika and I have been super excited about Cohere's latest works to advance RAG and Search and it was amazing getting to pick Nils' brain about all these topics!

    We began with the development of Compass! Nils explains the current problem of embeddings acting as a "soup" of multiple topics! For example, imagine embedding this video description: the first part is about the launch of a podcast, whereas this part is about an embedding algorithm -- how do we form representations of multi-aspect chunks of text?

    We dove into all the details of this, from the distinction between multi-aspect embeddings and LLM or "smart" chunkers, to ColBERT and "Embed Small, Retrieve Big", and many other topics as well, from Cross Encoder Re-rankers to Data Cleaning with Generative Feedback Loops, RAG Evaluation, Vector Quantization, and more!
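
    A minimal sketch of the "Embed Small, Retrieve Big" pattern as I understand it (the chunker, embedder, and store are hypothetical): index fine-grained chunks so each vector stays closer to single-aspect, then return the larger parent passage at query time:

    ```python
    # Minimal sketch of "Embed Small, Retrieve Big" (hypothetical helpers):
    # embed small single-aspect chunks, map hits back to larger parents.
    def index_document(doc_id, text, chunker, embed, store):
        for i, chunk in enumerate(chunker(text, max_tokens=128)):
            store.add(vector=embed(chunk), metadata={"doc_id": doc_id, "chunk": i})

    def retrieve_big(query, embed, store, docs, k=5):
        hits = store.search(embed(query), k=k)
        parent_ids = {h.metadata["doc_id"] for h in hits}  # dedupe to parents
        return [docs[d] for d in parent_ids]               # return full passages
    ```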

    I really hope you enjoy the podcast! It was such an educational experience for Erika and me, and we really hope you enjoy it as well!

  • Hey everyone! Thank you so much for watching the 96th episode of the Weaviate podcast featuring Letitia Parcalabescu! While completing her Ph.D. studies at the University of Heidelberg, Letitia started her YouTube channel: AI Coffee Break with Letitia! Her videos break down complex concepts in AI with a creative mix of technical expertise and visualizations unlike anyone else in the space!

    We began the podcast by discussing our shared background in creating content on YouTube, from getting started, to plans for the future, and everything else in between!

    We then discussed the evolution of Deep Learning over the last few years -- from neural network architectures to datasets, tasks, learning algorithms, and more! I think we are at a particularly interesting time in the future of learning algorithms! We discussed DSPy and new ways of thinking about instruction tuning, example production, gradient descent, and the future of SFT vs. DPO-style techniques!

  • Hey everyone, thank you so much for watching the 95th Weaviate Podcast! We are beyond honored to feature Dai Vu from Google on this one, alongside Weaviate Co-Founder Bob van Luijt! This podcast dives into all things Google Cloud Marketplace and the state of AI. It begins with the proliferation of open-source models and how Dai sees the evolving landscape with respect to things like Gemini Pro 1.5, Gemini Nano, and Gemma, as well as the integration of 3rd-party model providers such as Llama 3 on Google Cloud platforms such as Vertex AI. Bob and Dai then unpack the next move for open-source infrastructure providers and perspectives around "AI-Native" applications, trends in data gravity, perspectives on benchmarking, and Dai's "aha" moment in AI!

  • As you are graduating from ideas to engineering, one of the key concepts to be aware of is Parallel Computing and Concurrency. I am SUPER excited to share our 94th Weaviate podcast with Magdalen Dobson Manohar! Magdalen is one of the most impressive scientists I have ever met, having completed her undergraduate studies at MIT before joining Carnegie Mellon University to study Approximate Nearest Neighbor Search and develop ParlayANN. ParlayANN is one of the most enlightening works I have come across that studies how to build ANN indexes in parallel without the use of locking.

    In my opinion, this is the most insightful podcast we have ever produced on Vector Search, the core technology behind Vector Databases. The podcast begins with Magdalen's journey into ANN science and the issue of Lock Contention in HNSW, then details HNSW vs. DiskANN vs. HCNNG and pyNNDescent, ParlayIVF, how Parallel Index Construction is achieved, conclusions from experimentation, Filtered Vector Search, Out-of-Distribution Vector Search, and exciting directions for the future!
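
    To give a flavor of how batch-parallel construction can avoid locking (ParlayANN itself is C++ with lock-free primitives; this Python sketch with hypothetical helpers is a heavy simplification), each round searches the frozen graph in parallel and only then applies updates:

    ```python
    from concurrent.futures import ThreadPoolExecutor

    # Heavily simplified sketch of batch-parallel index construction
    # (hypothetical helpers; ParlayANN's real algorithm is more involved).
    # Within a round, candidate search is read-only against the graph as
    # it stood after the previous round, so threads never contend on writes.
    def build_index(points, batch_size, search_fn, prune_fn):
        graph = {}  # node id -> list of neighbor ids
        for start in range(0, len(points), batch_size):
            batch = list(range(start, min(start + batch_size, len(points))))
            with ThreadPoolExecutor() as pool:
                candidates = list(pool.map(lambda i: search_fn(points[i], graph), batch))
            # Update phase: apply all new edges after the parallel search finishes.
            for i, cands in zip(batch, candidates):
                graph[i] = prune_fn(points[i], cands)
        return graph
    ```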

    I also want to give a huge thanks to Etienne Dilocker, John Trengrove, Abdel Rodriguez, Asdine El Hrychy, and Zain Hasan. There is no way I would be able to keep up with conversations like this without their leadership and collaboration.

    I hope you find the podcast interesting and useful!

  • Hey everyone! I am SUPER excited to publish our newest Weaviate podcast with Kyle Davis, the creator of RAGKit! At a high level, the podcast covers our understanding of RAG systems through 4 key areas: (1) Ingest / ETL, (2) Search, (3) Generate / Agents, and (4) Evaluation. Discussing these led to all sorts of topics, from Knowledge Graph RAG to Function Calling and Tool Selection, Re-ranking, Quantization, and many more!

    This discussion forced me to re-think many of my previously held beliefs about the current RAG stack, particularly the definition of “Agents”. I came in believing that the best way of viewing “Agents” is as an abstraction on top of multiple pipelines, such as an “Email Agent”, but Kyle presented the idea of looking at “Agents” as scoping the tools each LLM call is connected to, such as `read_email` or `calculator`. Would love to know what people think about this one, as I think getting a consensus definition of “Agents” can clarify a lot of the current confusion for people building with LLMs / Generative AI.
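
    A toy sketch of Kyle's framing as I understood it (the plumbing is hypothetical; the tool names come from the episode): an "agent" is an LLM call scoped to a small set of tools, rather than an abstraction over whole pipelines:

    ```python
    # Toy sketch of "agents as tool scoping" (plumbing hypothetical):
    # each LLM call is given only the tools relevant to its job.
    def read_email(inbox: str) -> list[str]: ...
    def calculator(expression: str) -> float: ...
    def search_docs(query: str) -> list[str]: ...

    email_agent = {"tools": [read_email, calculator]}   # scoped to email tasks
    support_agent = {"tools": [search_docs]}            # scoped to doc lookup

    def run_agent(agent, llm_call, task):
        # The LLM only sees (and can only call) this agent's tools.
        return llm_call(task, tools=agent["tools"])
    ```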

  • I've seen a lot of interest in RAG for specific application domains: Legal, Accounting, Healthcare, .... David and Kevin are maybe the best example of this I have seen so far, pivoting from Neum AI to VetRec!

    We begin the podcast by discussing the decision to switch gears, the advice given by Y Combinator, and David's experience in learning a new application domain.

    We then continue to discuss technical opportunities around RAG for Veterinarians, such as SOAP notes and Differential Diagnosis!

    We conclude with David's thoughts on the ETL space, companies like Unstructured and LlamaIndex's LlamaParse, advice for specific focus in ETL, and general discussions of ETL for Vector DBs / KGs / SQL.

    David and Kevin have been two of my favorite entrepreneurs I've met during my time at Weaviate! They do an amazing job of writing content that helps you live vicariously through them as they take on this opportunity to apply RAG and AI technologies to help Veterinarians!

    I really hope you enjoy the podcast!

  • Voyage AI is the newest giant in the embedding, reranking, and search model game!

    I am SUPER excited to publish our latest Weaviate podcast with Tengyu Ma, Co-Founder of Voyage AI and Assistant Professor at Stanford University!

    We began the podcast with a deep dive into everything embedding model training and contrastive learning theory. Tengyu delivered a masterclass in everything from scaling laws to multi-vector representations, neural architectures, representation collapse, data augmentation, semantic similarity, and more! I am beyond impressed with Tengyu's extensive knowledge and explanations of all these topics.

    The next chapter dives into a case study Voyage AI did fine-tuning an embedding model for the LangChain documentation. This is an absolutely fascinating example of the role of continual fine-tuning with very new concepts (for example, very few people were talking about chaining together LLM calls 2 years ago), as well as the data efficiency advances in fine-tuning.

    We concluded by discussing ML systems challenges in serving an embeddings API, particularly the challenge of detecting whether a request is for batch or query inference, and the optimizations that go into each: say, ~100ms latency for a query embedding, or maximizing throughput for batch embeddings.
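
    A toy sketch of that routing decision (the threshold and model handles are hypothetical): small requests take a latency-optimized path, large ones a throughput-optimized batch path:

    ```python
    # Toy sketch of routing embedding requests (threshold and handlers
    # hypothetical): single queries want ~100ms latency, large batches
    # want maximum throughput.
    def embed_request(texts, low_latency_model, batched_model, batch_threshold=16):
        if len(texts) <= batch_threshold:
            # Latency path: runs immediately on a small padded batch.
            return low_latency_model.embed(texts)
        # Throughput path: sort by length to minimize padding, then run
        # large micro-batches that saturate the accelerator.
        order = sorted(range(len(texts)), key=lambda i: len(texts[i]))
        embedded = batched_model.embed([texts[i] for i in order])
        out = [None] * len(texts)
        for pos, i in enumerate(order):
            out[i] = embedded[pos]   # restore the original request order
        return out
    ```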

  • One of the core values of DSPy is the ability to add “reasoning modules” such as Chain-of-Thought to your LLM programs!

    For example, Chain-of-Thought describes prompting the LLM with “Let’s think step by step …”. Interestingly, this meta-prompt around asking the LLM to think this way dramatically improves performance in tasks like question answering or document summarization.

    Self-Discover is a meta-prompting technique that searches for the optimal thinking primitives to integrate into your program! For example, you could prompt with “Let’s think out of the box to arrive at a creative solution” or “Please explain your answer in 4 levels of abstraction: as if you are talking to a five year old, a high school student, a college student studying Computer Science, and a software engineer with years of experience in the topic”.
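
    As a toy sketch of what "searching for thinking primitives" could look like (the selection loop and dev-set objects are hypothetical; the primitives are quoted from the examples above), evaluate each candidate on a dev set and keep the winner:

    ```python
    # Toy sketch of searching over reasoning primitives (selection loop
    # and dev_set objects hypothetical; primitives quoted from above).
    PRIMITIVES = [
        "Let's think step by step.",
        "Let's think out of the box to arrive at a creative solution.",
        "Please explain your answer in 4 levels of abstraction.",
    ]

    def select_primitive(llm, dev_set, metric):
        scores = {}
        for prim in PRIMITIVES:
            preds = [llm(f"{prim}\n\n{ex.question}") for ex in dev_set]
            scores[prim] = sum(metric(p, ex.answer) for p, ex in zip(preds, dev_set))
        return max(scores, key=scores.get)  # best-performing thinking primitive
    ```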

    I am SUPER excited to be publishing our 90th Weaviate Podcast with Chris Dossman! Chris has implemented Self-Discover in DSPy, one of the most fascinating examples so far of what the DSPy framework is capable of!

    Chris is also one of the most talented entrepreneurs I have met during my time at Weaviate thanks to introductions from Bob van Luijt and Byron Voorbach. Chris built one of the earliest RAG systems for government information and is now working on LLM opportunities in marketing with his new startup, Dicer.ai!

    I hope you enjoy the podcast; it was such a fun one and I learned so much!

  • Hey everyone! Thank you so much for watching the 89th Weaviate Podcast on Matryoshka Representation Learning! I am beyond grateful to be joined by Aditya Kusupati, the lead author of Matryoshka Representation Learning; Zach Nussbaum, a Machine Learning Engineer at Nomic AI bringing these embeddings to production; and my Weaviate colleague Zain Hasan, who has done amazing research on Matryoshka Embeddings! We think this is a super powerful development for Vector Search! This podcast covers all sorts of details, from what Matryoshka embeddings are in general, to the challenges of training them, experiences building an embeddings API product at Nomic AI and how it ties in with Nomic Atlas, Aditya's research on differentiable ANN indexes, and many more topics! This was such a fun one; I really hope you find it useful! Please let us know what you think!
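
    The practical payoff is easy to show in a minimal sketch (assuming an MRL-trained model; a plain embedding model would degrade badly under truncation): keep a prefix of the dimensions and renormalize:

    ```python
    import numpy as np

    # Minimal sketch: Matryoshka-trained embeddings remain useful when
    # truncated to a prefix of their dimensions (assumes MRL training).
    def truncate(embedding: np.ndarray, dim: int) -> np.ndarray:
        prefix = embedding[:dim]
        return prefix / np.linalg.norm(prefix)  # renormalize for cosine search

    full = np.random.randn(768)   # stand-in for a 768-d embedding
    small = truncate(full, 256)   # 3x less storage, faster search
    ```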

  • Jason Liu is the creator of Instructor, one of the world's leading LLM frameworks, particularly focused on structured output parsing with LLMs, or as Jason puts it, "making LLMs more backwards compatible". It is hard to overstate the impact of Instructor; it is truly leading us to the next era of LLM programming. It was such an honor chatting with Jason; his current experience as an independent consultant and his previous engineering work at Stitch Fix and Meta make him truly one of the most unique guests we have featured on the Weaviate podcast! I hope you enjoy the podcast!
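
    A minimal example following Instructor's documented pattern of patching the OpenAI client (the model name is a placeholder, and the exact entry point has varied across Instructor versions):

    ```python
    # Minimal example of Instructor-style structured outputs: the patched
    # client validates the response against a Pydantic model and retries
    # parsing failures.
    import instructor
    from openai import OpenAI
    from pydantic import BaseModel

    class UserInfo(BaseModel):
        name: str
        age: int

    client = instructor.from_openai(OpenAI())

    user = client.chat.completions.create(
        model="gpt-4o-mini",               # placeholder model name
        response_model=UserInfo,           # structured output contract
        messages=[{"role": "user", "content": "John Doe is 30 years old."}],
    )
    print(user.name, user.age)
    ```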

  • Hey everyone! Thank you so much for watching the 87th episode of the Weaviate Podcast! I am SUPER excited to welcome Karel D'Oosterlinck! Karel is the creator of IReRa (Infer-Retrieve-Rank)! IReRa is one of the most impressive systems that have been built for Extreme Multi-Label Classification, leveraging the emerging paradigm of DSPy compilation! This podcast dives into all things IReRa, XMC, DSPy compilation, and applications in Biomedical NLP and Recommendation! I hope you find this useful!
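
    A condensed DSPy-style sketch of the Infer-Retrieve-Rank pattern as I understand it from the paper (signatures abbreviated, retriever hypothetical, LM configuration omitted; not Karel's exact implementation):

    ```python
    import dspy

    # Condensed sketch of Infer-Retrieve-Rank (my reading of the paper;
    # assumes dspy.configure(lm=...) has been called and that `retriever`
    # maps query terms to candidate label strings).
    class InferRetrieveRank(dspy.Module):
        def __init__(self, retriever, k=50):
            super().__init__()
            self.infer = dspy.ChainOfThought("text -> query_terms")
            self.retriever = retriever
            self.rank = dspy.ChainOfThought("text, candidate_labels -> top_labels")
            self.k = k

        def forward(self, text):
            terms = self.infer(text=text).query_terms          # Infer
            candidates = self.retriever(terms, k=self.k)       # Retrieve
            return self.rank(text=text,                        # Rank
                             candidate_labels=", ".join(candidates))
    ```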

  • Hey everyone! We are super excited to publish this podcast with Vinod Valloppillil and Bob van Luijt on Open-Source AI and future directions for RAG! The podcast begins by discussing Vinod's "Halloween Documents", a series of internal strategy writings at Microsoft related to the open-source software movement! The conversation continues with the current state of Open-Source in AI. One of the major points Bob has been making about the business of AI models is that the models themselves are *stateless*, akin to an MP3 file. Vinod pushes back a bit on this definition, and they jointly settle on the idea that these models fall into neither the pure stateful nor the pure stateless bucket, but rather a "pre-baked" bucket -- presenting completely new opportunities to build businesses around software. The conversation then continues into the particular details of how people are building RAG systems and the many directions in which that may evolve!

  • Hey everyone! I am beyond excited to present our interview with Omar Khattab from Stanford University! Omar is one of the world's leading scientists on AI and NLP. I highly recommend you check out Omar's remarkable list of publications linked below! This interview completely transformed my understanding of building RAG and LLM applications! I believe that DSPy will be one of the most impactful software projects in LLM development because of the abstractions around *program optimization*. Here is my TLDR of this concept of LLM programs and program optimization with DSPy; I of course encourage you to view the podcast and listen to Omar's explanation haha.

    RAG is one of the most popular LLM programs we have seen. RAG typically consists of two components: retrieve, then generate. Within the generate component we have a prompt like "please ground your answer based on the search results {search_results}". DSPy gives us a framework to optimize this prompt, bootstrap few-shot examples, or even fine-tune the model if needed. This works by compiling the program based on some evaluation criteria we give DSPy. Now let's say we add a query re-writer that takes the query and writes a new query before sending it to the retrieval system, and a reranker that takes the search results and re-orders them before handing them to the answer generator. Now we have 4 components: query writer, retrieve, rerank, and answer. The 3 components of query writer, rerank, and answer all have a prompt that can be optimized with DSPy to enhance the description of the task or add examples! This optimization is done with DSPy's Teleprompters.

    There are a few other really interesting components to DSPy as well -- such as the formatting of prompts with the docstrings and Signature abstraction, which in my view is quite similar to Instructor or LMQL. DSPy also comes with built-in prompts like Chain-of-Thought that offer a really quick way to add this reasoning step and follow a structured output format. I am having so much fun learning about DSPy and I highly recommend you join me in viewing the GitHub repository linked below (with new examples!!).

    Omar also discusses ColBERT and late interaction retrieval! Omar describes how this achieves the contextualized attention of cross encoders, but in a much more scalable system using the maximum similarity between vectors! Stay tuned for more updates from Weaviate as we are diving into multi-vector representations to hopefully support systems like this soon!
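
    To make the four-component example concrete, here is a minimal DSPy-style sketch (the metric and training set are illustrative placeholders, and LM/retriever configuration is omitted): the query writer, reranker, and answer generator are LM modules whose prompts a teleprompter can optimize:

    ```python
    import dspy
    from dspy.teleprompt import BootstrapFewShot

    # Minimal sketch of the four-component program described above
    # (assumes dspy.settings is configured with an LM and a retrieval
    # model; metric and trainset names are illustrative placeholders).
    class RAG(dspy.Module):
        def __init__(self, k=5):
            super().__init__()
            self.rewrite = dspy.ChainOfThought("question -> search_query")
            self.retrieve = dspy.Retrieve(k=k)
            self.rerank = dspy.ChainOfThought("question, passages -> best_passages")
            self.answer = dspy.ChainOfThought("question, best_passages -> answer")

        def forward(self, question):
            query = self.rewrite(question=question).search_query
            passages = self.retrieve(query).passages
            best = self.rerank(question=question, passages=passages).best_passages
            return self.answer(question=question, best_passages=best)

    # Compiling optimizes each module's prompt / few-shot examples
    # against the metric we hand to the teleprompter.
    teleprompter = BootstrapFewShot(metric=my_answer_metric)     # placeholder metric
    compiled_rag = teleprompter.compile(RAG(), trainset=train_examples)
    ```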

    Chapters

    0:00 Weaviate at NeurIPS 2023!

    0:38 Omar Khattab

    0:57 What is the state of AI?

    2:35 DSPy

    10:37 Pipelines

    14:24 Prompt Tuning and Optimization

    18:12 Models for Specific Tasks

    21:44 LLM Compiler

    23:32 Colbert or ColBERT?

    24:02 ColBERT