Episodes

  • In the latest AI Paper Club podcast, hosts Rafael Herrera and Sonia Marques are joined by João Costa, Senior Machine Learning Software Engineer at Deeper Insights. Together, they explore the paper “Diffusion Models are Real-Time Game Engines,” produced by researchers at Google. This episode delves into how AI can replicate the iconic game Doom using Stable Diffusion, an AI model typically associated with image generation.

    The team discusses the paper’s methodology, detailing how Stable Diffusion was adapted to generate gameplay frame by frame, capturing Doom’s game logic through AI. João unpacks the technical nuances behind real-time generation at 20 frames per second on TPU hardware and explores the research’s practical applications and limitations.

    We also extend a special thank you to the Google DeepMind team for developing this month's paper. If you are interested in reading the paper yourself, please visit this link: https://gamengen.github.io.

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at [email protected].

  • In this episode of the AI Paper Club Podcast, hosts Rafael Herrera and Sonia Marques sit down with senior machine learning engineer Bernardo Ramos from Deeper Insights. Together, they explore the classic 2015 paper "Hidden Technical Debt in Machine Learning Systems". The paper highlights the often-overlooked issue of technical debt in machine learning projects and how it silently accumulates over time, much like financial debt.

    The discussion delves into the nuances of technical debt, particularly how data dependencies differ from code dependencies and why they are harder to detect. The podcast also covers unstable data signals, feedback loops, and the unique challenges faced by large language models (LLMs) in today's data-driven world. Bernardo shares potential mitigation strategies to help manage these technical debts effectively.

    A special thank you to the authors D. Sculley, G. Holt, D. Golovin, and their team for developing this month's paper. If you are interested in reading the paper yourself, please visit this link: https://dl.acm.org/doi/10.5555/2969442.2969519.

    For more information on artificial intelligence, machine learning, and engineering solutions for your business, please visit www.deeperinsights.com or contact us at [email protected].

  • This month’s episode of the AI Paper Club Podcast welcomes Dr. Diogo Ribeiro, a senior machine learning engineer at Deeper Insights. Diogo presents a research paper he co-developed, focusing on the industrial application of AI, titled "Isolation Forest and Deep Autoencoders for Industrial Screw Tightening Anomaly Detection." The podcast explores the intricacies of combining traditional machine learning models with deep learning techniques to address a critical problem in industrial manufacturing: detecting anomalies in screw tightening processes.

    The conversation highlights the importance of explainability in AI, particularly in industrial settings where safety and cost are paramount. The episode also touches on the broader implications of machine learning in AI, contrasting it with the current excitement surrounding generative AI models.

    We also extend a special thank you to Diogo and his team of researchers for developing this month's paper. If you are interested in reading the paper yourself, please visit this link: https://www.mdpi.com/2073-431X/11/4/54.

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at [email protected].

  • In the latest AI Paper Club Podcast episode, hosts Rafael Herrera and Sonia Marques welcome Dr. Catarina Carvalho, Senior Data Scientist and Computer Vision SME from Deeper Insights, to discuss "MatchTime: Towards Automatic Soccer Game Commentary Generation". This paper introduces a method for generating engaging soccer commentary from video sequences using advanced AI techniques.

    The episode explores the importance of data quality, the innovative pipeline for dataset curation, and the perceiver-like architecture ensuring temporal coherence. It also covers broader applications, such as in cooking shows or assisting the hearing impaired. Tune in to discover how AI is revolutionising sports commentary and how you can try these techniques at home.

    We also extend a special thank you to the research teams from Shanghai Jiao Tong University and Shanghai AI Laboratory for developing this month’s paper. If you are interested in reading the paper for yourself, please check this link: https://arxiv.org/abs/2406.18530

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at [email protected].

  • In this episode of the AI Paper Club Podcast, hosts Rafael Herrera and Sonia Marques are joined by Andrew Eaton, an AI Solutions Consultant from Deeper Insights, to explore Meta’s latest paper, “Chameleon: Mixed-Modal Early-Fusion Foundation Models.” This paper marks Meta’s first steps into the mixed-modal AI space, combining text, images, and other data types from the start for a more integrated understanding.

    The podcast explores how, unlike traditional models that process text and images separately before combining them, Chameleon integrates these modalities right from the beginning. This early fusion method promises enhanced performance in tasks like image captioning and interleaved text-image outputs, setting new benchmarks in the field.

    We also extend a special thank you to the research team at Meta for developing this month’s paper. If you are interested in reading the paper for yourself, please check this link: https://arxiv.org/abs/2405.09818.

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at [email protected].

  • In this episode of the AI Paper Club Podcast, hosts Rafael Herrera and Sonia Marques welcome Leticia Fernandes, a Deeper Insights Senior Data Scientist and Generative AI Ambassador. Together, they explore the groundbreaking "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" paper from Google. This paper addresses the challenge of fitting infinite context into large language models, introducing the Infini-attention method. The trio discusses how this approach works, including how it uses linear attention and employs compressive memory to store key-value pairs, enabling models to handle extensive contexts.

    We also extend a special thank you to the research team at Google for developing this month’s paper. If you are interested in reading the paper for yourself, please check this link: https://arxiv.org/pdf/2404.07143.pdf

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at [email protected].


  • This month's episode of the AI Paper Club Podcast covers AI Music! Hosts Rafael and Sonia welcome Deeper Insights Data Scientist Joan Rosello to discuss the paper "Simple and Controllable Music Generation" from Meta AI. He walks us through Meta AI's model, which not only generates audio from text prompts but also offers a novel feature: melody conditioning. This allows the creation of music that adheres to a provided melody, pushing the boundaries of AI in music generation. We also explore the technical concepts involved in creating AI music today, including the residual vector quantization used in this model.

    We also extend a special thank you to the research teams at Meta AI for developing this month’s paper. If you are interested in reading the paper for yourself, please check this link: https://arxiv.org/pdf/2306.05284.pdf

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at [email protected].

  • In this month’s episode of the AI Paper Club Podcast, hosts Rafael Herrera and Sonia Marques welcome special guest Jupiter Angulo from Google. He helps us explore the innovations behind Gemini, its applications in enterprise settings, and Google's latest advancements in multimodal models.

    Developed by Google's AI research lab, DeepMind, Gemini represents a significant leap forward, offering efficient, scalable, and cost-effective solutions for both enterprise and consumer applications. With Jupiter's insights, we get a behind-the-scenes look at the challenges and opportunities presented by AI integration across various enterprise sectors, exploring the model's versatility and Google's commitment to responsible deployment.

    We also extend a special thank you to the research teams at Google’s DeepMind for developing this month's paper. If you are interested in reading the paper for yourself, please check this link: https://arxiv.org/abs/2312.11805.

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at [email protected].

  • In this episode of the AI Paper Club Podcast, hosts Rafael Herrera and Marcia Oliveira, alongside special guest Matt Kidd, Senior Data Scientist, delve into the thought-provoking realm of generative AI and copyright law. They explore the paper "Talkin’ ‘Bout AI Generation: Copyright and the Generative-AI Supply Chain" from Cornell University, which bridges computer science and law to unravel the complex interplay between cutting-edge AI technologies and legal frameworks.

    This discussion not only illuminates the challenges posed by AI-generated content in the legal domain but also sparks a crucial conversation about the future of intellectual property rights in the digital age. The episode deeply explores how generative AI is reshaping our understanding of creativity, authorship, and copyright in a rapidly evolving technological landscape.

    We also send a huge thank you to the team at Cornell University for developing this month’s paper. If you are interested in reading the paper for yourself, please check this link: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4523551

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at [email protected].


  • Join us for a unique episode of the AI Paper Club Podcast, where your hosts Sonia Marques and Rafael Herrera, together with AI experts Dr. Catarina Carvalho and Leticia Fernandes from Deeper Insights, explore the landmark advancements in AI over the past year. This episode takes a retrospective look at 2023's most influential AI papers and technologies, discussing their practical applications and future implications.

    Spanning from Meta's innovative 'Segment Anything' technology to the advanced retrieval-augmented generation methods, stable diffusion inpainting, and the versatile ChatGPT in natural language processing, our guests offer profound insights. They delve into how these groundbreaking innovations are reshaping various industries and transforming everyday life. As a bonus, the group also discusses AI technologies they are looking forward to in the coming year.

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at [email protected].

  • In this month’s episode of the Paper Club Podcast, host Rafael Herrera is joined by special guest co-host Dr. Tom Heseltine, Deeper Insights’ Chief Technical Officer. On this special show, we talk to Professor Simon Prince, an authority in deep learning who literally wrote a book on the subject. We discuss his newly released book “Understanding Deep Learning” while delving into the complex world of deep learning, exploring its mechanisms, the phenomenon of double descent, and the balancing act of network depth and capacity. The discussion also touches on AI ethics, the impact of AI advancements, and the responsibilities of engineers in this rapidly evolving field.

    We're grateful to have had Simon Prince on our podcast. To discover more about his work and insights, visit his platforms:

    Purchase 'Understanding Deep Learning' by Simon Prince wherever books are sold:
    https://mitpress.mit.edu/9780262048644/understanding-deep-learning/

    For student and instructor resources, including Python Notebooks, visit his website:
    www.udlbook.com

    Follow him on X:
    @SimonPrinceAI

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at [email protected].

  • Join us on the Paper Club Podcast, where our hosts, Rafael Herrera and Marcia Oliveira, delve into the cutting-edge world of data science. This episode features Sonia Marques, a seasoned data scientist and Generative AI Ambassador from Deeper Insights, as they explore the paper "Vision Transformers Need Registers" from the FAIR team at Meta and the INRIA research group in France.

    The podcast examines the intricacies of vision transformers, which bring the transformer architecture traditionally used in natural language processing to computer vision. The discussion illuminates the paper's analysis of how these transformers handle complex visual data, revealing some of the processes that occur in the AI black box.

    We also extend a special thank you to the research teams at FAIR (Meta) and INRIA for developing this month’s paper. If you are interested in reading the paper for yourself, please check this link: https://arxiv.org/pdf/2309.16588.pdf.

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at [email protected].

  • In this month's episode of the Paper Club Podcast, your hosts Rafael Herrera and Sónia Marques welcome a special guest, Joana Rocha, a PhD candidate at the University of Porto's Engineering School. Joana joins us to discuss her seminal paper, 'Attention Driven Spatial Transformer Network for Abnormality Detection in Chest X-ray Images,' for which she is the lead author.

    The paper introduces a groundbreaking approach to computer-aided diagnosis in the medical field, specifically focusing on chest X-ray images. Unlike traditional methods that often require two separate models—one for selecting the thoracic region and another for the actual classification of abnormalities—the paper presents an end-to-end architecture that simplifies this process.

    During the podcast, we explore the challenges and necessities of implementing AI models in healthcare. Joana's model addresses these by not only improving diagnostic performance but also offering insights into what the model is focusing on. This is crucial in a clinical setting, where understanding the model's decision-making process can be a matter of life and death. The model's ability to focus on the relevant anatomical features ensures that it gains the trust of medical professionals, a critical factor for its effective implementation in clinical practice.

    We thank Joana and her co-authors for developing this month’s paper. If you are interested in reading the paper for yourself, please check this link: https://ieeexplore.ieee.org/abstract/document/9867115

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at [email protected].

  • On this month's episode of the Paper Club Podcast, hosts Rafael Herrera and Marcia Oliveira welcome Joan Rossello, data scientist at Deeper Insights. The focus of the discussion is the paper "ImageBind: One Embedding Space To Bind Them All", published by the MetaAI Research team, which introduces a new approach to multimodal representation learning. ImageBind is the first AI model capable of binding data from six modalities at once, without the need for explicit supervision, and is part of Meta’s efforts to create multimodal AI systems that learn from all possible types of data around them.

    The paper presents a methodology for learning a unified embedding across various data modalities, such as images, text, audio, depth, thermal, and IMU data. The podcast discusses the challenges of conventional multimodal representation learning approaches, and how ImageBind was able to overcome those challenges by leveraging the binding property of images. The approach reduces the need for large, cumbersome datasets, where all combinations of data modalities are present together, thus making it a transformative tool in the realm of artificial intelligence.

    We also send a huge thank you to the team at MetaAI Research for developing this month’s paper. If you are interested in reading the paper for yourself, please check this link: https://arxiv.org/pdf/2305.05665.pdf

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at [email protected].

  • Welcome to the latest episode of the Paper Club Podcast! This episode, hosts Rafael and Marcia are joined by the insightful Leticia Fernandes, a senior data scientist and Generative AI Ambassador from Deeper Insights. Together, they dive deep into the intricacies of retrieval-augmented generation methods and the challenges AI faces with hallucinations. This month’s academic paper is "Active Retrieval Augmented Generation" by the brilliant minds at Sea AI Lab, Carnegie Mellon University and MetaAI Research. This paper introduces the Forward-Looking Active Retrieval Augmented Generation (FLARE) method, a revolutionary approach that enhances the responses of large language models by iteratively using their predictions to retrieve relevant documents. Listen in as the trio discusses the groundbreaking potential of this research, from enhancing business applications to redefining the boundaries of generative AI technologies.

    We also extend a special thank you to the teams at Sea AI Lab, MetaAI Research and Carnegie Mellon University for developing this month’s paper. If you are interested in reading the paper for yourself, please check this link: https://arxiv.org/pdf/2305.06983.pdf

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at [email protected].

  • Welcome to the latest episode of the Paper Club Podcast. In this episode, hosts Rafael and Marcia, joined by special guest Dr. Catarina Carvalho, delve into the merging of neuroscience and AI. They discuss the academic paper "High-Resolution Image Reconstruction with Latent Diffusion Models from Human Brain Activity" by Takagi and Nishimoto from Osaka University, Japan. The paper explores a novel visual reconstruction method using latent diffusion models, which can reconstruct high-resolution images from human brain activity as measured by fMRI. This method does not require any training or fine-tuning of complex deep learning models, making it a significant advancement in the field. Join us as we explore the implications of this research and its potential applications in various fields, including understanding brain functions, generative AI, and even some future versions of mind reading.

    We also extend a special thank you to Takagi and Nishimoto from Osaka University for their invaluable contributions to this month's paper. If you are interested in reading the paper for yourself, please check this link: https://www.biorxiv.org/content/10.1101/2022.11.18.517004v2.full.pdf.

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at [email protected].

  • Deeper Insights' lead AI solutions consultant, Cláudio Sá, joins hosts Rafael and Marcia in this episode of The Paper Club Podcast. The focus of this month’s episode is an academic paper from Microsoft's research team, titled 'LoRA: Low-Rank Adaptation of Large Language Models'. We explore LoRA in depth, discussing the numerous benefits of fine-tuning, the significant computational savings it offers, and its profound impact on model accuracy. Join us as we unravel the intricacies of this groundbreaking technique and explore its potential to reshape the future of AI.

    We also extend a special thank you to the Microsoft research team for their invaluable contributions to this month's paper. If you are interested in reading the paper for yourself, please check this link: https://arxiv.org/pdf/2106.09685.pdf

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at [email protected].

  • In this episode of The Paper Club Podcast, hosts Rafael and Marcia are joined by returning guest Sonia Marques, a generative AI ambassador at Deeper Insights AI Consultancy, to discuss the academic paper "Segment Anything" by Meta AI Research. The paper introduces a new task in the AI field, known as "promptable segmentation". The contributions of the paper are threefold: it outlines the proposed task, details the model architecture, and presents experimental results. The novelty of the model lies in its ability to perform segmentation tasks without requiring extensive, carefully annotated datasets, showcasing its zero-shot transfer ability.

    We extend a special thank you to Meta AI Research for their contributions to this month's paper. Don't miss out on this enlightening conversation about this new AI tool. If you are interested in reading the paper for yourself, please check this link: https://arxiv.org/pdf/2304.02643.pdf

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at [email protected].

  • Join Rafael and Marcia as they welcome Dr. Catarina Carvalho, a Senior Data Scientist specializing in Computer Vision, and Sónia Marques, a Data Scientist and Generative AI ambassador, both from Deeper Insights. Together, they will delve into the fascinating world of Stable Diffusion and generative image creation. In this episode, they discuss the 2022 paper "High-Resolution Image Synthesis with Latent Diffusion Models," which examines how latent diffusion models (LDMs) enhance image synthesis by using diffusion models in the latent space of pre-trained autoencoders, improving visual fidelity, reducing computational demands, and achieving high performance across multiple tasks.

    A special thank you to Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer for their contributions to this month's paper. Don't miss out on this enlightening conversation about generative image creation. If you are interested in reading the paper for yourself, please check this link: https://arxiv.org/pdf/2112.10752.pdf

    For more information on all things artificial intelligence, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at [email protected].

  • Join Rafael and Marcia as they welcome Matt Kidd, Senior Data Scientist (NLP) from Deeper Insights, for a discussion on InstructGPT, the predecessor of ChatGPT. The main discussion revolves around the impact of alignment techniques, namely Reinforcement Learning from Human Feedback (RLHF), on the usefulness and widespread use of Large Language Models (LLMs). The conversation centres on the paper "Training language models to follow instructions with human feedback" (2022), authored by the OpenAI team. They cover topics like alignment with human intentions, RLHF, and the finer points of what makes this a seminal paper for the generative AI community. If you are interested in reading the paper and following along, please click this link: https://arxiv.org/pdf/2203.02155.pdf

    For more information on all things artificial intelligence, generative AI, machine learning, and engineering for your business, please visit www.deeperinsights.com or reach out to us at [email protected].