Episodes
-
In this episode of Data in Biotech, Ross Katz sits down with Dave Johnson, CEO and co-founder of Dash Bio, a next-gen drug development services company with a mission to revolutionize clinical bioanalysis and streamline drug development.
Dave begins the episode by taking us back to his early research days at Moderna, where he helped lay the groundwork for mRNA technology, which later enabled the development of a COVID-19 vaccine at unprecedented speed. As he explains, this automation work and these pre-built systems ultimately played a central role in responding to urgent health challenges.
He also shares his firsthand experience of working in a rapidly scaling pharma company, discussing the challenges that arose along the way and the lessons he learned in overcoming them.
Dave then highlights the most significant shortcomings in drug development, particularly the lack of industrialization and standardization. He explains how Dash Bio aims to address these issues, focusing on clinical bioanalysis now and expanding to broader standardization later. The ultimate goal is to develop a more efficient, high-quality end-to-end system and improve the overall effectiveness of the drug development process.
Finally, Dave and Ross discuss the misconceptions surrounding lab automation and emphasize the need for a shift in perspective within the drug development space. They also touch upon Dave’s vision for the future of Dash Bio, plus his advice for aspiring biotech data leaders eager to contribute to industry transformation.
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in the life sciences.
Chapter Markers:
[1:38] Introduction to Dave Johnson and his career journey from Moderna to founding a next-gen drug development company
[2:57] Establishing the groundwork for mRNA technology at Moderna
[4:36] The challenges of scaling up COVID-19 vaccine development
[7:55] How rapid company growth impacts organizational structure and engagement models
[11:03] The role of AI, automation, and machine learning in drug development
[12:45] Addressing the most significant shortcomings in drug development and potential solutions
[16:31] The need for standardization and automation in drug development
[18:04] Current focus of Dash Bio on clinical bioanalysis
[19:37] The misconceptions surrounding lab automation and the need for a shift in perspective within the drug development space
[22:33] Dave’s vision for the future of Dash Bio and streamlining drug development
[25:16] The current state of lab automation
[27:41] The role of experimentation in Dash Bio's approach
[29:47] Advice for aspiring data scientists and leaders in the biotech sector
Useful Links
Dave Johnson LinkedIn
Dash Bio Website
-
This week, Chitrang Dave, Global Head of Enterprise Data & Analytics at Edwards Lifesciences, joins us to discuss the transformative power of real-time data, AI, and collaboration in medical device manufacturing and support.
He and host Ross Katz dive into how real-time data from IoT devices is reshaping quality assurance in medtech and what the future holds for the sector as big tech players like Apple and Meta enter the healthcare arena.
Together, they discuss everything from AI-powered patient identification to the integration of consumer wearables with FDA-approved medical devices. Tune in to hear how collaboration, innovation, and cutting-edge technology are improving patient outcomes and revolutionizing healthcare.
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in the life sciences.
Chapter Markers
[01:36] Chitrang shares the experience that led him to work at leading data and analytics organizations and what work there is to be done
[04:09] Chitrang highlights the role of IoT devices in medical device manufacturing, where real-time data can drive automation and improve quality assurance
[06:25] What is driving innovation right now in research and development, and how companies like Apple are disrupting the medical device space
[09:23] Chitrang talks about how device connectivity and users' expectations of an intuitive interface are driving a shift toward more real-time medical device technology
[11:47] The importance of keeping patient data private between the patient and the practitioner while using anonymized data to create solutions and identify patterns in health
[13:25] Using data to create a complete picture of the patient in order to make their life easier
[14:20] Chitrang discusses the challenge of manufacturing medical devices when there are issues with raw materials
[16:30] Chitrang discusses the potential for automation using real-time data in manufacturing
[19:17] Ross and Chitrang discuss the value of having comprehensive data to personalize treatments and ensure timely responses, especially in scenarios such as Alzheimer’s, where early detection could save trillions of dollars
[21:27] Chitrang mentions significant collaborations, such as the Cancer AI Alliance, where tech giants like AWS, Microsoft, NVIDIA, and Deloitte are working together to address critical problems in healthcare
[27:10] How real-time data from medical devices could improve patient outcomes, stakeholder coordination and future trends
[28:29] Closing thoughts and where to find Chitrang Dave online
Download CorrDyn’s latest white paper on “Using Machine Learning to Implement Mid-Manufacture Quality Control in the Biotech Sector.”
Find the white paper online at: https://connect.corrdyn.com/biotech-ml
-
This week on Data in Biotech, we’re joined by Martin Permin, the co-founder of Invert, a company that builds software that automates bioprocessing.
Martin talks us through his own unique journey into biotech - starting from a role at Airbnb - through to co-founding Invert. Invert helps users grab data from their instruments, map out their individual processes, clean up the data for analysis, and look for ways to speed up the “mundane” data cleaning tasks that often take up the majority of one’s time.
With our host, Ross Katz, Martin tells us the statistical problems Invert works to solve for their different types of clients: biologic development labs, full-scale manufacturers, and CDMOs. While they all approach data cleaning and analysis from different directions, Invert can see how clients use the system and look for ways to automate repeated processes to help them save time.
They discuss implementing Invert into the Design, Build, Test, Learn Loop and why Invert is invested in reducing how many times one has to go around that loop. Martin explains how his company looks to reduce the risk in tech transfer in both directions, in terms of time and labor.
Then, the conversation moves to ML/AI, where Martin tells us how a lot of his customers are finding that the bottlenecks in their processes aren’t where they thought they were, thanks to using Invert for process automation.
Finally, Martin gives us his opinions on the future trends around the corner for the biotech industry - and how Invert is preparing itself and its customers.
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in the life sciences.
Chapter Markers
[1:29] Introduction to Martin and his journey into biotech
[4:10] Introduction to Invert - the what and why
[6:47] How Invert is implemented into a customer’s workflow
[11:36] The problems Invert can solve
[16:16] Design > build > test > learn… and how Invert facilitates that
[20:00] CDMOs and contractors - how Invert works with their different customers
[22:15] The use of ML/AI in bio-processing
[33:40] Trends in Biotech that will influence Invert over the long-term
-
This week on Data in Biotech, we’re joined by Mo Jain, the Founder and CEO of Sapient, a biomarker discovery organization that enables biopharma sponsors to go beyond the genome to accelerate precision drug development.
Mo talks us through his personal journey into the world of science, from school to working in academia to founding his business, Sapient.
He explains how and why Sapient first started and the evolution of the high-throughput mass-spectrometry service it provides to the biopharmaceutical sector.
Together with our host Ross, Mo explores the technology that lets scientists examine a person's medical history like never before via metabolome, lipidome, and proteome analysis.
They look at how the technology developed to allow data testing to go from running twenty tests per blood sample to twenty thousand, and how Sapient has built such a strong reputation in biopharmaceuticals for large-scale data projects.
They discuss Sapient’s process when working with clients on genome projects. We learn about Sapient’s relationship with its clients, how it understands the targets and aims of each project, why it puts so much importance on proprietary database management and quality control, and Sapient’s three pillars for high-quality data discovery.
Finally, Mo takes the opportunity to give us his insights on the future of biomarker discovery and mass-spectrometry technology - and how AI and Machine Learning are leading to enhanced data quality and quantity.
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in the life sciences.
Chapter Markers
[1:33] Introduction to Mo Jain, his journey, genomics, and Sapient’s use of genomics data to accelerate medicine and drug development
[6:50] The types of data generated at Sapient via metabolome, lipidome & proteome, and why that data is generated
[12:30] How Sapient generates this data at scale, via specialist mass-spectrometry technology
[14:48] The problems Sapient can solve for pharma and biotech companies with this data
[21:03] Sapient as a service company: the questions they’re asked by pharmaceutical businesses, why they come to Sapient, and Sapient’s process for answering those questions.
[26:23] The computational frameworks and data handling side of things, and how the team interacts with the client
[29:59] Proprietary database development and quality control
[35:27] The future of biomarker discovery and mass-spectrometry technology, and how AI and Machine Learning are leading the way at Sapient
-
This week on Data in Biotech, we are joined by Parul Bordia Doshi, Chief Data Officer at Cellarity, a company that is leveraging data science to challenge traditional approaches to drug discovery.
Parul kicks off the conversation by explaining Cellarity’s mission and how it is using generative AI and single-cell multiomics to design therapies that target the entire cellular system, rather than focusing on single molecular targets.
She gives insight into the functionality of Cellarity Maps, the company’s cutting-edge visualization tool that maps the progression of disease states and bridges the gap between biologists and computational scientists.
Along with host Ross Katz, Parul walks through some of the big challenges facing Chief Data Officers, particularly for biotech organizations with data-centric propositions.
She emphasizes the importance of robust data frameworks for validating and standardizing complex data sets, and looks at some of the practical approaches that ensure data scientists can derive the maximum amount of value from all available data.
They discuss what data science teams look like within Cellarity, including the unique way the company incorporates human intervention into its processes.
Parul also emphasizes the benefits that come through hiring multilingual, multidisciplinary teams and putting a strong focus on collaboration.
Finally, we get Parul’s take on the future of data science for drug discovery, plus a look at Cellarity’s ongoing collaboration with Novo Nordisk on the development of novel therapeutics.
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in the life sciences.
Chapter Markers
[1:45] Introduction to Parul, her career journey, and Cellarity’s approach to drug discovery.
[5:47] The life cycle of data at Cellarity from collection to how it is used by the organization.
[7:45] How the Cellarity Maps visualization tool is used to show the progression of disease states
[9:05] The role of a Chief Data Officer in aligning an organization’s data strategy with its company mission.
[11:46] The benefits of collaboration and multidisciplinary, cross-functional teams to drive innovation.
[14:53] Cellarity's end-to-end discovery process, including how it uses generative AI, contrastive learning techniques, and visualization tools.
[19:42] The role of humans vs the role of machines in scientific processes.
[23:05] Developing and validating models, including goal setting, benchmarking, and the need for collaboration between data teams and ML scientists.
[30:58] Generating and managing massive amounts of data, ensuring quality, and maximizing the value extracted.
[37:08] The future of data science for drug discovery, including Cellarity’s collaboration with Novo Nordisk to discover and develop a novel treatment for MASH.
-
This week on Data in Biotech, Ryan Mork, Director of Data Science at Evozyne, joins host Ross Katz to discuss how data science and machine learning are being used in protein engineering and drug discovery.
Ryan explains how Evozyne is utilizing large language models (LLMs) and generative AI (GenAI) to design new biomolecules, training the models with huge volumes of protein and biology data. He walks through the organization’s evolution-based design approach and how it leverages the evolutionary history of protein families.
Ross and Ryan dig into the different models being used by Evozyne, including latent variable models and embeddings. They also discuss some of the challenges around testing the functionality of models and the approaches that can be used for evaluation.
Alongside the deep dive into data and modeling topics, Ryan also discusses the importance of relationships between the wet lab and data science teams. He emphasizes the need for mutual understanding of each role to ensure the entire organization pulls together towards the same goals.
Finally, Ross asks Ryan to opine on the future of GenAI and LLMs for biotechnology and how this area will develop over the next five years. He also finds out more about the R&D roadmap at Evozyne and its plans to play a part in moving GenAI for protein engineering forward.
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in the life sciences.
Chapter Markers
[1:24] Introduction to Ryan, his career to date, and the focus of Evozyne.
[2:59] How the Evozyne data science team operates and the data sources it utilizes.
[4:22] Building models to develop synthetic proteins for therapeutic uses.
[9:10] Deciding which proteins to take into the lab for experimental validation.
[10:49] Taking an evolution-based design approach to protein engineering.
[14:34] Using latent variable models and embeddings to capture evolutionary relationships.
[18:01] Evaluating the functionality of generative models and the role of auxiliary models.
[24:24] The value of tight coupling and mutual understanding between wet lab and data science teams.
[28:07] Evozyne’s approach to developing and testing new data science tools, models, and technologies.
[31:35] Predictions for future developments in Generative AI for biotechnology.
[33:41] Evozyne’s goal to increase throughput and its planned approach.
[39:09] Where to connect with Ryan and keep up to date with news from Evozyne.
-
This week on Data in Biotech, Ross is joined by Jonathan Eads, VP of Engineering at genomics intelligence company Genomenon, to discuss how his work supports the company’s mission to make genomic evidence actionable.
Jonathan explains his current role leading the teams focused on clinical engineering, curation engineering, platform development and overseeing Genomenon’s data science and AI efforts.
He gives insight into how Genomenon’s software works to scan genomics literature and index genetic variants, providing critical evidence-based guidance for those working across biotech, pharmaceutical, and medical disciplines.
Jonathan outlines the issues with inconsistent genetic data, variant nomenclature and extracting genetic variants from unstructured text, before explaining how human curators are essential to ensure accuracy of output.
Jonathan and Ross discuss the opportunities and limitations that come with using AI and natural language processing (NLP) techniques for genetic variant analysis.
Jonathan lays out the process of developing robust validation datasets and fine-tuning AI models to handle issues like syntax anomalies and outlines the need to balance the short-term need for data quality with the long-term goal of advancing the platform’s AI and automation capabilities.
We hear notable success stories of how Genomenon’s platform is being used to accelerate variant interpretation, disease diagnosis, and precision medicine development.
Finally, Ross gets Jonathan’s take on the future of genomics intelligence, including the potential of end-to-end linkage of information from variants all the way out to patient populations.
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in the life sciences.
Chapter Markers
[1:50] Introduction to Jonathan and his academic and career background.
[5:14] What Genomenon’s mission to ‘make genomic evidence actionable’ looks like in practice.
[14:48] The limitations of how scientists and doctors have historically been able to use literature to understand genetic variants.
[16:08] Challenges with nomenclature and indexing and how these affect access to information.
[18:11] Extracting genetic variants from scientific publications into a structured, searchable index.
[22:04] Using a combination of software processes and human curation for accurate research outputs.
[24:57] Building high-functionality, complex, and accurate software processes to analyze genomic literature.
[29:45] Dealing with the challenges of AI and the role of human curators for the accuracy of genetic variant classification.
[34:37] Managing the trade-off between short-term needs for improved data and long-term goals for automation and AI development.
[38:39] Success stories using the Genomenon platform including making an FDA case and diagnosing rare disease.
[41:55] Predictions for future advancements in literature search for genetic variant analysis.
[43:21] The potential impact of Genomenon’s acquisition of The Jackson Laboratory's (JAX) Clinical Knowledgebase.
-
This week on Data in Biotech, we’re joined by Amy Gamber, VP of Manufacturing at Atara Biotherapeutics, an allogeneic T-cell immunotherapy company developing off-the-shelf treatments to help achieve faster patient outcomes.
Atara's treatment sits at the cutting edge of options available for cancer and autoimmune disease, and host Ross Katz explores how the company is able to deliver personalized medicine that can be with the patient inside a three-day window.
Amy is clear-eyed about what works well in this field and what doesn’t. We gain insight into the complexity of developing this type of cell therapy and the subsequent production challenges of manufacturing at scale. We also cover the manufacturing process, corresponding data problems Amy encounters on a day-to-day basis in her role as VP of Manufacturing, and the strategies she employs to overcome them.
Amy discusses the importance of continuous quality monitoring and the need to introduce it from an early stage to see how a program changes through the development phases. She highlights the importance of data as a tool for the ‘detective work’ needed to understand where problems arise during manufacturing.
Finally, she and Ross close the episode by discussing the future of cell therapy manufacturing, a world where modeling enables predictive QC, the possibilities of AI, and the need to standardize data.
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in the life sciences.
Chapter Markers
[1:56] Introduction to Amy and the manufacturing process at Atara, including the importance of cryo storage to facilitate faster patient treatment.
[6:37] Amy and Ross discuss the challenge of donor variability in cell therapy manufacturing and how to manage it.
[12:38] Ross asks about scaling cell therapy production and the different considerations for small batch versus commercial-scale manufacturing.
[15:47] Amy discusses the importance of continuous quality monitoring, highlighting the value of tracking metrics to ensure quality control and identify improvement opportunities
[18:46] Ross moves the focus to automating data collection, as he and Amy emphasize the need for more efficient data access and analysis for timely decision-making.
[20:50] Ross and Amy explore the data challenges biotechnology companies face, including the problem with manual data processes, creating feedback loops, and regulatory compliance.
[25:16] Amy explains how Atara addressed manufacturing efficiency challenges, the importance of ‘detective work’ to understand problem causes, and the process of solving them.
[33:28] Ross and Amy examine how to use data to gather meaningful manufacturing insights, particularly identifying true signals when analyzing small datasets.
[36:33] Ross talks about predictive QC measures as the solution to the point Amy makes about being able to guarantee product quality from the outset.
[37:31] Amy gives her perspective on the future of biotech manufacturing, the role of AI, predictive modeling, and the need for standardization in the industry.
Download our latest white paper on “Using Machine Learning to Implement Mid-Manufacture Quality Control in the Biotech Sector.”
Visit this link: https://connect.corrdyn.com/biotech-ml
-
It's a holiday week here in the U.S., and the CorrDyn team is currently getting together for our annual company retreat. So, we decided to do something a little different this week.
We're re-releasing abridged versions of two of our most popular episodes.
These are dedicated to a critical workflow at the intersection of data and scientific research - Design of Experiments.
Across these interviews, we brought together two leading experts to give you a comprehensive overview of how best-practice DOE works in the biotech industry - Wolfgang Halter from Merck Life Sciences and Markus Gershater from Synthace.
-
This week on Data in Biotech, we’re delighted to be joined by Guru and Satya Singh, co-founders of SciSpot, a company focused on transforming biotech companies through smarter embodiment of biological processes in data/software and acceleration of the R&D process.
They discuss how their respective biotech and data backgrounds led them to develop the platform and their very personal motivation behind their mission to enable data to accelerate life science research.
Guru and Satya explore the concept of giving biotech companies a “digital brain” that uses AI to learn from every experiment. They emphasize how this requires modern software principles like being API-first and data-centric.
Based on their work helping their biotech customers move towards this model, Guru and Satya discuss overcoming some of the biggest adoption challenges – instilling data competence, moving to standardized data models, and bridging the gap between wet lab scientists and computational experts.
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in the life sciences.
Chapter Markers
[1:38] Guru and Satya both give a brief overview of their respective backgrounds and the industry challenges that led them to launch SciSpot.
[3:35] Guru discusses the challenges of bringing organic and inorganic intelligence together and introduces the concept of a “digital brain.”
[6:50] Ross asks about the components of the SciSpot platform and how it works for companies using it.
[9:42] Guru and Satya emphasize the challenge of educating scientists on the advantages of adopting an API-first, data science-focused system.
[12:09] Guru and Satya highlight the ‘a-ha’ moments for customers using the platform, which include standardizing data models and connecting all instruments into SciSpot.
[14:27] Satya discusses knowledge graphs and how the system enables both implicit tagging and human input to enrich the data for data science purposes.
[17:52] The discussion covers the need for flexible workflows in biotech and how SciSpot changes the way its customers think about data science workflows.
[22:44] Guru shares his views on the future of biotech companies and underlines the importance of standardized data models.
[25:59] The discussion covers the challenges of integrating biotech-specific systems into an API-first platform and the current gaps in data capabilities.
[29:45] Ross highlights the importance of a unified platform for the range of biotech personas to drive AI faster.
[31:32] Guru and Satya revisit their vision of biotech organizations with a “digital brain” and real-time, established feedback loops that will make them smarter.
[34:46] Guru and Satya share advice for biotech organizations, focusing on how they should think about data and tooling.
Download our latest white paper on “Using Machine Learning to Implement Mid-Manufacture Quality Control in the Biotech Sector.”
Visit this link: https://connect.corrdyn.com/biotech-ml
-
This week, Ross sits down with Mike Nally, CEO at Generate:Biomedicines, a pioneer in generative biology that is transforming the way medicines are developed. Mike joined the Data in Biotech podcast to discuss the AI-driven drug development landscape and how data is set to change the way every drug is made in the future.
Mike shares his journey to Generate:Biomedicines, motivated by the ambition to improve productivity and democratize the availability of drugs.
He discusses the latest in drug development trends, from how the availability of data accelerates what is possible to breakthroughs in de novo generation that allow proteins to be developed with unprecedented specificity. He shares how Generate innovates at each phase of AI-driven drug development and provides insight into Chroma, an open-source diffusion model, explaining how it allows scientists to push the boundaries of protein discovery.
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in the life sciences.
Chapter Markers
[1:21] Mike gives a quick rundown of his background and the route to his current role as CEO at Generate:Biomedicines.
[4:03] Mike discusses the changes in the availability of data to advance biotechnology.
[6:37] Mike explains the process of designing new proteins and where AI fits into this.
[11:12] Mike introduces Chroma, an open-source diffusion model from Generate:Biomedicines, and explains how it allows scientists to expand the natural universe of proteins.
[16:12] Ross and Mike discuss the challenge of combining biology and computer training.
[18:09] Mike gives his view on the current status of machine learning's role in biotech R&D and how this will evolve.
[21:05] Mike emphasizes the importance of human attention in AI-driven drug discovery and outlines how technological advancements require workflow innovation.
[26:13] Mike highlights teamwork, company culture, and ambition as key differentiators for Generate:Biomedicines.
[28:05] Ross asks Mike his perspective on skepticism around AI-discovered drugs
[30:25] Mike shares updates on the two leading candidates coming out of Generate:Biomedicines.
-
This week, Nathan Clark, CEO at Ganymede, joins the Data in Biotech podcast to discuss the challenges of integrating lab instruments and data in the biotech industry and how Ganymede’s developer platform is helping to automate data integration and metadata management across Life Sciences.
Nathan sits down with Data in Biotech host Ross Katz to discuss the multiple factors that add to the complexity of handling lab data, from the evolutionary nature of biology to the lab instruments being used. Nathan explains the importance of collecting metadata as unique identifiers that are essential to facilitating automation and data workflows.
As the founder of Ganymede, Nathan outlines the fundamentals of the developer platform and how it has been built to deal practically with the data, workflow, and automation challenges unique to life sciences organizations. He explains the need for code to allow organizations to contextualize and consume data and how the platform is built to enable flexible last-mile integration. He also emphasizes Ganymede's vision to create tools at varying levels of the stack to glue together systems in whatever way is optimal for its specific ecosystem.
As well as giving an in-depth overview of how the Ganymede platform works, he also digs into some of the key challenges facing life sciences organizations as they undergo digital transformation journeys.
The need to engage with metadata from the outset to avoid issues down the line, how to rid organizations of secret Excel files and improve data collection, and the regulatory risks that come with poor metadata handling are all covered in this week’s episode.
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in the life sciences.
Chapter Markers
[1:28] Nathan gives a quick overview of his background and the path that led him to launch Ganymede.
[5:43] Nathan gives us his perspective on where the complexity of life sciences data comes from.
[8:23] Nathan explains the importance of using code to cope with the high levels of complexity and how the Ganymede developer platform facilitates this.
[11:26] Nathan summarizes the three layers in the Ganymede platform: the ‘core platform’, ‘connectors’ or templates, and ‘transforms’, which allow data to be utilized.
[13:18] Nathan highlights the importance of associating lab data with a unique ID to facilitate data entry and automation.
[15:05] Nathan outlines the drawbacks of manual data association: it is inefficient, unreliable, and difficult to maintain.
[16:43] Nathan explains what using Ganymede to manage data and metadata looks like from inside a company.
[24:50] Ross asks Nathan to describe how Ganymede assists with workflow automation and how it can overcome organization-specific challenges.
[27:42] Nathan highlights the challenges businesses are looking to solve when they turn to a solution like Ganymede, pointing to three common scenarios.
[34:32] Nathan emphasizes the importance of laying the groundwork for a data future at an early stage.
[37:49] Nathan and Ross stress the need for a digital transformation roadmap, with smaller initiatives on the way demonstrating value in their own right.
[40:35] Nathan talks about the future for Ganymede and what is on the horizon for the company and their customers.
Download our latest white paper on “Using Machine Learning to Implement Mid-Manufacture Quality Control in the Biotech Sector.”
Visit this link: https://connect.corrdyn.com/biotech-ml
-
This week, Harshil Patel, Director of Scientific Development at Seqera, joins the Data in Biotech podcast to discuss the importance of collaborative, open-source projects in scientific research and how they support the need for reproducibility.
Harshil lifts the lid on how Nextflow has become a leading open-source workflow management tool for scientists and the benefits of using an open-source model. He talks in detail about the development of Nextflow and the wider Seqera ecosystem, the vision behind it, and the advantages and challenges of this approach to tooling.
He discusses how the nf-core community collaboratively develops and maintains over 100 pipelines using Nextflow and how the decision to constrain pipelines to one per analysis type promotes collaboration and consistency and avoids turning pipelines into the “wild west.”
We also look more practically at Nextflow adoption as Harshil delves into some of the challenges and how to overcome them.
He explores the wider Seqera ecosystem and how it helps users manage pipelines, analysis, and cloud infrastructure more efficiently, and he looks ahead to the future evolution of scientific research.
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in the life sciences.
---
Chapter Markers
[1:23] Harshil shares a quick overview of his background in bioinformatics and his route to joining Seqera.
[3:37] Harshil gives an introduction to Nextflow, including its origins, development, and the benefits of using the platform for scientists.
[9:50] Harshil expands on some of the off-the-shelf process pipelines available through nf-core and how this is continuing to expand beyond genomics.
[12:08] Harshil explains nf-core’s open-source model, the advantages of constraining pipelines to one per analysis type, and how the Nextflow community works.
[17:43] Harshil talks about Nextflow's custom DSL and the advantages it offers users.
[20:23] Harshil explains how Nextflow fits into the broader Seqera ecosystem.
[26:08] Ross asks Harshil about overcoming some of the challenges that arise with parallelization and optimizing pipelines.
[28:01] Harshil talks about the features of Wave, Seqera’s containerization solution.
[32:16] Ross asks Harshil to share some of the most complex and impressive things he has seen done within the Seqera ecosystem.
[35:42] Harshil gives his take on how he sees biotech genomics research evolving in the future.
-
This week, we are pleased to have Stewart Fossceco, Head of Non-Clinical and Diagnostics Statistics at Zoetis and an expert in pharmaceutical manufacturing, join us on the Data in Biotech podcast.
We sat down with Stewart to discuss implementing and improving Quality Assurance (QA) processes at every stage of biotech manufacturing, from optimizing assay design and minimizing variability in early drug development to scaling this up when moving to full production. Stewart draws on his experience to discuss the importance of experimental design, understanding variability data to inform business decisions, and the pitfalls of over-measuring.
Along with host Ross Katz, Stewart discusses the value of statistical simulations in mapping out processes, identifying sources of variability, and what this looks like in practice. They also explore the importance of drug stability modeling and how to approach it to ensure product quality beyond the manufacturing process.
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in the life sciences.
---
Chapter Markers
[1:39] Stewart starts by giving an overview of his career in biotech manufacturing.
[3:54] Stewart talks about optimizing processes to control product quality in the early stages of the drug development process.
[7:27] Ross asks Stewart to speak more about how to optimize and minimize the variability of assays to increase confidence in clinical results.
[12:11] Stewart explains the importance of understanding how assay variability influences results and how to handle this when making business decisions.
[14:13] Ross and Stewart discuss the issue of assay variability in relation to regulatory scrutiny.
[17:07] Stewart walks through the benefits of using statistical simulation tools to better understand how an assay performs.
[19:49] Stewart highlights the importance of understanding at which stage sampling has the greatest impact on decreasing variability.
[22:09] Stewart answers the question of how monitoring processes change when moving to full production scale.
[26:39] Stewart outlines stability modeling and the importance of stability programs in biotech manufacturing.
[30:38] Stewart shares his views on the biggest challenges that biotech manufacturers face around data.
-
This week, we are pleased to welcome to the Data in Biotech podcast Brendan McCorkle, CEO of SciNote, a cloud-based ELN (Electronic Lab Notebook) with lab inventory, compliance, and team management tools.
In this episode, we discuss how the priorities of ‘Research’ and ‘Development’ differ when it comes to the data they expect and how they use it, and how ELNs can work to support both functions by balancing structure and flexibility. We explore the challenges of developing an ELN that serves the needs and workflows of all stakeholders, making the wider business case for ELNs, and why, in the lab, paper and Excel need to be a thing of the past.
Brendan is upfront about the data challenges faced by biotechs, which do not have one-vendor solutions. He emphasizes the importance of industry collaboration and software vendors’ role in following the principles of FAIR data. We also get his take on the future of ELNs and how they can leverage AI and ML.
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in the life sciences.
Chapter Markers
[1:40] Brendan gives a whistlestop tour of his career and the path to setting up SciNote.
[4:20] Brendan discusses the principles of FAIR data and the challenges of adhering to them in the biotech industry.
[6:15] Brendan talks about the need to balance flexibility and structure when collecting R&D data.
[13:34] Brendan highlights the challenge of catering to diverse workflows, even within the same company.
[16:05] Brendan emphasizes the importance of metadata and how vendors, like SciNote, can help collect it with flexible tools for data entry and post-processing.
[18:59] Ross and Brendan discuss how to create an ELN that serves all stakeholders within the organization without imposing creativity constraints on research scientists.
[21:57] Brendan highlights how benefits like reducing loss and improving efficiency form part of the business case for a tool like SciNote.
[24:25] Brendan shares real-world examples of how companies integrate SciNote into their organizations and the need to work with other systems and software.
[34:01] Ross asks Brendan for his advice to biotech companies considering implementing ELNs in their workflows.
[39:10] Brendan gives his take on incorporating ML and AI within SciNote.
-
This week, we are pleased to be joined on the Data in Biotech Podcast by Pierre Salvy, who recently became the CTO at Cambrium, and his colleague Lucile Bonnin, Head of Research & Development at Cambrium.
As part of the Cambrium team behind NovaColl™, the first micro-molecular and skin-identical vegan collagen to market, Pierre and Lucile share their practical experiences of using AI to support protein design.
We ask why Cambrium, as a molecular design organization, decided to focus on the cosmetics industry and dig into the factors that have driven its success. From developing a protein programming language to the challenges of collecting and utilizing lab data, Pierre and Lucile give a detailed look under the hood of a company using data and AI to accelerate its journey from start-up to scale-up.
They also talk to host Ross Katz about the benefits of working as a cloud-native company from day zero, de-risking the process of scaling, and opportunities for new biomaterials.
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in the life sciences.
Chapter Markers
[1:34] Pierre and Lucile introduce themselves and give an overview of Cambrium’s work using AI to design proteins with the aim of developing sustainable materials.
[4:00] Lucile introduces NovaColl™, and Pierre elaborates on the process of bringing Cambrium’s first product to market.
[7:37] Ross asks Pierre and Lucile to give an overview of the considerations and challenges of protein design.
[11:01] Pierre and Lucile explain how Cambrium works with potential customers to design specific proteins that meet or exceed their expectations.
[12:49] Ross and Pierre discuss how Cambrium approached developing the data systems it needed to explore the protein landscape and how the team optimized the lab set-up.
[18:04] Pierre discusses the protein programming language developed at Cambrium.
[21:24] Lucile and Pierre talk through the development of the data platform at Cambrium as the company has scaled and the value of being cloud-native.
[24:12] Lucile and Pierre discuss how they approached designing the manufacturing process from scratch and how to reduce risk at every stage, especially while scaling up.
[31:44] The conversation moves to look at how Cambrium will use the processes and data platform developed with NovaColl™ to explore opportunities for the development of new biomaterials.
[34:42] Pierre gives advice on how start-ups can be smarter when selecting an area of focus.
[36:27] Lucile emphasizes the importance of getting cross-organizational buy-in to ensure successful data capture.
[39:01] Pierre and Lucile recommend resources that may be of interest to listeners seeking more information on the topics covered.
-
This week, we are pleased to welcome Jacob Oppenheim, Entrepreneur in Residence at Digitalis Ventures, a venture capital firm that invests in solutions to complex problems in human and animal health.
Jacob sat down with Ross to discuss the importance of establishing strong data foundations in biotech companies and how to approach the task. We explore the challenges biotech organisations face with existing tools. What are the limitations, and why are current data tools and systems not yet geared toward helping scientists themselves extract meaningful insights from the data?
We also get Jacob’s take on AI in the biotech space and what is needed for it to reach its full potential, plus some of the opportunities new modelling capabilities will allow scientists to explore.
Finally, we looked at the topic of building a team, how to approach this within a start-up, and the role consultancies play in providing expertise and guidance to early-stage biotech companies.
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in the life sciences.
--
Chapter Markers:
[1:08] Jacob gives a quick overview of his career to date and explains how he landed in his current role at Digitalis Ventures and what differentiates it as a venture fund.
[07:42] Ross asks Jacob about the biggest challenges and opportunities facing data scientists, data teams, and start-ups more broadly.
[9:56] Jacob talks about the limitations of existing data management tools within biotech companies.
[13:55] Jacob discusses what is needed as a foundation for AI tools to reach their potential.
[17:12] Jacob argues for the need for a unified data ecosystem and the benefits of a modular approach to tooling.
[23:42] Jacob explains that biology has become more engineering-focused and how this allows data to guide drug development.
[26:14] Ross and Jacob discuss the challenges of integrating data science and biotech teams, including cultural clashes and tooling conflicts.
[32:52] Jacob emphasises the importance of consultancies in the biotech space, particularly for start-ups.
[36:21] Ross asks Jacob which new modelling capabilities he is most excited about and how they will drive the industry forward.
[38:45] Jacob shares his advice for scientists and entrepreneurs looking to start a biotech venture and recommends resources.
-
This week's guest is Joseph Pearson, Global Product Manager of OmicSoft at QIAGEN, a global provider of sample-to-insight solutions that enable customers to gain valuable molecular insights.
During this episode, we dive into OmicSoft, a powerful NGS analysis suite that can quickly explore and compare 500,000 curated omics samples from disease-related studies. Joseph outlines the challenges of acquiring and analysing NGS data sets, how customers can interact with OmicSoft data, and what he thinks of the build versus buy debate when selecting new bioinformatics tools.
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in life sciences.
Chapter Markers:
[01:33] Joseph gives us a brief introduction to his career and how he got to the position that he has today.
[03:39] Ross asks Joseph about QIAGEN and how OmicSoft complements the existing range of products the company already provides.
[05:09] Joseph talks about the work that is going into their NGS datasets and how the company is extracting value from those datasets.
[06:09] Ross asks Joseph about the types of customers that use this solution.
[13:06] Joseph clarifies where the data underlying OmicSoft comes from.
[19:29] Ross asks Joseph how the company approaches educating the customer.
[22:44] Joseph explains the decision-making process that companies go through when deciding to either build or buy.
[27:15] Ross asks Joseph about the biggest challenges or criticisms people have about the platform.
[31:07] Joseph explains how his biology background has shaped his view of the challenges he faces in his role in product management.
[34:11] Joseph tells us where we can find out more about OmicSoft and QIAGEN.
-
This week's guest is Wolfgang Halter, Head of Data Science and Bioinformatics at Merck Life Science, a leading global science and technology company.
Ross sat down with Wolfgang to discuss the work on the BayBE project, an open-source library built for Bayesian optimization. Throughout the episode, we learn how BayBE is used both for experimental design and as a means to accelerate innovation. The pair also discuss the benefits and challenges of Bayesian optimization and the need for standardised data models. Finally, Wolfgang shares some advice for scientists and engineers who are keen to get ahead in the industry.
You can access the GitHub repo mentioned in the episode by clicking here: github.com/emdgroup/BayBE
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in life sciences.
--
Chapter Markers:
[1:32] Wolfgang gives us a whistle-stop tour of his career to date and explains the motivation behind pursuing a career in Data Science.
[2:35] Ross asks Wolfgang about Merck’s mission and the role the data science team is playing in helping the company achieve that mission.
[5:28] Wolfgang explains the work that is going into the BayBE project.
[13:23] Ross asks Wolfgang how Merck arranged their experimental campaigns in BayBE and how they garnered insights during the process.
[17:45] Wolfgang explains why the team developed BayBE as an open-source library.
[19:25] Wolfgang shares some more details on how the data science team at Merck is using BayBE today.
[20:42] Wolfgang shares some examples of the kinds of applications that the team is currently developing.
[21:54] Wolfgang provides us with information about the amount of time that is saved on average as a result of adopting this approach.
[34:38] Ross asks Wolfgang how his engineering background informs his perspective on the problems facing biotech and R&D.
[36:57] Wolfgang gives us his advice for young scientists and engineers who are looking to learn more about biotech.
[38:24] Wolfgang provides us with a list of resources for those who want to find out more about Merck and the BayBE project.
-
This week, we’re delighted to be joined by Annemie Ribbens, VP Science, Evidence and Trials at icometrix, a medical technology manufacturer that offers a portfolio of AI solutions to assist healthcare with various challenges in neurological disorders, such as brain trauma, strokes, dementia, and Alzheimer's disease.
During this episode, Annemie opens up on icometrix’s mission in analyzing and treating neurological disorders, the work that went into developing the data infrastructure and the challenges they face when dealing with such large data sets. Annemie also goes on to discuss how machine learning will influence the application of precision medicine in biotech over the next five years and the goals that the company is looking to achieve in the future.
Data in Biotech is a fortnightly podcast exploring how companies leverage data innovation in life sciences.
Chapter Markers:
[1:14] Annemie provides us with a brief introduction to her background and what led her to pursue a career in this field.
[4:17] Ross asks Annemie about icometrix’s portfolio and what differentiates it from the other tools in the market.
[7:36] Annemie explains how icometrix are helping physicians improve both their understanding and treatment of particular disorders.
[13:31] Annemie dives into the role that the public, patient-facing app plays and how the data that it gathers feeds the ecosystem.
[22:03] Annemie reveals how their partnerships work.
[28:11] Ross asks Annemie to provide some insights into how icometrix went about developing their data infrastructure.
[31:23] Annemie shares the channels involved when processing and analysing large data sets.
[40:01] Annemie explains the methodology that enables icometrix to know what core areas to focus on.
[43:07] Annemie reveals one of the projects that she is most proud of.
[45:00] Annemie gives us her thoughts on what the future holds for machine learning.
[47:33] Annemie explains where listeners can go to find out more information on icometrix.
---
If you’re a biotech company struggling to unlock a data challenge, CorrDyn can help.
Whether you need to supplement existing technology teams with specialist expertise or launch a data program that lays the groundwork for future internal hires, you can partner with CorrDyn to unlock the potential of your business data - today.
Visit connect.corrdyn.com/biotech to learn more.