Episodes
-
Prepare to be amazed in this episode as Matteo Pelati and Vivek Gudapuri, the brilliant minds behind Dozer, reveal their experience in pushing the boundaries of data management and analysis. By simplifying the process of data serving and allowing companies to create APIs quickly and efficiently, Dozer's approach sets them apart from the modern data stack. Their open-source approach allows developers to build custom operators and extend connectors, ensuring that Dozer can cover a wide range of use cases while still offering customization at each step. They also discuss the challenges they faced during the development of Dozer and how they are positioned to adapt to upcoming trends and developments in real-time data processing.
-
Uncover the secret to turning data engineering into a superpower! Sean Knapp, the CEO and founder of Ascend.io, joined us to discuss the value of depth and breadth in capturing the entire data value chain, emphasizing the need for an automation layer that adapts to the evolving data landscape. Ascend's platform enables intelligent data pipeline creation and management, with a dynamic control plane that detects and responds to changes in real time across extensive pipeline networks. Sean also explored the potential of generative AI in data engineering and shared his optimism about the future of the modern data stack, foreseeing consolidation and the emergence of new parallel spaces in the data ecosystem.
-
Step into the world of Zalando, Europe's leading online fashion retailer, where data drives innovation and enhances the customer experience. In this episode, join us as we interview Dr. Alexander Borek, the brilliant mind behind Zalando's data and analytics strategy. Discover how Dr. Borek and his team have revolutionized the company's approach to data by implementing the cutting-edge concept of data mesh. Learn how Zalando successfully strikes the perfect balance between decentralization and structure, unleashing the full potential of data while maintaining collaboration with various business units. Dr. Borek also unveils the secrets to leveraging data for innovation and value creation in the dynamic world of online fashion. Tune in now for an eye-opening exploration of data management, leadership, and the future of data-driven decision-making at Zalando.
-
Twilio has built an open source data lake using AWS technologies and Databricks, processing billions of events daily through their Kafka environment. They aim to provide a cohesive view of data across platforms and enable other businesses to use data wherever they want. Don, the Head of Data Platform and Engineering at Twilio, shares insights into Twilio's data stack in the latest episode of the Modern Data Show. The conversation covers the Twilio data stack, which begins with data ingestion through Kafka or CDC for Aurora databases, followed by storage in S3, high-level aggregation and curation using Spark, and the use of tools such as Kudu, Reverse ETL, data governance, cataloging, and BI tools.
-
Has your business ever faced challenges syncing live data to your sales, marketing, and customer success tools? That's where you need Hightouch, a Reverse ETL platform that syncs data from a data warehouse to SaaS tools in minutes. It enables businesses to get accurate customer data quickly without requiring engineering effort or manual work. In this episode, Tejas Manohar shares his journey from developing games at a young age to becoming the Co-founder and CEO of Hightouch. He provides valuable insights into Hightouch's internal connector framework, which automatically performs tasks like change data capture and batching, as well as providing methods to send rows that may need to be retried in future syncs. He also talks about Hightouch's two new products and the future of Reverse ETL.
-
When working with open-source technologies, you benefit from the community's creations, but you also take on a lot of administration and support work: the technologies tend to break, and support usually falls on you. This is where DoubleCloud's platform comes into the picture. In this latest episode of the Modern Data Show, Natalia Shuliak talks about how DoubleCloud saves you from administrative work and lets you focus on data pipeline development and management, while providing backup, security, and support.
-
With its widespread popularity and success in the e-commerce industry, it is difficult to imagine anyone who has not at least heard of Shopify. This episode features Marc Laforet, a senior data engineer at Shopify, who shares his journey of how he transitioned from being a biochemist to a data engineer at Shopify. Marc explains the type of data Shopify works with, which is diverse in format and comes from different sources, and how the company determines which tools to build to extract the most value from the data. Marc also discusses data governance and explains two possible architectures: a gating process or a trust-but-verify approach.
-
Urban Sports Club, a company that connects fitness enthusiasts, started its data journey when it realised that treating data as a product instead of a by-product could help unlock the value of data. In the latest episode of the Modern Data Show, we are joined by Artur Yatsenko, Head of Data Platform at Urban Sports Club, to discuss the company's platform, its evolving data stack, and the challenges faced while building it. Artur shared insights on adopting open-source software and tools for data management and implementing a data-as-a-product strategy.
-
Salesforce is moving towards a more user-friendly and modernized data platform that allows for faster migration and operation, while also enabling users to take advantage of new functionalities that were previously unavailable. In the latest episode of the Modern Data Show, Murali Kallen, Head of Office of Data at Salesforce, discusses the company's modernization efforts, including migrating to Snowflake and adopting cloud-friendly tools. Murali also covers the importance of vendor support structures for established companies and the trade-offs between open-source and commercial offerings.
00:00:00 Introduction
00:03:12 Data platform at Salesforce
00:07:53 Structure of Salesforce's data team
00:12:28 Data tool buying criteria from the data leader's perspective
00:23:05 Partnership with Snowflake
00:27:24 Future of data space
-
With the introduction of the Data Mesh concept, a lot of people are trying to wrap their heads around the term. In the latest episode of the Modern Data Show, Colleen Tartow, Director of Engineering at Starburst Data, provides a comprehensive explanation of what data mesh actually is, the socio-technical aspects of data mesh, and the fundamental shift in the way data is produced and governed within an organization.
-
Lauren Balik, who runs Upright Analytics and is a leading data consultant and investor, discusses why she believes the modern data stack is flawed and the three factors that affect the cost of a data platform. Balik also compares building versus buying a data platform and recommends an OLAP database in the cloud for small companies. However, she thinks centralizing data out of a line of business is a mistake for larger companies. Balik does not anticipate consolidation in the modern data stack and thinks that large language models such as GPT-3 will be crucial.
-
Ian Macomber, Head of Analytics Engineering & Data Science at Ramp, discusses the company's approach to automating finance tools and building the next generation of finance through data-driven decision-making. Macomber emphasizes the importance of cross-functional collaboration and embedding the data team into every part of the product engineering process. He also highlights the need for data compliance and privacy to be invested in every day and not treated as a one-time effort. Macomber warns against "Layerinitis," where teams prioritize quick solutions over long-term effects, and advises celebrating the hardening of code and inviting people into codebases to teach them best practices.
-
In this episode of the Modern Data Show, Gunnar Morling discussed his interest in software engineering and databases and his recent move to Decodable, a real-time stream processing platform based on Apache Flink. He talked about the importance of cohesive data pipelines, from source to sink, and how his work on Debezium led him to become interested in stream processing. Gunnar also discussed how Decodable provides managed stream processing based on Apache Flink, ingesting real-time data streams, processing them, and putting the data into other systems.
-
In this episode of the Modern Data Show, Brennon York, Head of the Data Platform at Lyft, gives insights into the critical aspects of the data platform ecosystem in the early stages when there is no scale. Brennon also discusses the structure of the data platform team and new emerging technologies within the modern data stack that have impressed him, such as machine learning orchestration systems like SageMaker, Union.ai, and Flyte. The episode provides valuable insights into building a data platform that can scale with the growth of a company, enabling businesses to stay competitive in the fast-paced technological landscape.
-
In this episode of the Modern Data Show, host Aayush Jain is joined by Kai Waehner, the Global Field CTO at Confluent, to discuss all things about Apache Kafka, Confluent, and event streaming. Confluent is a complete event streaming platform and fully managed Kafka service used by tech giants, modern internet startups, and traditional enterprises to build mission-critical scalable systems. During the podcast, Kai discusses the benefits of using Confluent over deploying Kafka, the role of a global Field CTO, and the company's complete data streaming platform.
-
'Data as oil' is an extensively used metaphor, and its impact can be gauged by how heavily every business depends on data provided by third-party sources. Source data systems are finite: they hold a certain amount of data with a limited associated scope. This is where Snowplow comes in, helping businesses deliberately create that data. In the latest episode of the Modern Data Show, we have Alex Dean, CEO and Co-founder of Snowplow, to discuss data creation, behavioural analytics, data contracts, the tracking catalog, and where the modern data stack is heading in 2023.
-
When Michel and his team founded Airbyte back in 2020, there were already a ton of data integration tools out there, and the space was pretty mature altogether. So what led them to start the company, and what unique problem did they aim to address? To answer this, for this week's episode we have Michel Tricot, the co-founder and CEO of Airbyte.
-
Headless BI is one of the new and emerging categories of the Modern Data Stack. Although the headless concept has existed for quite a long time in the form of Headless CMS, why is there a need for a Headless BI tool? Why should anyone care about Headless BI? To answer these questions and all the other technical complexities around Headless BI, we have Igor Lukanin from Cube, a Headless BI solution for building data apps.
-
For early-stage startups, bringing in full-fledged data observability can sometimes be overkill. And even when an established organisation starts monitoring its data quality, it's often hard to judge whether the issues stem from a tech problem or a people problem. In the latest episode of the Modern Data Show, Shane Murray, who went from being a Monte Carlo customer to joining the company as its Field CTO, helps us understand these problems and how the Monte Carlo tool applies software engineering principles to address the issue of data downtime.
-
Mark Van de Wiel is the Field CTO at Fivetran, the leader in automated data integration, delivering ready-to-use connectors to thousands of customers globally. Mark has a strong background in data replication and real-time business intelligence and analytics. Before joining Fivetran, Mark was the CTO at HVR Software which provides a real-time cloud data replication solution to support enterprise modernization efforts. HVR Software was acquired by Fivetran in 2021.