Episodes

  • Many companies are sitting on data assets that could be revenue streams for them, without knowing it. Matt Staudt of VDC discusses making latent data profitable.
    Ginette: I'm Ginette,

    Curtis: and I'm Curtis,

    Ginette: and you are listening to Data Crunch,

    Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world.

    Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics, training, and consulting company.

    Ginette: Today, we chat with the president and CEO at the Venture Development Center, Matt Staudt.

    Matt Staudt: The company that I'm with is VDC, Venture Development Center. Basically VDC is an organization that works in the alternative big data, bringing buyer and seller together. So we have a unique perspective on available data assets that are out in the marketplace and a unique perspective of the companies that utilize them, and what they're specifically looking for in the way of points of, uh, value for various data assets. My background was originally in the marketing and advertising area, where I owned a company for 20 years, IMG, Interactive Marketing Group. I left that in 2007 and joined this, which was more or less of a lifestyle organization. And we made it a full-fledged company back in 2010.

    Curtis: Now, when you say data assets, can you put a little bit of definition around that for the listeners? Just so they understand how you define a data asset? 'Cause I imagine there may be some things that you think are valuable that maybe they haven't thought of, or maybe it'll help expand our thinking around what a data asset is.

    Matt: Yeah, sure. In my terminology, "data asset" basically falls into eight different categories, where assets basically come from within the information world. So they could be things like transaction data or crowdsource data. They could be things like search data or social data sets. They fall into various categories: traditional data, meaning assets that are business to business or business to consumer, generally aggregated by large companies that most everybody's heard of: Dun & Bradstreet, Infogroup, Acxiom, the credit bureaus, et cetera. Alternative data in our world are companies that have unique data points. They're collecting unique pieces of information, usually as a byproduct of their core business. And we look at the assets, the data sets, the actual data points that they collect. And we figure out if there might be something of value to take to the marketplace, usually to the large consumers of the data, the big aggregators that I previously mentioned, but oftentimes it also fits well with some of our mid-tier players. And we have a significant amount of relationships in the brand grouping, meaning large organizations that they themselves are looking to try and take advantage of big data and utilize data in sales, marketing operations, in order to transform or help to administer certain activities that they have going on.

    Curtis: Do you find that this is maybe industry specific, like for example, a big insurance company, or if you're in healthcare or something like this, it tends to be more data intensive that you see more activity there, or is this really applicable across the board? What kind of industries do you find have a lot of applications?

    Matt: Yeah. Well, it's interesting. On the surface, you certainly think that there's probably industries that would have a larger appetite and a larger need for data than other organizations, but going, you know, through the list of companies that we've helped over the last 15 or 20 years, it really runs the gamut. I mean, we've worked with insurances, you mentioned insurance, insurance companies. I mentioned credit bureaus. We work with credit bureaus, risk and fraud, sales and marketing, sometimes large brands within those retail environments. So it really truly has run the gamut for us. There's,

  • With recent events being what they are, epidemiology has come into the spotlight. What do epidemiologists do and how does data shape their everyday experience? Sitara and Mee-a from "Donuts and Data" fill us in.    
    Ginette: I'm Ginette,

    Curtis: and I'm Curtis,

    Ginette: and you are listening to Data Crunch,

    Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world.

    Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company.

    Many people are on the lookout for online math and science resources right now, particularly data and statistics courses, and whether you're a student looking to get ahead, a professional brushing up on cutting-edge topics, or someone who just wants to use this time to understand the world better, you should check out Brilliant. Brilliant’s thought-provoking math, science, and computer science content helps guide you to mastery by taking complex concepts and breaking them up into bite-sized understandable chunks. You'll start by having fun with their interactive explorations, and over time you'll be amazed at what you can accomplish. Sign up for free and start learning by going to Brilliant.org slash Data Crunch, and also the first 200 people that go to that link will get 20% off the annual premium subscription. Now onto the show.

    Curtis: I'd like to welcome Sitara and Mee-a from the Instagram account Donuts and Data to talk to us today. I guess let's just have you guys introduce yourselves, as opposed to me trying to introduce you, 'cause you know what you do better than I do. So maybe we just have some introductions.

    Sitara: So I'm Sitara, one half of Donuts and Data. I'm a PhD student in epidemiology at the University of Texas Health Science Center. I'm also a research assistant in a lab that I work in.

    Mee-a: And I'm Mee-a. I am an infectious disease epidemiologist that works in the public sector. I actually met Sitara through the lab that she's currently working in.

    Curtis: Nice. And I'm excited to have you guys on. I just, I think epidemiology is a really interesting space, especially with what, you know, with what's going on now with COVID. I think it's more pertinent than it ever has been. Not that it ever hasn't been pertinent, but maybe it's more top of mind for people. So I'd love maybe just to have you guys level set with everybody, like what is epidemiology. There's probably some confusion about what that is and maybe how you guys got into it. And then we can get into what your day to day is and what it's all about.

    Sitara: So, epidemiology, I think everyone's kind of understanding is studying patterns of disease in the human population. And so in that sense, what Mee-a and I do are the same, but instead of studying infectious diseases or the natural science part of epidemiology, what I focus on is how human behavior contributes to those patterns of disease. So I look for patterns in data associated like demographics or just behaviors, diet, nutrition, and how that contributes to getting diseases.

    Mee-a: For me in the public sector, it's going to be a lot of looking at incidence rates of infectious diseases. It . . . primarily with COVID-19 right now, and just different ways that we can try to possibly implement infection prevention measures. So we are dealing a little bit more with, I don't want to say the medical side of it because we aren't clinicians, but we are dealing more with the medical side of the infectious disease than we are with the data compared to when I was in academia, at least.

    Curtis: So take us through maybe the end goal, right? So what you guys are working on. You're hoping to come out with, I think, some recommendations for people to take, maybe a better understanding of how the disease spreads, so we get in front of it. What does that look like?

    Mee-a: I always thought that epidemiology's gold standard of what we try to achieve is probably...

  • For David Guralnick, education, AI, and cognitive psychology have always held possibility. With many years of experience in this niche, David runs a company that designs education programs, which employ AI and machine learning, for large companies, universities, and everything in between.  
    David Guralnick: Somehow what's happened in a lot of the uses of technology and education to this point is we've taken the mass education system that was there only to solve a scalability problem, not because it was the best educational method. So we've taken that and now we've scaled that even further online because it's easy to do and easy to track.

    Ginette Methot: I’m Ginette,

    Curtis Seare: and I’m Curtis,

    Ginette: and you are listening to Data Crunch,

    Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world.

    Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company.

    Curtis: First off, I'd like to thank everyone who has taken the Tableau fundamentals zombie course that we announced the last episode. We've been getting a lot of great feedback from you. It's fun to see how people are enjoying the course and thinking that it's fun and also clear and it's helping them learn the fundamentals of Tableau. The reason we made that course is because Tableau and data visualization are really important skills. They can help you get a better job, they can help you add value to your organization. And so we hope that the course is helping people out. Also, according to the feedback that we have received, we've made a couple of enhancements to the course, so there are now quizzes to test your knowledge. There are quick tips with each of the videos to help you go a little bit further than even what the videos teach. We've also included a way to earn badges and a certificate so that you can show off your skills to your employer or whoever. And we've also thrown in a couple other bonuses. One is our 100-plus-page manual that we actually use to train at Fortune 500 companies, so that'll have screenshots and tutorials and tips and tricks on the Tableau fundamentals. And we have also included a checklist and a cheat sheet, both of which we actually use internally in our consulting practice to help us do good work. One of them will help you know which kind of chart to use in any given scenario that you may encounter, whether that's a bar chart or a scatter plot or any number of other more advanced charts. And the other is a checklist that you can run down and say, "do I have this, this, this, and this in my visualization?" before you take it to present to someone, to make sure that that's going to be a good experience. So hopefully all of that equals something that is really going to help you guys. And something also where you can learn Tableau and have fun doing it, saving the world from the zombie apocalypse. The price has risen a little bit since last time, but for our long-time listeners here, if you use the code "podcastzombie" without any spaces in the middle, then that'll go ahead and take off 25% of the list price that is currently on the page. So hopefully more of you guys can take it and keep giving us feedback so we can keep improving it. And we would love to hear from you.

    Ginette: Now onto the show. Today, we chat with David Guralnick, president and CEO of Kaleidoscope Learning.

    David: I've had a longtime interest in both education and technology going way, way back. I was lucky enough to go to an elementary school outside of Washington, DC, called Green Acres School in Rockville, Maryland, which was very project based. So it was non-traditional education. You worked on projects, you worked collaboratively with people, your teachers' role was almost as much an advisor and mentor as a traditional teacher. It wasn't a person in front of the room talking at you, and you learn how to, you know,

  • If you've ever tried to find a doctor in the United States, you likely know how hard it is to find one who's the right fit—it takes quite a bit of research to find good information to make an informed choice. Wouldn't it be nice to easily find a doctor who is the right fit for you? Using data, Covera Health aims to do just that in the radiology specialty.



    Ron Vianu: I think the tools are really improving year over year to a significant degree, but like anything else, the tools themselves are only as useful as how you apply them. You can have the most amazing tools that could understand very large datasets, but, you know, how you approach looking for solutions, I think, can dramatically impact whether you yield anything useful.



    Ginette Methot: I’m Ginette,



    Curtis Seare: and I’m Curtis,



    Ginette: and you are listening to Data Crunch,



    Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world.



    Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company.



    If you're a business leader listening to our podcast and would like to move 10 times faster and be 10 times smarter than your competitors, we're running a webinar on February 13th where you can learn how to do this and more. Just go to datacrunchcorp.com/go to sign up today for free.



    If you're a subject matter expert in your field, like our guest today, and you're looking to understand data science and machine learning, Brilliant.org is a great place to dig deeper. Their classes help you understand algorithms, machine learning concepts, computer science basics, and many other important concepts in data science and machine learning. The nice thing about Brilliant.org is that you can learn in bite-sized pieces at your own pace. Their courses have storytelling, code-writing, and interactive challenges, which makes them entertaining, challenging, and educational. Sign up for free and start learning by going to Brilliant.org/DataCrunch. And also the first 200 people that go to that link will get 20% off the annual premium subscription.



    Today we chat with Ron Vianu, the CEO of Covera Health. Let's get right to it.



    Curtis: What inspired you to get into what you're doing, uh, to start Covera Health? Where did the idea come from and what drives you? So if we could start there and learn a little bit about you and the beginnings of Covera Health, that would be great.



    Ron: Sure. Uh, and I guess it's important to state that, you know, I'm a problem solver by nature, and my entire professional career, I've been a serial entrepreneur building companies to solve very specific problems. And as it relates to Covera, the genesis of it was understanding that there were two problems in the market with respect to, uh, the healthcare space, which is where we're focused, that were historically unsolved, and there were no efforts really to solve them in, from my perspective, a data-driven way. And that was around understanding quality of physicians that is predictive to whether or not they'll be successful with individual patients as they walk through their practice. And so we're focused on the world of radiology, which today is highly commoditized, and what that means is that there was a presumption that wherever you get an MRI or a CT study for some injury or illness, it doesn't matter where you go.



    It's more about convenience and price perhaps. Whereas what we understand given our research and the, the various things that we've published since our beginning is that one, it's like every other medical specialty. It's highly variable. Two, since radiology supports all other medical specialties in a, as a tool for diagnosis, diagnostic purposes, any sort of variability within that specialty has a cascading effect on patients downstream. And so for us, the beginning was, is this something that is solvable through data?

  • We talk with Ben Jones, CEO of Data Literacy, who's on a mission to help everyone understand the language of data. He goes over some common data pitfalls, learning strategies, and unique stories about both epic failures and great successes using data in the real world.



    Ginette Methot: I’m Ginette,



    Curtis Seare: and I’m Curtis,



    Ginette: and you are listening to Data Crunch,



    Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world.



    Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company.



    It’s becoming increasingly important in our world to be data literate and to understand the basics of AI and machine learning, and Brilliant.org is a great place to dig deeper into this and related topics. Their classes help you understand algorithms, machine learning concepts, computer science basics, and many other important concepts in data science and machine learning. The nice thing about Brilliant.org is that you can learn in bite-sized pieces at your own pace. Their courses have storytelling, code-writing, and interactive challenges, which makes them entertaining, challenging, and educational.



    Sign up for free and start learning by going to Brilliant.org/DataCrunch, and also the first 200 people that go to that link will get 20% off the annual premium subscription.



    Curtis: Ben Jones is here with me on the podcast today. This is a couple months coming. Excited to have him on the show. He's well known in the data visualization community, he's done a lot of great work there. Uh, used to work for Tableau. Now he's off doing his own thing, has a company called Data Literacy, which is interesting. We're going to dig into that and also has a new book out called Avoiding Data Pitfalls. So all of this is really great stuff and we're happy to have you here, Ben. Before we get going, just give yourself a brief introduction for anyone who may not know you and we can go from there.



    Ben: Yeah, great. Thanks, Curtis. You mentioned some of the highlights there. I, uh, worked for Tableau for about seven years running the Tableau Public platform, uh, in which time I wrote a book called Communicating Data with Tableau. And the fun thing was, for me, that launched kind of a teaching, um, mini side gig for me at the University of Washington, which really made me fall in love with this idea of just helping people get excited about working with data. Having that light bulb moment where they feel like they've got what it takes. And so that's what caused me to really want to leave Tableau and launch my own company, Data Literacy, at dataliteracy.com, which is where I help people, you know, as I say, learn the language of data, right? Whether that's reading charts and graphs, whether that's exploring data and communicating it to other people, through training programs to the public as well as working one on one with clients and such. So it's been an exciting year doing that. Also, other things about me: I live here in Seattle, I love it up here and go hiking and backpacking when I can, and have three teenage boys all in high school. So that keeps me busy too. And it's been a fun week for me getting this book out and seeing it start to ship and seeing people get it.



    Curtis: Let's talk a little bit about that because the book, it sounds super interesting, right? Avoiding Data Pitfalls, and there are a lot of pitfalls that people fall into. So I'm curious what you're seeing, why you decided to write the book, how difficult of a process it was and then some of the insights that you have in there as well.



    Ben: Yeah, so I feel like the tools that are out there now are so powerful and way more so than when I was going to school in the 90s, and it's amazing what you can do with those tools. And I think also it's amazing how easy it is to mislead yourself. And so I started realizing that that's sometim...

  • How do you build a comprehensive view of a topic on social media? Jordan Breslauer would say you let a machine learning tool scan the social sphere and add information as conversations evolve, with help from humans in the loop.



    Ginette Methot: I’m Ginette,



    Curtis Seare: and I’m Curtis,



    Ginette: and you are listening to Data Crunch,



    Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world.



    Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company.



    Ginette: Many of you want to gain a deeper understanding of data science and machine learning, and Brilliant.org is a great place to dig deeper into these topics. Their classes help you understand algorithms, machine learning concepts, computer science basics, probability, computer memory, and many other important concepts in data science and machine learning. The nice thing about Brilliant.org is that you can learn in bite-sized pieces at your own pace. Their courses have storytelling, code-writing, and interactive challenges, which makes them entertaining, challenging, and educational.



    Sign up for free and start learning by going to Brilliant.org slash Data Crunch, and also the first 200 people that go to that link will get 20% off the annual premium subscription.



    Let’s get into our conversation with Jordan Breslauer, senior director of data analytics and customer success at Social Standards.



    Jordan: My name is Jordan Breslauer. I'm the senior director of data analytics and customer success at Social Standards. I've always been a data geek as it pertains to sports. I think of Moneyball when I was younger; I always wanted to be kind of the next Billy Beane, and when I started working for sports franchises right after high school and early college days, I just realized that that type of work culture wasn't for me, but I was so, so into trying to answer questions with data that had no previously clear answer, you know? I loved answering subjective questions like, what makes the best player, or how do I know who the best player is? And I thought what was always fun was to try and bring some sort of structured subjectivity to those sorts of questions through using data. And that's really what got me passionate about data in the first place.



    But then I just started to apply it to a number of different business questions that I always thought were quite interesting, which have a great deal of subjectivity. And that led me to Nielsen originally, where my main question that I was answering on a day-to-day basis was, what makes a great ad? Uh, what I found though is that advertising, at least, especially as it pertains to TV, is really where brands were moving away from, and a lot of the real consumer analytics that people were looking for were trying to underpin people in their natural environment, particularly on social media. And I hadn't seen any company that had done it well. Uh, and I happened to meet Social Standards during my time at Nielsen and was truly just blown away with this ability to essentially take a large input of conversations that people were having, or that were happening, I should say, and bring some sort of structure to them to actually be able to analyze them and understand what people were talking about as it pertained to different types of topics. And so I think that's really what brought me here, was the fascination with this huge amount of data behind the ways that people were talking on social. And the fact that it had some structure to it, which actually allowed for real analytics to be put behind it.



    Curtis: It's a hard thing to do though. Right? You know, to answer this question of how do we extract real value or real insight from social media, and you'd mentioned historically or up to this point, companies that are trying to do that missed the mark.

  • Sometimes AI and deep learning are not only overkill, but also a subpar solution. Learn when to use them and when not. Diego from Northwestern's Deep Learning Institute discusses practical AI and deep learning in industry. He covers insights on how to train models well, the difference between textbook and real AI problems, and the problem of multiple explanations.



    Diego Klabjan: One aspect of the problem it has to have in order to be amenable to AI is complexity, right? So if you have nice data with, I don't know, 20, 30 features that you can, quote, put in a spreadsheet, right? So then AI is going to be an overkill, and actually it's not only going to be an overkill; it's going to be a subpar solution.



    Ginette Methot: I’m Ginette,



    Curtis Seare: and I’m Curtis,



    Ginette: and you are listening to Data Crunch,



    Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world.



    Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company.



    We’d like to hear what you want to learn on our future podcast episodes, and so we’re running a giveaway until our next podcast episode comes out. We’re giving away our book Simple Predictive Analytics. All you have to do is go on to LinkedIn and tag The Data Crunch Corporation in a post with your suggestion, and we’ll randomly pick a winner from those who submit. If you win and you’re in the US, we’ll send you a physical copy, and if you’re in another country, we’ll send you an electronic copy. Can’t wait to hear from you.



    Today, we chat with Professor Diego Klabjan, the director of the Master of Science in Analytics and director of the Deep Learning Lab at Northwestern University.



    Diego: My name is Diego Klabjan. So I'm a faculty member at Northwestern University in the department of industrial engineering and management sciences. I've actually spent my entire career in academia. So I graduated from Georgia Tech in '99, and then I spent six years at the University of Illinois Urbana-Champaign and got my tenure there. And then I was recruited here at Northwestern as a tenured faculty member a year later. So I've been at Northwestern for approximately 14 years. Yeah, so I'm the director of the master of science in analytics, actually founding director of the master of science in analytics, so I established the master's program back in 2010, and I've been directing it since then. And recently, I also became the director of the center for deep learning, which is a relatively new initiative at Northwestern. We've been having discussions for the last year and a half, and about half a year ago, we officially kicked it off with a few founding members.



    So my expertise is in machine learning and deep learning. So I run sort of a very big research program. So I advise more than 15 PhD students from a variety of departments, and the vast majority of them do deep learning research. Yeah, so I started deep learning around six, seven years ago. So I was definitely not one of the earliest faculty members conducting, studying, being attached to deep learning. But I wasn't that late to the game either. Right. So I still remember approximately six, seven years ago attending deep learning conferences with like 50 attendees, and now those conferences are like 5,000 people. Just astonishing.



    Curtis: That's crazy, how you've seen that grow.



    Diego: Yup. Um, yeah, and the last word is, ah, I'm also a founder of Opex Analytics, which is a consulting company. I no longer have much to do with the company, uh, but sort of have experience also on the business side.



    Curtis: Great. So this, uh, the Deep Learning Institute started about a year or two ago, is that right? Did I understand that right?



    Diego: Yeah, that's correct. I mean, so we,

  • Luciano Pesci is bullish on blockchain and data science. Since blockchain offers a complete historical record, no one can delete or alter prior information written into the record. He sees this characteristic as a massive advantage for data scientists. 



    Luciano Pesci: And the key for data scientists and leaders who are gonna oversee data science is, you've got to get a narrow enough problem to demonstrate one quick win, and I mean in 90 days. If in 90 days you can't come back to the organization and show, "we have made real progress on these metrics in your understanding so that you can make these decisions," they're not going to continue to do it.



    Ginette Methot: I’m Ginette,



    Curtis Seare: and I’m Curtis,



    Ginette: and you are listening to Data Crunch,



    Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world.



    Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company.



    Ginette: No matter what your position in a company is, knowing about data, how it works, and what it can do for you is vital to the success of your organization.



    Fortunately, there are ways for you and those in your organization to learn about data. Brilliant.org, an online educational resource, has on-demand classes in data basics that can help you understand this growing area, providing you with tools and the framework you need to break up complex concepts into bite-sized chunks. You can sign up for free, preview courses, and start learning by going to Brilliant.org/DataCrunch, and also the first 200 people that go to that link will get 20% off the annual premium subscription.







    Ginette: The CEO of Emperitas, Luciano Pesci, joins us today. Let’s get right into the episode.



    Curtis: What inspired you to get into data? What inspired you to start the company you're working at now and how'd you get going?



    Luciano: All of it was a complete accident. Yeah, none of it, not the schooling, the business, none of it was intentional.



    Curtis: Okay, let's hear about it.



    Luciano: My first business was actually a recording studio and a record label, and I had signed, among other acts, my own band, and we got a management deal, and we went to LA. We started to tour with national acts, and I thought that was going to be my career path without a doubt, and so I didn't take the ACT/SAT at the time, barely graduated high school, and then the band fell apart. And I was like, "well, what am I going to do?" So I went back to school, had a transformative experience, got drawn into economics, and then within economics really found data.



    Curtis: And what drew you to economics?



    Luciano: I like studying people. I think it's the most complete picture of people. So there's a lot of other disciplines that sort of dive deeper when it comes to people's psychological characteristics, their behavioral components. But economics was about the entire system and how an individual functions within that bigger system. And the reason I got to data from that was that the key assumption of modern economics is perfect information. So this is usually where critics of what is called the classical model in economics come in and say, "well, you can't have perfect information, so therefore you can't have optimizing behavior." And one of the beautiful lessons of the last 20 years, especially with data science, is it might not be perfect information, but you can get really good information to make optimized choices. And so data represented that method of going into the real world and optimizing all these processes that we were learning about in the textbooks and at the abstract theory level.



    Curtis: Interesting. And that's, there's not a lot of places, if any, that I know of that teach that approach, right? Or have good coursework around that. Did you kind of figure this out on your own or how'd you, how'd you come to that?



  • There have been some spectacular fails when it comes to looking at Internet traffic, think Google Flu Trends; however, Predata, a company that helps people understand global events and market moves by interpreting signals in Internet traffic, has honed human-in-the-loop machine learning to get to the bottom of geopolitical risk and price movement.



    Predata uncovers predictive behavior by applying machine learning techniques to online activity. The company has built the most comprehensive predictive analytics platform for geopolitical risk, enabling customers to discover, quantify and act on dynamic shifts in online behavior. The Predata platform provides users with quantitative measurements of digital concern and predictive indicators for different types of risk events for any given country or topic.



    Dakota Killpack: Over the past few years, we’ve collected a very large annotated data set about human judgment for how relevant many, many pieces of web content are to various tasks.



    Ginette Methot: I’m Ginette,



    Curtis Seare: and I’m Curtis,



    Ginette: and you are listening to Data Crunch,



    Curtis: a podcast about how applied data science, machine learning, and artificial intelligence are changing the world.



    Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company.



    Let’s jump into our episode today with the director of Machine Learning at Predata.



    Dakota: My name is Dakota Killpack and I'm the director of machine learning at
    Predata, and Predata is a company that uses machine learning to look at the
    spectrum of human behavior online and organize it into useful signals about
    people's attention, and we use those to influence how people make decisions by
    giving them a factor of what people are paying attention to. Because attention
    is a scarce cognitive resource, people tend to pay attention only to very
    important things. If they're about to act in a way that might cause problems
    for our potential clients, they'll spend a lot of time online doing
    research, making preparations, and by unlocking this attention dimension to web
    traffic, we're able to give some unique insights to our clients.



    Curtis: Can we jump into maybe a concrete use case into what you're talking
    about just to frame and put some details around how someone might use that
    service?



    Dakota: Absolutely. So one example that I find particularly useful for
    revealing how attention works online is looking at what soybean farmers did in
    response to tariffs earlier this year. So knowing that they weren't
    going to get a very good price on soybeans at that particular moment, a lot of
    them were looking up how to store their grain online and purchasing these very
    long grain storage bags, purchasing some obscure scientific equipment needed to
    insert big needles into the bags to get a sample for testing the soybeans, and
    moisture testing devices to make sure they wouldn't grow mold. And all of these
    webpages are things that tend to get very little traffic. And when we see an
    increase in traffic to all of them at the same time, we know that a very
    influential group of individuals, namely farmers, is paying attention to this
    topic. Using that, we're able to give early warning to our clients.
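
    A minimal sketch of the "many quiet pages spike at once" signal described above. This is an illustration under stated assumptions, not Predata's actual method: the function names, the z-score threshold, the required fraction of pages, and all traffic numbers are hypothetical. Each page is compared against its own recent history, and a joint signal is raised only when most pages in the group are elevated on the same day.

```python
from statistics import mean, stdev


def zscore(history, today):
    """Standard score of today's count relative to the page's own history."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Flat history: any change is treated as extreme, no change as zero.
        return 0.0 if today == mu else float("inf")
    return (today - mu) / sigma


def joint_spike(pages, threshold=3.0, min_fraction=0.8):
    """Flag a group of pages when most of them are elevated at the same time.

    pages: dict mapping page name -> (list of past daily counts, today's count).
    threshold and min_fraction are hypothetical tuning knobs.
    Returns (flag, list of elevated page names).
    """
    elevated = [
        name
        for name, (history, today) in pages.items()
        if zscore(history, today) >= threshold
    ]
    return len(elevated) / len(pages) >= min_fraction, elevated


# Invented daily view counts for three low-traffic pages.
grain_pages = {
    "grain storage bags": ([10, 12, 11, 9, 10, 13, 11], 40),
    "grain sampling probes": ([3, 4, 2, 3, 5, 3, 4], 19),
    "moisture testers": ([6, 7, 5, 6, 8, 6, 7], 30),
}
flag, elevated = joint_spike(grain_pages)
print(flag, elevated)
```

    Requiring several pages to move together is the design choice that matters here: a single page spiking could be noise, while a coordinated rise across otherwise obscure pages is harder to explain by chance.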



    Curtis: Sounds like looking for needles in a haystack of data. Right? So how do
    you determine what is a useful bit of information in the context of what your
    clients are looking for? Do they kind of have an idea of what they're looking
    for and then you go out and search for that, or does your algorithm find
    anomalies in the data and then characterize those anomalies so that you can
    then report that back? How does it work?



    Dakota: It’s a mix of both. Because the Internet is such a rich and
    complex domain, it's very dangerous to just look for anomalies at scale.
    There've been some high profile failures, most notably the Google Flu Trends

  • The way you organize your data science team will greatly affect your business’s outcome. This episode discusses different structures for a data science team, as well as top down versus bottom up approaches, how to get data science solutions into production organically, and how to be part of the business while remaining in contact with other data scientists on the team.



    Mark Lowe: Having lived through small scale, two people working, to large scale, thousands of people in your organization, the way that you organize the data science team has a dramatic effect on its productivity.



    Ginette Methot: I’m Ginette, and I’m Curtis, and you are listening to Data Crunch, a podcast about how applied data science, machine learning, and artificial intelligence are changing the world. Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company.



    Building effective data science processes is tough. Mode, the data science platform, has compiled three tips to make it a bit easier: don’t over plan, there’s no one process that fits everyone, and waste time. That’s right. Waste time. Read more at mode.com/dsp M O D E.com/D S P.



    Today we’re going to talk about effective ways you can organize your data science team, and we’ll hear lots of great insights from our guest. Let’s get to it.



    Mark: My name is Mark Lowe. I’m currently the senior principal data scientist here at Valassis.



    Curtis Seare: Describe just a little bit about what Valassis does.



    Mark: So we work with pretty much every major manufacturer retailer in the U.S. Our work kind of runs the gamut in terms of solving problems for them in terms of how do I influence customers. And so we manage a lot of print products that go reach every household, every week and of course a lot of digital products. So everything from display advertising, campaign, search campaign, social. Pretty much any distribution mechanism that can influence customers, we try to use those channels.



    Curtis: And in working on these problems, we talked a little bit earlier about what the approaches are for data science. Some people try to bin it in a software development kind of a role, an agile role, and how that usually doesn’t work for data science ’cause it’s more of an experimental type of a thing. Can you comment on its similarities and differences and how you should be approaching data science?



    Mark: I think that’s a great question. Honestly, if you asked me 10 years ago if this was an interesting question, I would have found it very boring. But having lived through small-scale, two people working, to large scale, thousands of people in your organization, the way that you organize the data science team has a dramatic effect on its productivity, and there’s no one size that fits all. Honestly, you kind of have to cater the organization of the data science team to where the company is. For example, of the two common models that are deployed, and we’ve lived in both of them, one is thinking about data science as an internal consulting group. So I have a pool of data scientists. Stakeholders throughout the company come to me and say, “I have this problem. I think it needs data science,” and then the data science lead or team says,



    “Yes, we do need a data scientist working on that. Here’s a person with that specialty.” So kind of farming out individuals on the team to solve particular problems. So it’s a fairly centralized organization, and, you know, there’s a lot of benefits to that. One, you’ve got a strong sense of community as a team. Oftentimes you’re very tightly organized together. You function as a data science unit. You can try to make sure that you’re putting the right skillset on the right problem. As you know, as you’ve talked about, there is no one definition of data science, there’s no one skillset. So oftentimes the data science team has a mixture of skills across the team,

  • David Millar is a man bringing analytical solutions to an industry that historically has had little data. But with the explosion of smart devices, that is all changing, and the way utilities operate is as well.



    David Millar: The way that electricity markets work is that you have what's called the day ahead market. And so the day before, let's say one o'clock tomorrow, markets run, and this is a big optimization problem.



    Ginette Methot: I'm Ginette



    Curtis Seare: And I'm Curtis



    Ginette: And you are listening to Data Crunch,



    Curtis: A podcast about how applied data science, machine learning and artificial intelligence are changing the world.



    Ginette: Data Crunch is produced by the Data Crunch Corporation, an analytics training and consulting company.



    Ginette: The father of lean startup methodology once said “There are no facts inside the building so get the heck outside.”



    The utilities industry is no different. Sometimes the facts that’ll make your machine learning career are waiting just outside your office.



    Read more at mode.com/MLutilities. m o d e dot com slash M L utilities. 



    Ginette: David Millar is a man bringing analytical solutions to an industry that historically has had little data. But with the explosion of smart devices, that's all changing, and the way utilities operate is as well. Let's get into it.



    David: I'm, ah, Dave Millar. I am the director of resource planning consulting at Ascend Analytics, where I lead the research client consulting team. And so my team and I work with utilities primarily to help them make decisions, using analytics, regarding their long-term power portfolio. So primarily we're looking at, we'll say, we're retiring a coal plant or retiring a gas plant. What would we replace it with? Renewable energy? Do we need batteries? How do we approach these questions using analytics in order to help us come up with the best solution going forward?



    Curtis: You had talked a little bit about, you sent me some notes about how the, the sector that you're in, the power sector, you know, is kind of slow moving, right? It's not known for these quick changes and innovations, but you are starting to see some things that, that's gonna change this fundamentally. And so if we could jump into that and, and then get your perspective, I'd love to hear about it.



    David: Yeah, the power sector basically didn't change from the time they figured out that we're going to use alternating current. It didn't really change much in the past hundred years; the model is essentially the same. You have big power stations that are far away from the load centers, and then you have this transmission network, and the flow of electricity is really one direction, right, from the big power plants to your home. And technology is rapidly changing that, and the space is becoming both more digital and more decentralized.



    So, on the digital front, we actually have generation technologies that don't use any spinning parts, right? So you have solar power, and now we're seeing more and more batteries being connected to solar. And so those are both digital technologies that are increasingly becoming this default energy source, wind or solar and batteries, just because the cost of these has fallen dramatically. It's really happened over the past 10 years. And so now renewables are at parity with the more conventional sources of electricity. So natural gas power, coal power.



    Curtis: Is that in terms of like how much energy they're currently producing parity or just effectiveness or efficiency. What is that parity?



    David: Parity in terms of costs. So, you know, as renewables drop in costs, especially as batteries drop in costs, that means that when, when I look at a problem with my clients, we're comparing, technologies that essentially have the ability, similar attributes,

  • Simeon Schwarz has been walking the data management tightrope for years. In this episode, he helps us see the hidden organizational and economic impacts that come from leading a data management initiative, and how to understand and overcome the inertia, fears, and status quo that hold good data management back.



    Simeon Schwarz: Fighting against shadow IT . . . you have to find a way to adopt it, you have to find a way to incorporate it, and you have to find a way to leverage it. You will never be able to completely eliminate it.



    Ginette Methot: I'm Ginette.



    Curtis Seare: And I'm Curtis.



    Ginette: And you are listening to Data Crunch,



    Curtis: A podcast about how applied data science, machine learning and artificial intelligence are changing the world.



    Ginette: This might come as a surprise to some, but . . . tools won’t build a data-driven culture.



    The right people will. 



    Read more at mode.com/datadrivenculture. m o d e dot com slash data driven culture.



    Ginette: Today we speak with Simeon Schwarz. He’s been working in data management for over twenty years and owns his own consultancy, Data Management Solutions.



    Simeon: Being in the data management function, you're de facto seeing the lifeblood of how the business flows, where the information goes, how the decisions are made.



    Curtis: So have you been focused mainly in a specific industry, or have you spanned a lot in your career?



    Simeon: I started in telecom. I built the first cell phone carrier back in my home country. I worked in academia, in retail, ecommerce, and then 10 years in financial services, most recently, and now I do insurance. So a lot of different fields.



    Curtis: So you've run the gamut. That's interesting. And now that you've done this in several different fields, do you find that the principles and your approach is basically the same or or is it different depending on the problems that you're trying to solve?



    Simeon: The approach is the same, and there are two parts to this. We'll talk about what's difficult in this role a little bit further in this conversation. The second part is you really need to understand the domain you're dealing with because, if we're talking about data management in general, one of the key challenges that you're going to be facing is establishing and building your credibility. Without knowledge of the domain, be it insurance or financial services or manufacturing or any other field, you simply can't have intelligent conversations with your stakeholders in a way that would lead to good conclusions. So you will absolutely have to know the domain, which is a large portion of your value.



    Curtis: So as you've gotten into a domain that maybe you weren't as familiar with in a data role, how did you overcome this need to understand the domain better?



    Simeon: Let's step back and talk about what data genuinely is right now, and specifically talk about data management. You are running a data function, sometimes called data services, because what used to be DBA teams or data analysts in various forms is really becoming a practice. And looking at it as a practice, you have a certain set of clients, they are paying you for the services, you have a certain amount of resources, and you're trying to optimize those resources to serve your clients better. So what are the challenges that you're going to face in any data management role? You're in this interesting balance between moving forward very rapidly as well as not destroying what already exists, not destroying the services that are already provided. People have to breathe, people have to be able to live. You can't disrupt too much the services that already exist: your reports, your auditing work, your work with regulatory agencies. Anything else that the business needs to produce has to continue to happen. The people who are doing their jobs in the current way simil...

  • David Saben is on a mission to make taking tests less painful, and he’s using data to do it. In this episode, he’ll discuss reviving methods developed in 1979 to shorten tests and make them more effective, as well as how to use psychometrics to aid in the design and crafting of an effective test.



    David Saben: When I see my son, who's 11 years old, spending three days in testing when I know there's absolutely no reason for it, that you can do that in an hour.



    Ginette Methot: I'm Ginette



    Curtis Seare: And I'm Curtis



    Ginette: And you are listening to Data Crunch



    Curtis: A podcast about how applied data science, machine learning and artificial intelligence are changing the world.



    The father of lean startup methodology once said “There are no facts inside the building so get the heck outside.”



    The education industry is no different. Sometimes the facts that’ll make your machine learning career are waiting just outside your office. 



    Read more at mode.com/mledu



    m o d e dot com slash M L e d u



    Ginette: Today we chat with David Saben, the CEO and president of Assessment Systems, an organization innovating psychometrics (the science of assessment).



    Dave: I originally started my career in telecommunications, bringing voice and data services into institutions and into learning institutions. And then what I realized is that connecting universities and for-profit schools, you know, connecting them online, really created a huge opportunity for learning and really crossing barriers to learn and really meeting learners on their terms with online learning courses. And that kind of brought me through this journey with using technology to really make better decisions in learning and knowledge and how we do that effectively. And that has started about a 16-year career focused on that: using data, using e-tools to make a better learning environment for everybody and make us more effective in the way that we gather information and retain information. And that's brought me into several areas. One is in the learning sciences: how do you deliver learning content more effectively? But also on the assessment side as well: how do you measure what folks are learning effectively and painlessly? And that's brought me on this journey into the assessment industry and really making sure that every exam that's delivered, whether it's in classrooms or whether it's a licensure exam, is as fast and as fair as possible, and using data to be able to do that.



    So really mitigating the risk of human bias when it comes to measuring a human's abilities, uh, which is, uh, which is a troublesome area, right?



    Curtis: Yeah. And now you say effective and painless. And I know most people hate taking tests, so tell me how you approach that.



    Dave: Yeah. Well, I think there's a lot of ways. I mean, I think one of the most important ways is that you make the test faster, right? You know, in 1979, the chairman of Assessment Systems helped create a technology called computerized adaptive testing. What that does is it uses algorithms to gauge what you know and what you don't know and then basically tailors the content that you see: the next item you see gets progressively more difficult or progressively easier depending on your ability. And what that does is reduce test time by about 50%. We see that with the ASVAB exam that's given to our service men and women to make their testing experience faster and fairer, and we're starting to see that really across the world with measurements. So really making those exams tailored to the person's ability, which is really, really important.
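
    The adaptive loop described above can be sketched in a few lines. This is a deliberately simplified illustration, not Assessment Systems' algorithm: it uses a halving "staircase" update, whereas production computerized adaptive tests typically estimate ability with item response theory. The function name, the item bank, the step sizes, and the simulated examinee are all hypothetical.

```python
def run_adaptive_test(item_difficulties, answers_correctly,
                      max_items=6, start=0.0, step=1.0):
    """Administer items one at a time, tailoring each to the current estimate.

    item_difficulties: numeric difficulties of the items in the bank.
    answers_correctly: callable taking a difficulty, returning True/False.
    Returns (final ability estimate, list of (difficulty, correct) pairs).
    """
    ability = start
    remaining = list(item_difficulties)
    administered = []
    while remaining and len(administered) < max_items:
        # Pick the unused item whose difficulty is closest to the estimate.
        item = min(remaining, key=lambda d: abs(d - ability))
        remaining.remove(item)
        correct = answers_correctly(item)
        # Correct answer: move the estimate up (harder next item).
        # Wrong answer: move it down (easier next item).
        ability += step if correct else -step
        step /= 2  # smaller corrections as evidence accumulates
        administered.append((item, correct))
    return ability, administered


# Hypothetical bank of 25 items with difficulties from -3.0 to 3.0.
bank = [x * 0.25 for x in range(-12, 13)]

# Simulated examinee who answers correctly whenever the item is not
# harder than a true ability of 1.3 (an invented value).
estimate, log = run_adaptive_test(bank, lambda difficulty: difficulty <= 1.3)
print(estimate, len(log), "of", len(bank), "items used")
```

    The point of the sketch is the shape of the loop: each response narrows the estimate, so far fewer items are needed than in a fixed test that gives every question to everyone.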



    You know, what you don't want to do is you don't want to give one test that doesn't change to everyone cause that's really, really inefficient. You know, if I'm going through the test and I know I know the content really well,