Episodes
-
This particular podcast covers the material in Chapter 1 of my new (unpublished) book âStatistical Machine Learning: A unified frameworkâ. In this episode we discuss Chapter 1 of my new book, which shows how supervised, unsupervised, and reinforcement learning algorithms can be viewed as special cases of a general empirical risk minimization framework. This is useful because it provides a framework for not only understanding existing algorithms but also for suggesting new algorithms for specific applications.
-
This particular podcast (Episode 78 of Learning Machines 101) is the initial episode in a new special series of episodes designed to provide commentary on a new book that I am in the process of writing. In this episode we discuss books, software, courses, and podcasts designed to help you become a machine learning expert! For more information, check out: www.learningmachines101.com
-
Missing episodes?
-
In this 77th episode of www.learningmachines101.com , we explain the proper semantic interpretation of the Bayesian Information Criterion (BIC) and emphasize how this semantic interpretation is fundamentally different from AIC (Akaike Information Criterion) model selection methods. Briefly, BIC is used to estimate the probability of the training data given the probability model, while AIC is used to estimate out-of-sample prediction error. The probability of the training data given the model is called the âmarginal likelihoodâ. Using the marginal likelihood, one can calculate the probability of a model given the training data and then use this analysis to support selecting the most probable model, selecting a model that minimizes expected risk, and support Bayesian model averaging. The assumptions which are required for BIC to be a valid approximation for the probability of the training data given the probability model are also discussed.
-
In this episode, we explain the proper semantic interpretation of the Akaike Information Criterion (AIC) and the Generalized Akaike Information Criterion (GAIC) for the purpose of picking the best model for a given set of training data. The precise semantic interpretation of these model selection criteria is provided, explicit assumptions are provided for the AIC and GAIC to be valid, and explicit formulas are provided for the AIC and GAIC so they can be used in practice. Briefly, AIC and GAIC provide a way of estimating the average prediction error of your learning machine on test data without using test data or cross-validation methods. The GAIC is also called the Takeuchi Information Criterion (TIC).
-
In this episode, we explore the question of what can computers do as well as what computers canât do using the Turing Machine argument. Specifically, we discuss the computational limits of computers and raise the question of whether such limits pertain to biological brains and other non-standard computing machines. This episode is dedicated to the memory of my mom, Sandy Golden. To learn more about Turing Machines, SuperTuring Machines, Hypercomputation, and my Mom, check out: www.learningmachines101.com
-
In this episode we will learn how to use ârulesâ to represent knowledge. We discuss how this works in practice and we explain how these ideas are implemented in a special architecture called the production system. The challenges of representing knowledge using rules are also discussed. Specifically, these challenges include: issues of feature representation, having an adequate number of rules, obtaining rules that are not inconsistent, and having rules that handle special cases and situations. To learn more, visit:
www.learningmachines101.com
-
This is a remix of the original second episode Learning Machines 101 which describes in a little more detail how the computer program that Arthur Samuel developed in 1959 learned to play checkers by itself without human intervention using a mixture of classical artificial intelligence search methods and artificial neural network learning algorithms. The podcast ends with a book review of Professor Nilssonâs book: âThe Quest for Artificial Intelligence: A History of Ideas and Achievementsâ. For more information, check out:
www.learningmachines101.com
-
This podcast is basically a remix of the first and second episodes of Learning Machines 101 and is intended to serve as the new introduction to the Learning Machines 101 podcast series. The search for common organizing principles which could support the foundations of machine learning and artificial intelligence is discussed and the concept of the Big Artificial Intelligence Magic Show is introduced. At the end of the podcast, the book After Digital: Computation as Done by Brains and Machines by Professor James A. Anderson is briefly reviewed. For more information, please visit: www.learningmachines101.com
-
In this podcast, we provide some insights into the complexity of common sense. First, we discuss the importance of building common sense into learning machines. Second, we discuss how first-order logic can be used to represent common sense knowledge. Third, we describe a large database of common sense knowledge where the knowledge is represented using first-order logic which is free for researchers in machine learning. We provide a hyperlink to this free database of common sense knowledge. Fourth, we discuss some problems of first-order logic and explain how these problems can be resolved by transforming logical rules into probabilistic rules using Markov Logic Nets. And finally, we have another book review of the book âMarkov Logic: An Interface Layer for Artificial Intelligenceâ by Pedro Domingos and Daniel Lowd which provides further discussion of the issues in this podcast. In this book review, we cover some additional important applications of Markov Logic Nets not covered in detail in this podcast such as: object labeling, social network link analysis, information extraction, and helping support robot navigation. Finally, at the end of the podcast we provide information about a free software program which you can use to build and evaluate your own Markov Logic Net! For more information check out: www.learningmachines101.com
-
This 70th episode of Learning Machines 101 we discuss how to identify facial emotion expressions in images using an advanced clustering technique called Stochastic Neighborhood Embedding. We discuss the concept of recognizing facial emotions in images including applications to problems such as: improving online communication quality, identifying suspicious individuals such as terrorists using video cameras, improving lie detector tests, improving athletic performance by providing emotion feedback, and designing smart advertising which can look at the customerâs face to determine if they are bored or interested and dynamically adapt the advertising accordingly. To address this problem we review clustering algorithm methods including K-means clustering, Linear Discriminant Analysis, Spectral Clustering, and the relatively new technique of Stochastic Neighborhood Embedding (SNE) clustering. At the end of this podcast we provide a brief review of the classic machine learning text by Christopher Bishop titled âPattern Recognition and Machine Learningâ.
Make sure to visit: www.learningmachines101.com
to obtain free transcripts of this podcast and important
supplemental reference materials! -
This 69th episode of Learning Machines 101 provides a short overview of the 2017 Neural Information Processing Systems conference with a focus on the development of methods for teaching learning machines rather than simply training them on examples. In addition, a book review of the book âDeep Learningâ is provided. #nips2017
-
This 68th episode of Learning Machines 101 discusses a broad class of unsupervised, supervised, and reinforcement machine learning algorithms which iteratively update their parameter vector by adding a perturbation based upon all of the training data. This process is repeated, making a perturbation of the parameter vector based upon all of the training data until a parameter vector is generated which exhibits improved predictive performance. The magnitude of the perturbation at each learning iteration is called the âstepsizeâ or âlearning rateâ and the identity of the perturbation vector is called the âsearch directionâ. Simple mathematical formulas are presented based upon research from the late 1960s by Philip Wolfe and G. Zoutendijk that ensure convergence of the generated sequence of parameter vectors. These formulas may be used as the basis for the design of artificially intelligent smart automatic learning rate selection algorithms. For more information, please visit the official website:
www.learningmachines101.com -
In this episode we discuss how to learn to solve constraint satisfaction inference problems. The goal of the inference process is to infer the most probable values for unobservable variables. These constraints, however, can be learned from experience. Specifically, the important machine learning method for handling unobservable components of the data using Expectation Maximization is introduced. Check it out at:
www.learningmachines101.com
-
In this episode of Learning Machines 101 (www.learningmachines101.com) we discuss how to solve constraint satisfaction inference problems where knowledge is represented as a large unordered collection of complicated probabilistic constraints among a collection of variables. The goal of the inference process is to infer the most probable values of the unobservable variables given the observable variables. Specifically, Monte Carlo Markov Chain ( MCMC ) methods are discussed.
-
In this episode rerun we introduce the concept of gradient descent which is the fundamental principle underlying learning in the majority of deep learning and neural network learning algorithms. Check out the website:
www.learningmachines101.com
to obtain a transcript of this episode!
-
In this rerun of episode 24 we explore the concept of evolutionary learning machines. That is, learning machines that reproduce themselves in the hopes of evolving into more intelligent and smarter learning machines. This leads us to the topic of stochastic model search and evaluation. Check out the blog with additional technical references at: www.learningmachines101.com
-
This 63rd episode of Learning Machines 101 discusses how to build reinforcement learning machines which become smarter with experience but do not use this acquired knowledge to modify their actions and behaviors. This episode explains how to build reinforcement learning machines whose behavior evolves as the learning machines become increasingly smarter. The essential idea for the construction of such reinforcement learning machines is based upon first developing a supervised learning machine. The supervised learning machine then âguessesâ the desired response and updates its parameters using its guess for the desired response! Although the reasoning seems circular, this approach in fact is a variation of the important widely used machine learning method of Expectation-Maximization. Some applications to learning to play video games, control walking robots, and developing optimal trading strategies for the stock market are briefly mentioned as well. Check us out at: www.learningmachines101.com
-
This 62nd episode of Learning Machines 101 (www.learningmachines101.com) discusses how to design reinforcement learning machines using your knowledge of how to build supervised learning machines! Specifically, we focus on Value Function Reinforcement Learning Machines which estimate the unobservable total penalty associated with an episode when only the beginning of the episode is observable. This estimated Value Function can then be used by the learning machine to select a particular action in a given situation to minimize the total future penalties that will be received. Applications include: building your own robot, building your own automatic aircraft lander, building your own automated stock market trading system, and building your own self-driving car!!
-
This is the third of a short subsequence of podcasts providing a summary of events associated with Dr. Goldenâs recent visit to the 2015 Neural Information Processing Systems Conference. This is one of the top conferences in the field of Machine Learning. This episode reviews and discusses topics associated with the Introduction to Reinforcement Learning with Function Approximation Tutorial presented by Professor Richard Sutton on the first day of the conference.
This episode is a RERUN of an episode originally presented in January 2016 and lays the groundwork for future episodes on the topic of reinforcement learning!
Check out: www.learningmachines101.com for more info!!
-
This 60th episode of Learning Machines 101 discusses how one can use novelty detection or anomaly detection machine learning algorithms to monitor the performance of other machine learning algorithms deployed in real world environments. The episode is based upon a review of a talk by Chief Data Scientist Ira Cohen of Anodot presented at the 2016 Berlin Buzzwords Data Science Conference. Check out: www.learningmachines101.com to hear the podcast or read a transcription of the podcast!
- Show more