Episodes

  • In this episode of Computer Vision Decoded, we are going to dive into our in-house computer vision expert's reaction to the iPhone 15 and iPhone 15 Pro announcement.

    We dive into the camera upgrades, decode what a quad sensor means, and even talk about the importance of depth maps.

    Episode timeline:

    00:00 Intro
    02:59 iPhone 15 Overview
    05:15 iPhone 15 Main Camera
    07:20 Quad Pixel Sensor Explained
    15:45 Depth Maps Explained
    22:57 iPhone 15 Pro Overview
    27:01 iPhone 15 Pro Cameras
    32:20 Spatial Video
    36:00 A17 Pro Chipset

    This episode is brought to you by EveryPoint. Learn more about how EveryPoint is building an infinitely scalable data collection and processing platform for the next generation of spatial computing applications and services: https://www.everypoint.io

  • In this episode of Computer Vision Decoded, we are going to dive into Pierre Moulon's 10 years of experience building OpenMVG. We also cover the impact of open-source software in the computer vision industry and everything involved in building your own project. There is a lot to learn here!

    Our episode guest, Pierre Moulon, is a computer vision research scientist and the creator of OpenMVG - a library for computer vision scientists, targeted at the Multiple View Geometry community.

    The episode follows Pierre's journey building OpenMVG, which he wrote about in an article in his GitHub repository.

    Explore OpenMVG on GitHub: https://github.com/openMVG/openMVG
    Pierre's article on building OpenMVG: https://github.com/openMVG/openMVG/discussions/2165

    Episode timeline:

    00:00 Intro
    01:00 Pierre Moulon's Background
    04:40 What is OpenMVG?
    08:43 What is the importance of open-source software for the computer vision community?
    12:30 What to look for when deciding to use an open-source project
    16:27 What is Multi View Geometry?
    24:24 What was the biggest challenge building OpenMVG?
    31:00 How do you grow a community around an open-source project?
    38:09 Choosing a licensing model for your open-source project
    43:07 Funding and sponsorship for your open-source project
    46:46 Building an open-source project for your resume
    49:53 How to get started with OpenMVG

    Contact:
    Follow Pierre Moulon on LinkedIn: https://www.linkedin.com/in/pierre-moulon/
    Follow Jared Heinly on Twitter: https://twitter.com/JaredHeinly
    Follow Jonathan Stephens on Twitter at: https://twitter.com/jonstephens85


  • In this episode of Computer Vision Decoded, we are going to dive into implicit neural representations.

    We are joined by Itzik Ben-Shabat, a Visiting Research Fellow at the Australian National University (ANU) and Technion – Israel Institute of Technology, as well as the host of the Talking Papers Podcast.

    You will gain a core understanding of implicit neural representations, key concepts and terminology, how they are being used in applications today, and Itzik's research into improving output with limited input data.

    Episode timeline:

    00:00 Intro
    01:23 Overview of what implicit neural representations are
    04:08 How INR compares and contrasts with a NeRF
    08:17 Why did Itzik pursue this line of research?
    10:56 What is normalization and what are normals
    13:13 Past research people should read to learn about the basics of INR
    16:10 What is an implicit representation (without the neural network)
    24:27 What is DiGS and what problem with INR does it solve?
    35:54 What is OG-INR and what problem with INR does it solve?
    40:43 What software can researchers use to understand INR?
    49:15 What should non-scientists focus on to learn about INR?

    Itzik's Website: https://www.itzikbs.com/
    Follow Itzik on Twitter: https://twitter.com/sitzikbs
    Follow Itzik on LinkedIn: https://www.linkedin.com/in/yizhak-itzik-ben-shabat-67b3b1b7/
    Talking Papers Podcast: https://talking.papers.podcast.itzikbs.com/

    Follow Jared Heinly on Twitter: https://twitter.com/JaredHeinly
    Follow Jonathan Stephens on Twitter at: https://twitter.com/jonstephens85

    Referenced past episode (What is CVPR): https://share.transistor.fm/s/15edb19d


  • In this episode of Computer Vision Decoded, we are going to dive into 4 different ways to 3D reconstruct a scene with images. Our cohost Jared Heinly, who holds a PhD in computer science and specializes in 3D reconstruction from images, will walk through the 4 distinct strategies and discuss the pros and cons of each.

    Links to content shared in this episode:

    Live SLAM to measure a stockpile with SR Measure: https://srmeasure.com/professional

    Jared's notes on the iPhone LiDAR and SLAM: https://everypoint.medium.com/everypoint-gets-hands-on-with-apples-new-lidar-sensor-44eeb38db579

    How to capture images for 3D reconstruction: https://youtu.be/AQfRdr_gZ8g

    00:00 Intro
    01:30 3D Reconstruction from Video
    13:48 3D Reconstruction from Images
    28:05 3D Reconstruction from Stereo Pairs
    38:43 3D Reconstruction from SLAM

    Follow Jared Heinly
    Twitter: https://twitter.com/JaredHeinly
    LinkedIn: https://www.linkedin.com/in/jheinly/

    Follow Jonathan Stephens
    Twitter: https://twitter.com/jonstephens85
    LinkedIn: https://www.linkedin.com/in/jonathanstephens/


  • Join our guest, Keith Ito, founder of Scaniverse, as we discuss the challenges of creating a 3D capture app for iPhones. Keith goes into depth on balancing speed with the quality of 3D output and how he designed an intuitive user experience.

    In this episode, we discuss…

    01:00 - Keith Ito's background at Google
    09:44 - What is the Scaniverse app
    11:43 - What inspired Keith to build Scaniverse
    17:37 - The challenges of using LiDAR in the early versions of Scaniverse
    25:54 - How to build a good user experience for 3D capture apps
    32:00 - The challenges of running photogrammetry on an iPhone
    37:07 - The future of 3D capture
    40:57 - Scaniverse's role at Niantic

    Learn more about Scaniverse at: https://scaniverse.com/
    Follow Keith Ito on Twitter at: https://twitter.com/keeeto

    Follow Jared Heinly on Twitter: https://twitter.com/JaredHeinly
    Follow Jonathan Stephens on Twitter: https://twitter.com/jonstephens85
    Follow Jonathan Stephens on LinkedIn: https://www.linkedin.com/in/jonathanstephens/


  • In this episode of Computer Vision Decoded, we are going to dive into one of the hottest topics in the industry: Neural Radiance Fields (NeRFs).

    We are joined by Matt Tancik, a student pursuing a PhD in the computer science and electrical engineering department at UC Berkeley. He contributed research to the original NeRF project in 2020 and to several other projects since then.

    Last but not least, he is building Nerfstudio - a collaboration-friendly studio for NeRFs.

    In this episode, you will learn what NeRFs are and, more importantly, what they are not. Matt also discusses the challenges of large-scale NeRF creation, drawing on his experience with Block-NeRF.

    Follow Matt's work at https://www.matthewtancik.com/

    Get started with Nerfstudio here: https://docs.nerf.studio/en/latest/

    Block-NeRF details: https://waymo.com/research/block-nerf/

    00:00 Intro
    00:45 Matt’s Background Into NeRF Research
    04:00 What is a NeRF and how is it different from photogrammetry?
    11:57 Can geometry be extracted from NeRFs?
    15:30 Will NeRFs supersede photogrammetry in the future?
    22:47 Block-NeRF and the pros and cons of using 360 cameras
    25:30 What is the goal of Block-NeRF
    30:44 Why do NeRFs need large GPUs to compute?
    35:45 Meshes to simulate NeRF visualizations
    40:28 What is Nerfstudio?
    47:40 How to get started with Nerfstudio

    Follow Jared Heinly on Twitter: https://twitter.com/JaredHeinly
    Follow Jonathan Stephens on Twitter at: https://twitter.com/jonstephens85


  • In this episode of Computer Vision Decoded, we are going to dive into image capture best practices for 3D reconstruction.

    At the end of this livestream, you will have learned the basics for capturing scenes and objects. We will also provide a downloadable visual guide for reference on your next 3D reconstruction project.

    Download the official guide here to follow along: https://tinyurl.com/4n2wspkn

    00:00 Intro
    04:40 Camera motion overview
    07:15 Good camera motions
    18:43 Transition camera motions
    30:39 Bad camera motions
    39:27 How to combine camera motions
    49:16 Loop Closure
    57:42 Image Overlap
    1:14:00 Lighting and camera gear

    Watch our episode of Computer Vision in the Wild to learn more about capturing images outside and in busy locations: https://youtu.be/FwVBR6KFjPI

    Follow Jared Heinly on Twitter: https://twitter.com/JaredHeinly
    Follow Jonathan Stephens on Twitter at: https://twitter.com/jonstephens85


  • In this episode of Computer Vision Decoded, we join Jared Heinly and Jonathan Stephens from EveryPoint for their live reaction to the iPhone 14 series announcement. They go in depth into what all the camera specs mean to the average person. We also explain the basics of computational photography and how Apple is able to get great photos from a small camera sensor.

    00:00 Intro
    02:43 Apple Watch Review
    06:58 Airpods Pro Review
    09:40 iPhone 14 Initial Reaction
    15:05 iPhone 14 Camera Specs Breakdown
    37:13 iPhone 14 Pro Initial Reaction
    40:47 iPhone 14 Pro Camera Specs Breakdown

    Follow Jared Heinly on Twitter
    Follow Jonathan Stephens on Twitter


  • In this episode of Computer Vision Decoded, we sit down with Jared Heinly, Chief Scientist at EveryPoint, to discuss 3D reconstruction in the wild. What does “in the wild” mean? It means 3D reconstructing objects and scenes in non-controlled environments where you may have limitations with lighting, access, reflective surfaces, etc.

    00:00 Intro
    01:30 What are Duplicate Scene Structures and How to Avoid Them
    14:30 How Jared used 100 million crowdsourced photos to 3D reconstruct 12,903 landmarks
    27:10 The benefits of capturing video for 3D reconstruction
    31:30 The benefits of using a drone to capture stills for 3D reconstruction
    34:20 Considerations for using installed cameras for 3D reconstruction
    38:30 How to work with sun issues
    44:25 Determining how far from the object you should be when capturing images
    50:35 How to capture objects with reflective surfaces
    53:40 How to work around scene obstructions
    57:20 What cameras you should use

    Jared Heinly’s Academic Papers and Projects

    Paper: Correcting the Duplicate Scene Structure In Sparse 3D Reconstruction
    Project: Reconstructing the World in Six Days
    Video: Reconstructing the world in Six Days

    Follow Jared Heinly on Twitter
    Follow Jonathan Stephens on Twitter


  • In this episode of Computer Vision Decoded, we dive into Jared Heinly's recent trip to the CVPR Conference. We cover what the conference is about, who should attend, what the emerging trends in computer vision are, how machine learning is being used in 3D reconstruction, and what NeRFs are for.

    00:00 - Introduction
    00:36 - What is CVPR?
    02:49 - Who should attend CVPR?
    08:11 - What are emerging trends in Computer Vision?
    14:34 - What is the value of NeRFs?
    20:55 - How should you attend as a non-scientist or academic?

    Follow Jared Heinly on Twitter
    Follow Jonathan Stephens on Twitter

    CVPR Conference

    Episode sponsored by: EveryPoint

  • In this inaugural episode of Computer Vision Decoded we dive into the recent announcements at WWDC 2022 and find out what they mean for the computer vision community. We talk about what Apple is doing with their new RoomPlan API and how computer vision scientists can leverage it for better experiences. We also cover the enhancements to video and photo capture during an active ARKit Session.

    00:00 - Introduction
    00:25 - Meet Jared Heinly
    02:10 - RoomPlan API
    06:23 - Higher Resolution Video with ARKit
    09:17 - The importance of pixel size and density
    13:13 - Copy and Paste Objects from Photos
    16:47 - CVPR Conference Overview

    Follow Jared Heinly on Twitter
    Follow Jonathan Stephens on Twitter

    Learn about RoomPlan API Overview
    Learn about ARKit 6 Highlights
    CVPR Conference

    Episode sponsored by: EveryPoint