Schedule (tentative)

Here is a tentative schedule for the semester. The schedule may change depending on the pace of the class. The paper list for Part II and the debate topics for Part III will be finalized by the fifth week of class. In the meantime, see the syllabus for a partial paper list to get a sense of the advanced topics and case studies we will cover.

Jan 21: Introduction

Basic Concepts

Jan 23: Computer Vision: Single and Multi-view Geometry

Jan 28: Computer Vision: Recognition

Jan 30: Computer Vision: Generative Models

Feb 4: Robotics: Forward / Inverse Kinematics

Feb 6: Robotics: Motion Planning, Feedback Control

Feb 11: MDPs: Bellman Equations, Policy Iteration / Evaluation, Value Iteration

Feb 13: MDPs: Model-free Policy Evaluation and Control

Feb 18: MDPs: Policy Gradients

Feb 20: Deep RL

Feb 25: No class

Instead, attend relevant sessions at the CSL Student Conference.

Feb 27: Imitation Learning

Mar 4: Inverse RL

Advanced Concepts and Case Studies

Mar 6: Perception

  • DUST3R: Geometric 3D vision made easy. Shuzhe Wang, Vincent Leroy, Yohann Cabon, Boris Chidlovskii, Jerome Revaud. CVPR, 2024.

  • Presentation Questions:

    • Q1: What problem does the paper tackle and how?

    • Q2: What is the significance of the results? What does the paper do that was not possible before?

    • Q3: What part of the system matters most for the performance?

    • Q4: What impact can this paper have on robotics? What has already been done, and what can be done that hasn't already been done?

    • Q5: What are some limitations of the system and what would be some extensions?

Mar 11: Perception

  • CoTracker3: Simpler and Better Point Tracking by Pseudo-Labelling Real Videos. Nikita Karaev, Iurii Makarov, Jianyuan Wang, Natalia Neverova, Andrea Vedaldi, Christian Rupprecht. arXiv, 2024.

  • Presentation Questions:

    • Q1: What problem does the paper tackle, and at a high level, how did the earlier work (CoTracker: It is Better to Track Together. Nikita Karaev, Ignacio Rocco, Benjamin Graham, Natalia Neverova, Andrea Vedaldi, Christian Rupprecht. ECCV, 2024.) tackle this problem?

    • Q2: How does prior work LocoTrack (Local All-Pair Correspondence for Point Tracking. Seokju Cho, Jiahui Huang, Jisu Nam, Honggyu An, Seungryong Kim, Joon-Young Lee. ECCV, 2024.) tackle this problem?

    • Q3: What does CoTracker3 do? What part of the contributions matters most for the performance?

    • Q4: What impact can this paper have on robotics? What has already been done, and what can be done that hasn't already been done?

    • Q5: What are some limitations of the system and what would be some extensions?

Mar 13: Imitation Learning

  • Diffusion policy: Visuomotor policy learning via action diffusion. Cheng Chi, Zhenjia Xu, Siyuan Feng, Eric Cousineau, Yilun Du, Benjamin Burchfiel, Russ Tedrake, Shuran Song. The International Journal of Robotics Research, 2023.

  • Presentation Questions:

    • Q1: What problem does the paper tackle and what innovations does the paper propose?

    • Q2: What experiments does the paper conduct to test these innovations? What aspect of the innovation contributes most to the overall performance?

    • Q3: What is ACT (Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware. Tony Zhao, Vikash Kumar, Sergey Levine, Chelsea Finn. Robotics: Science and Systems, 2023.) and how does Diffusion Policy compare to ACT? For what problems will you use Diffusion Policy and for what problems will you use ACT?

    • Q4: What is VQ-BeT (Behavior generation with latent actions. Seungjae Lee, Yibin Wang, Haritheja Etukuru, H Kim, Nur Shafiullah, Lerrel Pinto. ICML, 2024.) and how does Diffusion Policy compare to VQ-BeT? For what problems will you use Diffusion Policy and for what problems will you use VQ-BeT?

    • Q5: What are some limitations of Diffusion Policy? How can they be mitigated?

Mar 18: Spring Break (no class)

  • Enjoy

Mar 20: Spring Break (no class)

  • Enjoy

Mar 25: Imitation Learning

  • 3D diffuser actor: Policy diffusion with 3D scene representations. Tsung-Wei Ke, Nikolaos Gkanatsios, Katerina Fragkiadaki. Conference on Robot Learning (CoRL), 2024.

  • Presentation Questions:

    • Q1: Describe how 3D Diffuser Actor works (e.g., proposed architecture, output representation).

    • Q2: How do RVT / RVT2 tackle this problem, and how does 3D Diffuser Actor compare to RVT / RVT2?

    • Q3: How well does 3D Diffuser Actor work? What aspect of 3D Diffuser Actor, as discussed in the paper or in your opinion, contributes most to its overall performance?

    • Q4: What additional experiments would you have liked to see and why? What are some limitations of 3D Diffuser Actor and how could they be mitigated?

    • Q5: For real-world imitation learning, what tasks and settings will 3D Diffuser Actor be particularly good for? What tasks and settings will be particularly challenging for 3D Diffuser Actor?

Mar 27: Scaling up Imitation Learning

  • Universal manipulation interface: In-the-wild robot teaching without in-the-wild robots. Cheng Chi, Zhenjia Xu, Chuer Pan, Eric Cousineau, Benjamin Burchfiel, Siyuan Feng, Russ Tedrake, Shuran Song. Robotics: Science and Systems, 2024.

  • Presentation Questions:

    • Q1: Describe the motivation and design for the UMI device.

    • Q2: Describe the considerations made while designing policies for use with the UMI device.

    • Q3: Discuss the experiments that validate the different design choices made in UMI.

    • Q4: Describe the ALOHA setup. How does UMI compare to ALOHA? For what tasks will you use UMI and for what tasks will you use ALOHA?

    • Q5: What are some limitations of UMI? What are some tasks for which UMI will not be appropriate?

Apr 1: Scaling up Imitation Learning

Apr 3: Sim2Real for Locomotion

Apr 8: Humanoids

Apr 10: Sim2Real for Dexterous Manipulation

Apr 15: Navigation

Apr 17: Robotics and Vision and Language Models

Apr 22: What if the DARPA Robotics Challenge were held today?

Debates

Apr 24: Sim2Real will solve robotics

Apr 29: Modular Learning vs End-to-end Learning

May 1: Scaling up is all you need to solve robotics

Project Presentations

May 6: