Schedule (tentative)
Here is a tentative schedule for the semester. The schedule may change depending
on the pace of the class. The paper list for Part II and the debate topics for
Part III will be finalized by the fifth week of class. In the meantime, you can
consult the syllabus for a partial paper list to get a sense of the advanced
topics and case studies we will cover.
Jan 21: Introduction
Basic Concepts
Jan 23: Computer Vision: Single and Multi-view Geometry
Readings: Szeliski 2.1, 11
Jan 28: Computer Vision: Recognition
Readings: Bishop and Bishop 1, 4, 6, 7, 8, 10, 12
Jan 30: Computer Vision: Generative Models
Readings: Bishop and Bishop 20
Feb 4: Robotics: Forward / Inverse Kinematics
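As a concrete anchor for this lecture, here is a minimal sketch of forward kinematics for a two-link planar arm (the function name and unit link lengths are hypothetical choices for illustration, not course material):

```python
import math

def fk_2link(theta1, theta2, l1=1.0, l2=1.0):
    """End-effector (x, y) of a planar 2-link arm; angles in radians."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y

# With both joints at 0 the arm points along the x-axis:
# fk_2link(0.0, 0.0) -> (2.0, 0.0)
```

Inverse kinematics asks the reverse question: given a target (x, y), solve for the joint angles, which for this arm has zero, one, or two solutions depending on reachability.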
Feb 6: Robotics: Motion Planning, Feedback Control
Feb 11: MDPs: Bellman Equations, Policy Iteration / Evaluation, Value Iteration
Notes: pdf
Readings: Sutton and Barto Chapter 3, 4
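A minimal sketch of value iteration on a toy deterministic MDP (the 3-state chain below is a hypothetical example, not from the readings):

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Value iteration for a small deterministic MDP.

    P[s][a] gives the (deterministic) next state, R[s][a] the reward.
    Repeatedly applies the Bellman optimality backup until convergence.
    """
    V = [0.0] * len(P)
    while True:
        V_new = [max(R[s][a] + gamma * V[P[s][a]] for a in range(len(P[s])))
                 for s in range(len(P))]
        if max(abs(u - v) for u, v in zip(V, V_new)) < tol:
            return V_new
        V = V_new

# Toy 3-state chain: action 1 moves right, state 2 is absorbing;
# the only reward is 1 for entering state 2 from state 1.
P = [[0, 1], [0, 2], [2, 2]]
R = [[0.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
V = value_iteration(P, R)  # V is approximately [0.9, 1.0, 0.0]
```

The backup is a gamma-contraction, so the loop converges to the unique optimal value function regardless of initialization.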
Feb 13: MDPs: Model-free Policy Evaluation and Control
Notes: pdf
Readings: Sutton and Barto Chapter 5, 6
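A minimal sketch of tabular Q-learning with epsilon-greedy exploration (the chain environment and all names are hypothetical, for illustration only):

```python
import random

def q_learning(step, n_states, n_actions, episodes=300,
               alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Tabular Q-learning: learn Q from sampled transitions, no model."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy action selection.
            if rng.random() < eps:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda x: Q[s][x])
            s2, r, done = step(s, a)
            target = r if done else r + gamma * max(Q[s2])
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
    return Q

# Toy 4-state chain: action 1 moves right, action 0 moves left;
# reaching state 3 ends the episode with reward 1.
def step(s, a):
    s2 = min(s + 1, 3) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == 3 else 0.0), s2 == 3

Q = q_learning(step, n_states=4, n_actions=2)
```

After training, the greedy policy moves right from every non-terminal state, matching the optimal policy for this chain.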
Feb 18: MDPs: Policy Gradients
Notes: pdf
Readings: Sutton and Barto Chapter 13
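A minimal sketch of the REINFORCE update on a two-armed bandit with fixed rewards (a hypothetical toy setup, not from the readings): the score-function gradient nudges the softmax logits toward the rewarding arm.

```python
import math
import random

def reinforce_bandit(rewards=(1.0, 0.0), steps=500, alpha=0.1, seed=0):
    """REINFORCE on a 2-armed bandit: theta += alpha * r * grad log pi(a)."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    for _ in range(steps):
        z = [math.exp(t) for t in theta]
        pi = [v / sum(z) for v in z]
        a = 0 if rng.random() < pi[0] else 1
        r = rewards[a]
        # grad log pi(a) for a softmax policy: indicator(b == a) - pi(b)
        for b in range(2):
            theta[b] += alpha * r * ((1.0 if b == a else 0.0) - pi[b])
    z = [math.exp(t) for t in theta]
    return [v / sum(z) for v in z]

pi = reinforce_bandit()  # probability mass concentrates on arm 0
```

Since only arm 0 yields reward, each rewarding pull increases its logit, so the policy converges toward always pulling arm 0.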
Feb 20: Deep RL
Notes: pdf
Readings:
Human-level control through deep reinforcement learning. Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei Rusu, Joel Veness, Marc Bellemare, Alex Graves, Martin Riedmiller, Andreas Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis. Nature, 2015.
Rainbow: Combining improvements in deep reinforcement learning. Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver. AAAI, 2018.
Continuous control with deep reinforcement learning. Timothy Lillicrap, Jonathan Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra. ICLR, 2016.
Off-policy deep reinforcement learning without exploration. Scott Fujimoto, David Meger, Doina Precup. ICML, 2019.
Trust Region Policy Optimization. John Schulman, Sergey Levine, Philipp Moritz, Michael Jordan, Pieter Abbeel. ICML, 2015.
Proximal policy optimization algorithms. John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov. arXiv preprint arXiv:1707.06347, 2017.
Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine. ICML, 2018.
Feb 25: No class
Instead attend relevant sessions at the CSL Student Conference
Feb 27: Imitation Learning
Readings:
A reduction of imitation learning and structured prediction to no-regret online learning. Stephane Ross, Geoffrey Gordon, Drew Bagnell. AISTATS, 2011.
Mar 4: Inverse RL
Readings:
Generative adversarial imitation learning. Jonathan Ho and Stefano Ermon. NeurIPS, 2016.
Algorithms for inverse reinforcement learning. Andrew Ng and Stuart Russell. ICML, 2000.
Apprenticeship learning via inverse reinforcement learning. Pieter Abbeel and Andrew Ng. ICML, 2004.
Advanced Concepts and Case Studies
Mar 6: Perception
DUSt3R: Geometric 3D vision made easy. Shuzhe Wang, Vincent Leroy, Yohann Cabon, Boris Chidlovskii, Jerome Revaud. CVPR, 2024.
Presentation Questions:
Q1: What problem does the paper tackle and how?
Q2: What is the significance of the results? What does the paper do that was not possible before?
Q3: What part of the system matters most for the performance?
Q4: What impact can this paper have on robotics? What has already been done, and what can be done that hasn't already been done?
Q5: What are some limitations of the system and what would be some extensions?
Mar 11: Perception
CoTracker3: Simpler and Better Point Tracking by Pseudo-Labelling Real Videos. Nikita Karaev, Iurii Makarov, Jianyuan Wang, Natalia Neverova, Andrea Vedaldi, Christian Rupprecht. arXiv, 2024.
Presentation Questions:
Q1: What problem does the paper tackle, and at a high level, how did the earlier CoTracker (CoTracker: It is Better to Track Together. Nikita Karaev, Ignacio Rocco, Benjamin Graham, Natalia Neverova, Andrea Vedaldi, Christian Rupprecht. ECCV, 2024.) tackle this problem?
Q2: How does prior work LocoTrack (Local All-Pair Correspondence for Point Tracking. Seokju Cho, Jiahui Huang, Jisu Nam, Honggyu An, Seungryong Kim, Joon-Young Lee. ECCV, 2024.) tackle this problem?
Q3: What does CoTracker3 do? What part of the contributions matters most for the performance?
Q4: What impact can this paper have on robotics? What has already been done, and what can be done that hasn't already been done?
Q5: What are some limitations of the system and what would be some extensions?
Mar 13: Imitation Learning
Diffusion policy: Visuomotor policy learning via action diffusion. Cheng Chi, Zhenjia Xu, Siyuan Feng, Eric Cousineau, Yilun Du, Benjamin Burchfiel, Russ Tedrake, Shuran Song. The International Journal of Robotics Research, 2023.
Presentation Questions:
Q1: What problem does the paper tackle and what innovations does the paper propose?
Q2: What experiments does the paper conduct to test these innovations? What aspect of the innovation contributes most to the overall performance?
Q3: What is ACT (Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware. Tony Zhao, Vikash Kumar, Sergey Levine, Chelsea Finn. Robotics: Science and Systems, 2023.) and how does Diffusion Policy compare to ACT? For what problems will you use Diffusion Policy and for what problems will you use ACT?
Q4: What is VQ-BeT (Behavior generation with latent actions. Seungjae Lee, Yibin Wang, Haritheja Etukuru, H Kim, Nur Shafiullah, Lerrel Pinto. ICML, 2024.) and how does Diffusion Policy compare to VQ-BeT? For what problems will you use Diffusion Policy and for what problems will you use VQ-BeT?
Q5: What are some limitations of Diffusion Policy? How can they be mitigated?
Mar 18: Spring Break (no class)
Enjoy
Mar 20: Spring Break (no class)
Enjoy
Mar 25: Imitation Learning
3D diffuser actor: Policy diffusion with 3D scene representations. Tsung-Wei Ke, Nikolaos Gkanatsios, Katerina Fragkiadaki. Conference on Robot Learning (CoRL), 2024.
Presentation Questions:
Q1: Describe how 3D Diffuser Actor works (i.e., the proposed architecture, output representation, etc.).
Q2: How do RVT / RVT2 tackle this problem, and how does 3D Diffuser Actor compare to RVT / RVT2?
Q3: How well does 3D Diffuser Actor work? What aspect of 3D Diffuser Actor, as discussed in the paper or in your opinion, contributes most to its overall performance?
Q4: What additional experiments would you have liked to see and why? What are some limitations of 3D Diffuser Actor and how could they be mitigated?
Q5: For real-world imitation learning, what tasks and settings will 3D Diffuser Actor be particularly good for? What tasks and settings will be particularly challenging for 3D Diffuser Actor?
Mar 27: Scaling up Imitation Learning
Universal manipulation interface: In-the-wild robot teaching without in-the-wild robots. Cheng Chi, Zhenjia Xu, Chuer Pan, Eric Cousineau, Benjamin Burchfiel, Siyuan Feng, Russ Tedrake, Shuran Song. Robotics: Science and Systems, 2024.
Presentation Questions:
Q1: Describe the motivation and design for the UMI device.
Q2: Describe the considerations made while designing policies for use with the UMI device.
Q3: Discuss the experiments that validate the different design choices made in UMI.
Q4: Describe the ALOHA setup. How does UMI compare to ALOHA? For what tasks will you use UMI and for what problems will you use ALOHA?
Q5: What are some limitations of UMI? What are some tasks for which UMI will not be appropriate?
Apr 1: Scaling up Imitation Learning
RDT-1B: A diffusion foundation model for bimanual manipulation. Songming Liu, Lingxuan Wu, Bangguo Li, Hengkai Tan, Huayu Chen, Zhengyi Wang, Ke Xu, Hang Su, Jun Zhu. ICLR, 2025.
Apr 3: Sim2Real for Locomotion
RMA: Rapid motor adaptation for legged robots. Ashish Kumar, Zipeng Fu, Deepak Pathak, Jitendra Malik. Robotics: Science and Systems, 2021.
Apr 8: Humanoids
OmniH2O: Universal and Dexterous Human-to-Humanoid Whole-Body Teleoperation and Learning. Tairan He, Zhengyi Luo, Xialin He, Wenli Xiao, Chong Zhang, Weinan Zhang, Kris Kitani, Changliu Liu, Guanya Shi. Conference on Robot Learning, 2024.
Apr 10: Sim2Real for Dexterous Manipulation
Twisting lids off with two hands. Toru Lin, Zhao-Heng Yin, Haozhi Qi, Pieter Abbeel, Jitendra Malik. CoRL, 2024.
Apr 15: Navigation
Navigating to objects in the real world. Theophile Gervet, Soumith Chintala, Dhruv Batra, Jitendra Malik, Devendra Chaplot. Science Robotics, 2023.
Apr 17: Robotics and Vision and Language Models
Does Spatial Cognition Emerge in Frontier Models?. Santhosh Ramakrishnan, Erik Wijmans, Philipp Kraehenbuehl, Vladlen Koltun. ICLR, 2025.
Apr 22: What if the DARPA Robotics Challenge were held today?
The DARPA Robotics Challenge Finals: Results and Perspectives. Eric Krotkov, Douglas Hackett, Larry Jackel, Michael Perschbacher, James Pippine, Jesse Strauss, Gill Pratt, Christopher Orlowski. Journal of Field Robotics, 2017.