Schedule (tentative)

Here is a tentative syllabus for the course. The paper list will be finalized by the third week of class. There may be minor changes to the schedule depending on the pace of the class. In the meantime, you can see last year's paper list to get a sense of what this course covers.

Recommended Textbooks

  1. RL: Reinforcement Learning: An Introduction. Richard S. Sutton and Andrew G. Barto. Second Edition, MIT Press, Cambridge, MA, 2018. Available online.

  2. CV: Computer Vision: Algorithms and Applications, 2nd Edition. Richard Szeliski, Microsoft Research. Available online.

  3. RKH: Robotic Systems. Kris Hauser. Draft available online.

  4. MR: Modern Robotics: Mechanics, Planning, and Control. Kevin M. Lynch and Frank C. Park. Cambridge University Press. Available online.

| Date | Topic | Slides | Readings | Assignments |
|---|---|---|---|---|
| Aug 23 | Introduction | key, pdf | | |
| Part I: Review | | | | |
| Aug 25 | Computer Vision Review | geom.pdf, geom.pptx | CV Chapters 7, 8, 11 | |
| Aug 30 | Computer Vision Review | cnn.pdf, cnn.pptx | CV Chapters 5, 6 | Assignment 1 Released (Aug 30, 2022) |
| Sep 1 | Robotics Review | pdf | RKH Chapters 5, 6, 10, 17 | |
| Sep 6 | Robotics Review | pdf | RKH Chapters 5, 6, 10, 17 | |
| Sep 8 | MDP Review | slides, notes | RL Chapters 3, 4 | |
| Sep 13 | MDP Review [Dynamic Programming] | notes | RL Chapters 3, 4 | Assignment 1 Due (Sep 13, 2022 11:59 PM) |
| Sep 15 | MDP Review [Monte Carlo, TD, Model Free Control] | notes, slides | RL Chapters 5, 6; DQN | Assignment 2 Released (Sep 15, 2022) |
| Sep 20 | Q-Learning | key, pdf | Rainbow DQN; DDPG | |
| Sep 22 | Offline Q-Learning | key, pdf | BCQ | Projects |
| Sep 27 | Policy Gradients | notes, pdf, key | PPO | |
| Part II: Alternatives to Solving Unknown MDPs | | | | |
| Sep 29 | Model Building | key, pdf | ME-TRPO; PETS | Assignment 2 Due (Sep 29, 2022 11:59 PM) |
| Oct 4 | Imitation Learning | key, pdf | DAgger | Assignment 3 Released (Oct 4, 2022) |
| Oct 6 | Inverse RL | key, pdf | Inverse RL; GAIL | Project Proposal Due (Oct 6, 2022 11:59 PM) |
| | | | | Quiz 1 (week of Oct 10, 2022) |
| Oct 11 | Self-supervision | key, pdf | Self-supervised Grasping | |
| Oct 13 | Exploration | | Never Give Up | |
| Oct 18 | Sim to Real | key, pdf | Rapid Motor Adaptation | Assignment 3 Due (Oct 18, 2022 11:59 PM) |
| Oct 20 | Social Learning | pdf, key | Navigation Subroutines; Grasping with Hand Pose Priors | |
| Oct 25 | Social Learning | | WHIRL | |
| Oct 27 | Hierarchies | key, pdf | Feudal Networks for HRL | |
| Part III: Case Studies | | | | |
| Nov 1 | Navigation | key, pdf1, pdf2 | Neural Topological SLAM | |
| Nov 3 | Manipulation | key, pdf | DexNet 2.0; kPAM | Progress Report Due (Nov 3, 2022 11:59 PM) |
| Nov 8 | Election Day (No class) | | | |
| Nov 10 | Manipulation | | Re-grasping using Touch | |
| | | | | Quiz 2 (week of Nov 14, 2022) |
| Nov 15 | Best Practices in RL | | Deep RL that Matters | |
| Part IV: Perspectives | | | | |
| Nov 17 | Lessons from Cognitive Science and Psychology | | 6 Lessons | |
| Nov 22 | Fall Break (No class) | | | |
| Nov 24 | Fall Break (No class) | | | |
| Nov 29 | Data vs Algorithms | | The Bitter Lesson + A Better Lesson | |
| Dec 1 | TBD | | | |
| | | | | Projects Due (Dec 5, 2022 11:59 PM) |
| Dec 6 | Project Presentations | | | |

PILCO: A model-based and data-efficient approach to policy search
Marc Deisenroth and Carl Rasmussen
ICML 2011

Deep visual foresight for planning robot motion
Chelsea Finn and Sergey Levine
ICRA 2017

Deep reinforcement learning in a handful of trials using probabilistic dynamics models
Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine
NeurIPS 2018

Benchmarking model-based reinforcement learning
Eric Langlois, Shunshi Zhang, Guodong Zhang, Pieter Abbeel, Jimmy Ba
arXiv preprint arXiv:1907.02057 2019

A reduction of imitation learning and structured prediction to no-regret online learning
Stephane Ross, Geoffrey Gordon, Drew Bagnell
AISTATS 2011

End-to-end training of deep visuomotor policies
Sergey Levine, Chelsea Finn, Trevor Darrell, Pieter Abbeel
JMLR 2016

Maximum entropy inverse reinforcement learning
Brian Ziebart, Andrew Maas, J Bagnell, Anind Dey
AAAI 2008

Apprenticeship learning via inverse reinforcement learning
Pieter Abbeel and Andrew Ng
ICML 2004

The development of embodied cognition: Six lessons from babies
Linda Smith and Michael Gasser
Artificial Life 2005

Building machines that learn and think like people
Brenden Lake, Tomer Ullman, Joshua Tenenbaum, Samuel Gershman
Behavioral and Brain Sciences 2017

Playing Atari with deep reinforcement learning
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller
arXiv preprint arXiv:1312.5602 2013

Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
Yuhuai Wu, Elman Mansimov, Roger Grosse, Shun Liao, Jimmy Ba
NeurIPS 2017

Continuous control with deep reinforcement learning
Timothy Lillicrap, Jonathan Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra
ICLR 2016

Off-policy deep reinforcement learning without exploration
Scott Fujimoto, David Meger, Doina Precup
ICML 2019

Learning agile and dynamic motor skills for legged robots
Jemin Hwangbo, Joonho Lee, Alexey Dosovitskiy, Dario Bellicoso, Vassilios Tsounis, Vladlen Koltun, Marco Hutter
Science Robotics 2019

Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours
Lerrel Pinto and Abhinav Gupta
ICRA 2016

Learning synergies between pushing and grasping with self-supervised deep reinforcement learning
Andy Zeng, Shuran Song, Stefan Welker, Johnny Lee, Alberto Rodriguez, Thomas Funkhouser
IROS 2018

Planning to Explore via Self-Supervised World Models
Ramanan Sekar, Oleh Rybkin, Kostas Daniilidis, Pieter Abbeel, Danijar Hafner, Deepak Pathak
ICML 2020

Curiosity-driven exploration by self-supervised prediction
Deepak Pathak, Pulkit Agrawal, Alexei Efros, Trevor Darrell
ICML 2017

Grasping in the wild: Learning 6dof closed-loop grasping from low-cost demonstrations
Shuran Song, Andy Zeng, Johnny Lee, Thomas Funkhouser
RAL 2020

Learning Navigation Subroutines from Egocentric Videos
Ashish Kumar, Saurabh Gupta, Jitendra Malik
CoRL 2019

Differentiable MPC for end-to-end planning and control
Brandon Amos, Ivan Jimenez, Jacob Sacks, Byron Boots, J Kolter
NeurIPS 2018

Differentiable Spatial Planning using Transformers
Anonymous
OpenReview 2020

Neural Topological SLAM for Visual Navigation
Devendra Chaplot, Ruslan Salakhutdinov, Abhinav Gupta, Saurabh Gupta
CVPR 2020

More than a feeling: Learning to grasp and regrasp using vision and touch
Roberto Calandra, Andrew Owens, Dinesh Jayaraman, Justin Lin, Wenzhen Yuan, Jitendra Malik, Edward Adelson, Sergey Levine
RAL 2018

Dex-Net 2.0: Deep learning to plan robust grasps with synthetic point clouds and analytic grasp metrics
Jeffrey Mahler, Jacky Liang, Sherdil Niyaz, Michael Laskey, Richard Doan, Xinyu Liu, Juan Ojea, Ken Goldberg
RSS 2017

Semantic Visual Navigation by Watching YouTube Videos
Matthew Chang, Arjun Gupta, Saurabh Gupta
NeurIPS 2020

Algorithms for inverse reinforcement learning
Andrew Ng and Stuart Russell
ICML 2000

Feudal reinforcement learning
Peter Dayan and Geoffrey Hinton
NeurIPS 1993

Unsupervised perceptual rewards for imitation learning
Pierre Sermanet, Kelvin Xu, Sergey Levine
Robotics: Science and Systems 2017

Agile autonomous driving using end-to-end deep imitation learning
Yunpeng Pan, Ching-An Cheng, Kamil Saigol, Keuntaek Lee, Xinyan Yan, Evangelos Theodorou, Byron Boots
Robotics: Science and Systems 2018

Cognitive mapping and planning for visual navigation
Saurabh Gupta, Varun Tolani, James Davidson, Sergey Levine, Rahul Sukthankar, Jitendra Malik
International Journal of Computer Vision 2019

Learning to Compose Hierarchical Object-Centric Controllers for Robotic Manipulation
Mohit Sharma, Jacky Liang, Jialiang Zhao, Alex LaGrassa, Oliver Kroemer
CoRL 2020

The Bitter Lesson
Rich Sutton
Blogpost 2019

A Better Lesson
Rodney Brooks
Blogpost 2019

RMA: Rapid motor adaptation for legged robots
Ashish Kumar, Zipeng Fu, Deepak Pathak, Jitendra Malik
Robotics: Science and Systems 2021

Proximal policy optimization algorithms
John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov
arXiv preprint arXiv:1707.06347 2017

Rainbow: Combining improvements in deep reinforcement learning
Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver
AAAI 2018

Model-ensemble trust-region policy optimization
Thanard Kurutach, Ignasi Clavera, Yan Duan, Aviv Tamar, Pieter Abbeel
ICLR 2018

Generative adversarial imitation learning
Jonathan Ho and Stefano Ermon
NeurIPS 2016

Implicit Behavioral Cloning
Pete Florence, Corey Lynch, Andy Zeng, Oscar Ramirez, Ayzaan Wahid, Laura Downs, Adrian Wong, Johnny Lee, Igor Mordatch, Jonathan Tompson
CoRL 2021

Differentiable Spatial Planning using Transformers
Devendra Chaplot, Deepak Pathak, Jitendra Malik
ICML 2021

Feudal networks for hierarchical reinforcement learning
Alexander Vezhnevets, Simon Osindero, Tom Schaul, Nicolas Heess, Max Jaderberg, David Silver, Koray Kavukcuoglu
ICML 2017

OptNet: Differentiable optimization as a layer in neural networks
Brandon Amos and J Kolter
ICML 2017

Learning Generalizable Robotic Reward Functions from “In-The-Wild” Human Videos
Annie Chen, Suraj Nair, Chelsea Finn
RSS 2021

ReLMoGen: Leveraging motion generation in reinforcement learning for mobile manipulation
Fei Xia, Chengshu Li, Roberto Martín-Martín, Or Litany, Alexander Toshev, Silvio Savarese
ICRA 2021

Never give up: Learning directed exploration strategies
Adrià Badia, Pablo Sprechmann, Alex Vitvitskyi, Daniel Guo, Bilal Piot, Steven Kapturowski, Olivier Tieleman, Martín Arjovsky, Alexander Pritzel, Andrew Bolt, Charles Blundell
ICLR 2020

Human-to-Robot Imitation in the Wild
Shikhar Bahl, Abhinav Gupta, Deepak Pathak
RSS 2022

DexVIP: Learning dexterous grasping with human hand pose priors from video
Priyanka Mandikal and Kristen Grauman
CoRL 2021

kPAM: Keypoint affordances for category-level robotic manipulation
Lucas Manuelli, Wei Gao, Peter Florence, Russ Tedrake
ISRR 2019

Deep reinforcement learning that matters
Peter Henderson, Riashat Islam, Philip Bachman, Joelle Pineau, Doina Precup, David Meger
AAAI 2018