Schedule (tentative)
Here is a tentative syllabus for the course. Readings will be filled in over time.
Recommended Textbooks
RL: Reinforcement Learning: An Introduction. Richard S. Sutton and Andrew G. Barto. Second Edition, MIT Press, Cambridge, MA, 2018. Available online.
CV: Computer Vision: Algorithms and Applications. Richard Szeliski, Microsoft Research. Available online.
RKH: Robotic Systems. Kris Hauser. Draft available online.
MR: Modern Robotics: Mechanics, Planning, and Control. Kevin M. Lynch and Frank C. Park. Cambridge University Press. Available online.
Date | Topic | Slides | Readings | Assignments |
Aug 25 | Introduction | pdf | | |
Part I | Review | | | |
Aug 27 | Computer Vision Review | pdf, pptx | CV Chapters 4, 7, 14 | |
Sep 1 | Computer Vision Review | pdf, pptx | | Assignment 1 |
Sep 3 | Robotics Review | pdf | RKH Chapters 5, 6, 10, 17 | |
Sep 8 | MDP Review [Dynamic Programming] | pdf | RL Chapters 3, 4 | |
Sep 10 | MDP Review [Monte Carlo, TD] | David Silver's Slides | RL Chapters 5, 6 | |
Sep 15 | MDP Review [Model Free Control] | David Silver's Slides | RL Chapters 5, 6. DQN | Assignment 1 Due Sep 15, 2020 at 11:59:59 PM |
Sep 17 | Q-Learning | pdf, key | DDPG BCQ | |
Sep 22 | Policy Gradients | David Silver's Slides | ACKTR | Assignment 2 |
Part II | Alternatives to Solving Unknown MDPs | | | |
Sep 24 | Model Building | pdf, key | PILCO PETS | Projects |
Sep 29 | Model Building | pdf, key | Deep Visual Foresight Benchmarking MBRL | |
Oct 1 | Imitation Learning | pdf, key | DAgger E2E Visuomotor Policies | |
Oct 6 | Inverse RL | pdf, key | Inverse RL Apprenticeship Learning | Assignment 2 Due Oct 6, 2020 at 11:59:59 PM |
Oct 8 | Inverse RL | pdf, key | MaxEnt IRL | Project Proposals Due Oct 8, 2020 at 11:59:59 PM Informal Early Feedback |
Oct 13 | Self-supervision | pdf, key | Self-supervised Grasping Self-supervised Pushing and Grasping | Assignment 3 |
Oct 15 | Exploration | notes | Curiosity Planning to Explore | |
Oct 20 | Sim to Real | pdf, key | ANYmal | |
Oct 22 | Hierarchies | pdf, key | Feudal RL | |
Oct 27 | Social Learning | pdf, key | Grasping in the wild Navigation Subroutines | Assignment 3 Due Oct 28, 2020 at 11:59:59 PM |
Oct 29 | Social Learning | pdf, key pdf, key | Value Learning from Videos Perceptual Rewards | |
Nov 3 | Election Day (No class) | | | |
Nov 5 | Differentiable Planners | pdf, key | Differentiable MPC | Project Progress Report Due Nov 5, 2020 at 11:59:59 PM |
Part III | Case Studies | | | |
Nov 10 | Navigation | pdf, key | Agile Autonomous Driving | |
Nov 12 | Navigation | pdf, key | CMP Neural Topological SLAM | |
Nov 17 | Manipulation | pdf, key | Dex-Net 2.0 | |
Nov 19 | Manipulation | pdf, key | Re-grasping using Touch Hierarchical Object-Centric Controllers | |
Nov 24 | Fall Break (No class) | | | |
Nov 26 | Fall Break (No class) | | | |
Part IV | Perspectives | | | |
Dec 1 | Lessons from Cognitive Science and Psychology | pdf, pptx | 6 Lessons Lake et al. 2017 | |
Dec 3 | Data vs Algorithms | | The Bitter Lesson + A Better Lesson | ICES |
| | | | Project Final Report Due Dec 6, 2020 at 11:59:59 PM |
Dec 8 | Project Presentations | | | |
PILCO: A model-based and data-efficient approach to policy search
Marc Deisenroth and Carl Rasmussen
ICML 2011
Deep visual foresight for planning robot motion
Chelsea Finn and Sergey Levine
ICRA 2017
Deep reinforcement learning in a handful of trials using probabilistic dynamics models
Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine
NeurIPS 2018
Benchmarking model-based reinforcement learning
Eric Langlois, Shunshi Zhang, Guodong Zhang, Pieter Abbeel, Jimmy Ba
arXiv preprint arXiv:1907.02057 2019
A reduction of imitation learning and structured prediction to no-regret online learning
Stephane Ross, Geoffrey Gordon, Drew Bagnell
AISTATS 2011
End-to-end training of deep visuomotor policies
Sergey Levine, Chelsea Finn, Trevor Darrell, Pieter Abbeel
JMLR 2016
Maximum entropy inverse reinforcement learning
Brian Ziebart, Andrew Maas, J Bagnell, Anind Dey
AAAI 2008
Apprenticeship learning via inverse reinforcement learning
Pieter Abbeel and Andrew Ng
ICML 2004
The development of embodied cognition: Six lessons from babies
Linda Smith and Michael Gasser
Artificial Life 2005
Building machines that learn and think like people
Brenden Lake, Tomer Ullman, Joshua Tenenbaum, Samuel Gershman
Behavioral and Brain Sciences 2017
Playing Atari with deep reinforcement learning
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller
arXiv preprint arXiv:1312.5602 2013
Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
Yuhuai Wu, Elman Mansimov, Roger Grosse, Shun Liao, Jimmy Ba
NeurIPS 2017
Continuous control with deep reinforcement learning
Timothy Lillicrap, Jonathan Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra
ICLR 2016
Off-policy deep reinforcement learning without exploration
Scott Fujimoto, David Meger, Doina Precup
ICML 2019
Learning agile and dynamic motor skills for legged robots
Jemin Hwangbo, Joonho Lee, Alexey Dosovitskiy, Dario Bellicoso, Vassilios Tsounis, Vladlen Koltun, Marco Hutter
Science Robotics 2019
Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours
Lerrel Pinto and Abhinav Gupta
ICRA 2016
Learning synergies between pushing and grasping with self-supervised deep reinforcement learning
Andy Zeng, Shuran Song, Stefan Welker, Johnny Lee, Alberto Rodriguez, Thomas Funkhouser
IROS 2018
Planning to Explore via Self-Supervised World Models
Ramanan Sekar, Oleh Rybkin, Kostas Daniilidis, Pieter Abbeel, Danijar Hafner, Deepak Pathak
ICML 2020
Curiosity-driven exploration by self-supervised prediction
Deepak Pathak, Pulkit Agrawal, Alexei Efros, Trevor Darrell
ICML 2017
Grasping in the wild: Learning 6DoF closed-loop grasping from low-cost demonstrations
Shuran Song, Andy Zeng, Johnny Lee, Thomas Funkhouser
RAL 2020
Learning Navigation Subroutines from Egocentric Videos
Ashish Kumar, Saurabh Gupta, Jitendra Malik
CoRL 2019
Differentiable MPC for end-to-end planning and control
Brandon Amos, Ivan Jimenez, Jacob Sacks, Byron Boots, J Kolter
NeurIPS 2018
Differentiable Spatial Planning using Transformers
Anonymous
OpenReview 2020
Neural Topological SLAM for Visual Navigation
Devendra Chaplot, Ruslan Salakhutdinov, Abhinav Gupta, Saurabh Gupta
CVPR 2020
More than a feeling: Learning to grasp and regrasp using vision and touch
Roberto Calandra, Andrew Owens, Dinesh Jayaraman, Justin Lin, Wenzhen Yuan, Jitendra Malik, Edward Adelson, Sergey Levine
RAL 2018
Dex-Net 2.0: Deep learning to plan robust grasps with synthetic point clouds and analytic grasp metrics
Jeffrey Mahler, Jacky Liang, Sherdil Niyaz, Michael Laskey, Richard Doan, Xinyu Liu, Juan Ojea, Ken Goldberg
RSS 2017
Semantic Visual Navigation by Watching YouTube Videos
Matthew Chang, Arjun Gupta, Saurabh Gupta
NeurIPS 2020
Algorithms for inverse reinforcement learning
Andrew Ng and Stuart Russell
ICML 2000
Feudal reinforcement learning
Peter Dayan and Geoffrey Hinton
NeurIPS 1993
Unsupervised perceptual rewards for imitation learning
Pierre Sermanet, Kelvin Xu, Sergey Levine
Robotics: Science and Systems 2017
Agile autonomous driving using end-to-end deep imitation learning
Yunpeng Pan, Ching-An Cheng, Kamil Saigol, Keuntaek Lee, Xinyan Yan, Evangelos Theodorou, Byron Boots
Robotics: Science and Systems 2018
Cognitive mapping and planning for visual navigation
Saurabh Gupta, Varun Tolani, James Davidson, Sergey Levine, Rahul Sukthankar, Jitendra Malik
International Journal of Computer Vision 2019
Learning to Compose Hierarchical Object-Centric Controllers for Robotic Manipulation
Mohit Sharma, Jacky Liang, Jialiang Zhao, Alex LaGrassa, Oliver Kroemer
CoRL 2020
The Bitter Lesson
Rich Sutton
Blogpost 2019
A Better Lesson
Rodney Brooks
Blogpost 2019