Below are some highlights from our conversation as well as links to the papers, people, and groups referenced in the episode.
Some highlights from our conversation
“There are also some really obvious limitations. Like, we told it what the reward function is, and we gave it a very nicely shaped reward function saying ‘you’ve gotten a little bit closer,’ and that’s something that you don’t get in the real world. The real world doesn’t tell you how well you’re doing at a certain task. So that was one obvious limitation. And another thing was that a lot of the tasks we would have trial and error where the robot would try the task, and then we would put the robot back into the previous scene, and then it would try again, and oftentimes I would be kind of resetting the scene after every trial. And that’s also something that’s not really scalable if you want robots to leverage large amounts of data. And then the last thing was that the robot learned a cool skill, but it learned something very specific to the objects that it was seeing in the scene that it was in. And ultimately, if we wanna put robots into the world, we can’t have them just work for one scene and one object.”
“The train distribution and the test distribution are not always the same. The real world changes over time. This is true in robotic settings because you train it in a lab maybe and then want to deploy it in the world or it visits a new part of the world. But it’s also true in a whole range of other machine learning applications as well. And it’s kind of a huge problem.”
“One of the things that we found, even in multi-task learning settings, is how different two tasks or two distributions are not only depends on the data itself, but could also depend on the model and how it learned. […] If it happened to learn something for one task that actually worked well for another task, then they’re close. And if it happened to learn something different they’re far apart. […] Thinking about these distances between tasks or distributions, we can’t decouple it from the model and what’s being learned.”
“I’m a big believer in trying to use as much real data as we can. We’ve seen a lot of success using a lot of real data in the rest of machine learning. So I think we should try to do the same in robotics if we can.”
“I don’t think language models will help us solve robotics. Like, we don’t talk about how to tie your shoe in general conversation; there’s no Wikipedia article that details low-level motor skills. […] I think that actually the low-level motor control is a huge bottleneck and I don’t think language is gonna help.”
Referenced in this podcast
- ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback
- WILDS: A Benchmark of in-the-Wild Distribution Shifts, and the more recent Wild-Time: A Benchmark of in-the-Wild Distribution Shift over Time
- Rethinking Sim2Real: Lower Fidelity Simulation Leads to Higher Sim2Real Transfer in Navigation
- OpenAI’s Solving Rubik’s Cube with a Robot Hand (and the Shadow Hand)
- Do As I Can, Not As I Say: Grounding Language in Robotic Affordances (SayCan)
- Chelsea Finn’s most highly cited papers: End-to-End Training of Deep Visuomotor Policies and MAML
Thanks to Tessa Hall for editing the podcast.