RSS · Spotify · Apple Podcasts · Pocket Casts
Some highlights from our conversation
“None of my research is really [about] state-of-the-art. […] The thing that is important to me is that whatever method I come up with, it can do something that prior methods can’t do.”
“If you’re training your agent in a 5-by-5 grid, and then you give it a 10-by-10 grid, it’s never going to generalize. But what if you train the agent on tables in your 5-by-5 grid, right? Like just local tables. Then if I give you a 10-by-10 grid, and I have more tables, you can generalize. So it seems to me like modularity really allows you to generalize in a sense […] even though your global input is completely out-of-distribution, if you process these local modules one by one, it’s much more in-distribution.”
“It feels like we’re not solving the generic robotics problem. You basically train this agent using millions of CPU hours to reorient a single cube in its hand. If I give you a different object, you can’t reorient it. If I put the arm in a different configuration, you can’t reorient it.”
Referenced in this podcast
- Several of Yilun’s papers:
- Learning to See by Looking at Noise by Baradad et al. 2021
- Implicit Neural Representations with Periodic Activation Functions (SIREN) by Sitzmann et al. 2020
Thanks to Tessa Hall for editing the podcast.