Sergey Levine, UC Berkeley: On the bottlenecks to generalization, why simulation is doomed to succeed, and picking good research problems

Last updated 15 Jun 2026

Kanjun Qiu

CEO, Co-founder

Josh Albrecht

CTO, Co-founder

Sergey Levine, an assistant professor of EECS at UC Berkeley, is one of the pioneers of modern deep reinforcement learning. His research focuses on developing general-purpose algorithms for autonomous agents to learn how to solve any task. In this episode, we talked about the evolution of deep reinforcement learning, how previous robotics approaches were replaced, and why offline RL is significant for future generalization.

Below are some highlights from our conversation as well as links to the papers, people, and groups referenced in the episode.

Some highlights from our conversation

“I do think that, in science, it is a really good idea to sometimes see how extreme a design can still work because you learn a lot from doing that. This is, by the way, something, I get a lot of comments on this. You know, I’ll be talking to people and they’ll be like, ‘Well, we know how to do, like, robotic grasping, and we know how to do inverse kinematics, and we know how to do this and this, so why don’t you use those parts?’ And it’s, yeah, you could, but if you want to understand the utility, the value of some particular new design, it kind of makes sense to really zoom in on that and really isolate it and really just understand its value instead of trying to put in all these crutches to compensate for all the parts where we might have better existing kind of ideas.”

“The thing is, robots, if they are autonomous robots, they should be collecting data way more cheaply in a way larger scale than data we harvest from humans. For this reason, I actually think that robotics in the long run may actually be at a huge advantage in terms of its ability to collect data. We’re just not seeing this huge advantage now in robotic manipulation because we’re stuck at the smaller scale, more due to economics, rather than, I would say, science.”

“We want simplicity because simplicity makes it easy to make things work on a large scale. You know, if your method is simple, there are essentially fewer ways that it could go wrong. I don’t think the problem with clever prompting is that it’s too simple or primitive. I think the problem might actually be that it might be too complex and that developing a good, effective reinforcement learning or planning method might actually be a simpler, more general solution.”

“I think, in reality, for any practical deployment of these kinds of ideas at scale, it would actually be many robots all collecting data, sharing it, and exchanging their brains over a network and all that. That’s the more scalable way to think about on the learning side. But, I do think that also on the physical side, there’s a lot of practical challenges, and just, you know, what kind of methods should we even have if we want the robot in your home to practice cleaning your dishes for three days. I mean, if you just run a reinforcement learning algorithm for a robot in your home, probably, the first thing it’ll do is wave its arm around, break your window, then break all your dishes, then break itself, and then spend the remaining time it has, just sitting there at the broken corner. So there’s a lot of practicalities in this.”

Referenced in this podcast

Thanks to Tessa Hall for editing the podcast.