I recently started using the Apple Vision Pro (kind of late, right?) and was amazed at how well they nailed the eye-tracking. The AVP’s eye-tracking calibration is performed against black, gray, and white backgrounds, and I happen to know from my previous life (i.e., during my PhD) that changes in pupil size (including those driven by lighting) degrade eye-tracking accuracy. Calibrating across background brightness levels is presumably how Apple compensates for exactly that effect.

Eye-tracking technology has matured, and I’m witnessing RL start to mature as well. I missed the eye-tracking train, but I won’t miss the RL one. I’m eager to apply RL to robot dexterity!

The Tragedy and Triumph of RL

Joseph Suarez’s post “The Tragedy of Reinforcement Learning” articulates both RL’s current challenges and the path forward. We’re at an inflection point where the infrastructure and libraries are finally mature enough to enable serious research without massive compute budgets.

The parallel to my eye-tracking experience is clear: there’s a period where technology is “almost there” but not quite practical, followed by a tipping point where it becomes robust and widely adopted. RL is reaching that tipping point now, and tools like PufferLib are making it accessible.
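
To give a sense of how lightweight modern RL code has become, here is a minimal REINFORCE training loop on CartPole using Gymnasium and PyTorch. To be clear, this is a generic sketch, not PufferLib’s actual API; PufferLib’s pitch, as I understand it, is fast vectorized simulation and clean environment wrappers layered on top of loops like this one.

```python
# Minimal policy-gradient (REINFORCE) loop on CartPole with Gymnasium
# and PyTorch. A generic sketch of how compact modern RL code can be;
# it is not PufferLib-specific.
import gymnasium as gym
import torch
import torch.nn as nn

env = gym.make("CartPole-v1")
policy = nn.Sequential(
    nn.Linear(env.observation_space.shape[0], 64),
    nn.Tanh(),
    nn.Linear(64, env.action_space.n),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)

for episode in range(500):
    obs, _ = env.reset()
    log_probs, rewards, done = [], [], False
    while not done:
        dist = torch.distributions.Categorical(
            logits=policy(torch.as_tensor(obs, dtype=torch.float32))
        )
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        obs, reward, terminated, truncated, _ = env.step(action.item())
        rewards.append(reward)
        done = terminated or truncated

    # Discounted returns, computed backward over the episode.
    returns, g = [], 0.0
    for r in reversed(rewards):
        g = r + 0.99 * g
        returns.append(g)
    returns = torch.as_tensor(list(reversed(returns)))
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)

    # REINFORCE: raise the log-probability of actions, weighted by return.
    loss = -(torch.stack(log_probs) * returns).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```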

I’m particularly excited about applying these techniques to robot dexterity, where the combination of mature RL methods and increasingly available robot hardware creates new possibilities.