Reflections from Actuate 2024
Key Questions
Two major themes emerged for me:
- Should robots use a single unified perception-to-action model, or a set of specialized models collaborating? And what does each choice mean for interpretability?
- How much can we change the environment that robots operate in?
Interpretability Trade-offs
Specialized models can provide good interpretability: with a dedicated perception module, you can pinpoint that “rain messed up object detection” rather than guessing where in the system a failure occurred.
But unified vision-language-action models, like Wayve's LINGO-1, were also very impressive: they can explain their decisions in natural language. That is a different kind of interpretability, an explanation generated by the model rather than an intermediate output you inspect.
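To make the first kind concrete, here is a minimal sketch of module-level interpretability (hypothetical names like `Detection` and `diagnose_perception`, not any real framework's API): because perception is its own stage, its intermediate output, such as per-detection confidence, can be monitored directly, so a rain-degraded frame is attributable to perception specifically.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str
    confidence: float  # 0.0-1.0, emitted by the perception module

def diagnose_perception(detections: list[Detection], min_confidence: float = 0.5) -> str | None:
    """Flag a perception-stage failure when detection confidence collapses.

    Because perception is a separate module, its output is inspectable:
    a sudden confidence drop (e.g. in heavy rain) can be localized to
    this stage rather than blamed on the system as a whole.
    """
    if not detections:
        return "perception produced no detections"
    weak = [d for d in detections if d.confidence < min_confidence]
    if len(weak) > len(detections) / 2:
        return f"{len(weak)}/{len(detections)} detections below confidence {min_confidence}"
    return None  # perception looks healthy

# Example: a rain-degraded frame where most detections are low-confidence.
frame = [Detection("car", 0.31), Detection("pedestrian", 0.22), Detection("car", 0.88)]
print(diagnose_perception(frame))  # -> "2/3 detections below confidence 0.5"
```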
The Modularity Question
This connects to fundamental questions in AI architecture:
- Modular: Better interpretability, easier to debug specific components, can swap out parts
- End-to-end: Better optimization for the overall task, natural language explanations, simpler deployment
We might need both approaches for different use cases; the sketch below contrasts the two.
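As a minimal sketch of the contrast (hypothetical interfaces such as `Policy`, `ModularPolicy`, and `EndToEndPolicy`, not any specific robotics stack): both architectures can satisfy the same action interface, but interpretability surfaces in different places, inspectable intermediate stages in the modular case, a generated rationale in the end-to-end case.

```python
from typing import Protocol

class Policy(Protocol):
    def act(self, observation: dict) -> tuple[str, str]:
        """Return (action, explanation)."""

class ModularPolicy:
    """Perception -> planning pipeline; each stage is inspectable and swappable."""
    def act(self, observation: dict) -> tuple[str, str]:
        objects = self.perceive(observation)  # debuggable in isolation
        plan = self.plan(objects)             # swappable without retraining perception
        return plan, f"perceived {objects}, chose plan '{plan}'"

    def perceive(self, observation: dict) -> list[str]:
        return observation.get("objects", [])

    def plan(self, objects: list[str]) -> str:
        return "stop" if "pedestrian" in objects else "proceed"

class EndToEndPolicy:
    """Single model optimized for the whole task; explains itself in language."""
    def act(self, observation: dict) -> tuple[str, str]:
        # Stand-in for a unified vision-language-action model that emits an
        # action and a natural-language rationale in one forward pass.
        if "pedestrian" in observation.get("objects", []):
            return "stop", "Stopping because a pedestrian is crossing ahead."
        return "proceed", "Proceeding; the lane ahead is clear."

def run(policy: Policy, observation: dict) -> None:
    action, explanation = policy.act(observation)
    print(f"{action}: {explanation}")

obs = {"objects": ["car", "pedestrian"]}
run(ModularPolicy(), obs)   # explanation exposes intermediate pipeline state
run(EndToEndPolicy(), obs)  # explanation is generated text about the decision
```

Because both classes satisfy the same `Policy` protocol, a deployment could choose per use case, which is the practical upshot of "we might need both."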
Related Reading
The Actuate 2024 recap from Foxglove provides more details on the conference.