Figure unveils Helix neural network for cross-task generalization
Figure announced Helix, a generalist vision-language-action neural network enabling Figure 03 to execute cross-task manipulation from visual input without task-specific retraining.
Impact on Autonomy: L2 to L3 conditional autonomy potential
- Cross-task manipulation without task-specific retraining
- Potential L3 conditional autonomy in structured environments
- Expanded task scope from single-task to multi-task execution
- Visual reasoning for novel task adaptation
Impact on Readiness: Promising Progress (unchanged)
- Consumer adoption barrier remains high due to cost
- Complexity of deployment requires technical expertise
- Software-only update with no hardware change required
- Long-term readiness depends on real-world reliability data
Helix represents a declared shift from single-task to multi-task control on Figure 03. Capability claims are significant but require independent performance benchmarking and long-term field data to assess real-world generalization limits.
Model Architecture
Helix is a vision-language-action (VLA) neural network designed for end-to-end control on Figure 03. The model processes visual input and language prompts to generate motor commands without intermediate planning layers. Manufacturer documentation does not specify architecture details, training dataset composition, or model size.
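As an illustration of what "end-to-end" means here, the sketch below maps an observation (camera frame plus language prompt) directly to a motor command vector with no planner in between. It is not Figure's implementation: the names Observation, dummy_policy, and control_step, the four-joint output, and the normalized [-1, 1] actuator range are all assumptions made for the example, and the stand-in policy is a trivial function rather than a neural network.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Observation:
    """Hypothetical input bundle: one camera frame and one instruction."""
    pixels: Sequence[float]  # flattened camera frame (placeholder values)
    prompt: str              # natural-language instruction


# A policy maps an observation straight to joint commands (no planning layer).
Policy = Callable[[Observation], List[float]]


def dummy_policy(obs: Observation, num_joints: int = 4) -> List[float]:
    """Stand-in for a learned VLA model; NOT a real network."""
    mean = sum(obs.pixels) / len(obs.pixels) if obs.pixels else 0.0
    # Toy language conditioning: the prompt changes the output scale.
    scale = 1.0 if "pick" in obs.prompt.lower() else 0.5
    return [scale * mean for _ in range(num_joints)]


def control_step(policy: Policy, obs: Observation) -> List[float]:
    """One end-to-end step: observation in, clamped motor command out."""
    command = policy(obs)
    # Clamp to an assumed normalized actuator range of [-1, 1].
    return [max(-1.0, min(1.0, c)) for c in command]
```

Calling control_step(dummy_policy, Observation([0.2, 0.4], "Pick up the cup")) returns one command per joint. Swapping dummy_policy for a real model would leave the loop unchanged, which is the point of the end-to-end design.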
Deployment
The update is software-only; no hardware revision is required for Figure 03 units. Figure indicates the model runs on onboard compute with latency targets suitable for real-time control. Specific compute requirements and power draw have not been published.
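A real-time latency target can be checked independently by timing inference steps against the control period. The sketch below shows one way to do this, under assumptions: the function names are hypothetical, the control rate is a parameter because Figure has not published one, and the 99th-percentile criterion is an illustrative choice.

```python
import time
from typing import Callable, List, Sequence


def measure_latencies(step: Callable[[], None], runs: int) -> List[float]:
    """Time `runs` calls of `step` and return each duration in seconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        step()
        latencies.append(time.perf_counter() - start)
    return latencies


def meets_budget(latencies: Sequence[float], control_hz: float,
                 percentile: float = 0.99) -> bool:
    """True if the chosen latency percentile fits within one control period."""
    if not latencies:
        raise ValueError("need at least one latency sample")
    budget = 1.0 / control_hz  # seconds available per control tick
    ordered = sorted(latencies)
    # Nearest-rank style index, capped at the last sample.
    index = min(len(ordered) - 1, int(percentile * len(ordered)))
    return ordered[index] <= budget
```

For example, meets_budget(samples, control_hz=200.0) asks whether the tail latency fits inside a 5 ms tick. The 200 Hz figure is only an example value, not a published specification.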
Training and Generalization
Helix is trained on multi-task data to enable cross-task inference. The manufacturer claims the model executes manipulation tasks from visual input without task-specific fine-tuning. Performance on out-of-distribution tasks has not been disclosed. Few-shot adaptation capability is mentioned but has not been benchmarked.
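Since no out-of-distribution results are disclosed, an independent benchmark would need to report success rates separately for tasks seen in training and tasks held out. A minimal sketch of that bookkeeping follows; the function name, the split labels, and the task names in the usage note are hypothetical, and no real trial data is implied.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def success_rates(trials: Iterable[Tuple[str, bool]],
                  training_tasks: Set[str]) -> Dict[str, float]:
    """Compute success rate per split from (task_name, succeeded) records."""
    counts = defaultdict(lambda: [0, 0])  # split -> [successes, total]
    for task, succeeded in trials:
        # A task counts as in-distribution only if it appeared in training.
        split = ("in_distribution" if task in training_tasks
                 else "out_of_distribution")
        counts[split][0] += int(succeeded)
        counts[split][1] += 1
    return {split: wins / total for split, (wins, total) in counts.items()}
```

Given records such as ("fold_towel", True) and ("open_jar", False), with only "fold_towel" in the training set, the function returns one rate per split. A large gap between the two rates would indicate limited generalization.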