Deformable manipulation with X1D
X1D includes the model, runtime, data, and evaluation interfaces required to begin developing deformable-object manipulation capabilities.
This does not mean that current X1D checkpoints can reliably fold garments. A deformable-manipulation checkpoint, suitable training data, and real-hardware validation are still required.
What is available
Dual-arm runtime contract
Xolver provides an experimental 14-dimensional ALOHA-style dual-arm contract defining:
- Left and right arm joint actions
- Two normalized gripper actions
- Multi-step trajectory chunks
- Robot proprioception
- One workspace camera
- Left and right wrist cameras
- Execution timing
- Inter-arm collision requirements
- Joint, velocity, acceleration, and workspace limits
- Mandatory human-supervised operation
The included limits are conservative placeholders. They must be replaced with hardware-specific values and validated before physical actuation.
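As a rough illustration of what such a contract constrains, the sketch below checks a trajectory chunk against placeholder limits. Everything in it is an assumption made for illustration: the index layout (6 joints plus 1 gripper per arm, which is the usual ALOHA arrangement), the `[0, 1]` gripper range, the limit values, and the names `PlaceholderLimits` and `validate_chunk`. None of these are taken from the Xolver contract itself.

```python
from dataclasses import dataclass

ACTION_DIM = 14  # assumed: 6 joints + 1 gripper per arm

# Hypothetical index layout; the real contract may order dimensions differently.
LEFT_JOINTS = slice(0, 6)
LEFT_GRIPPER = 6
RIGHT_JOINTS = slice(7, 13)
RIGHT_GRIPPER = 13


@dataclass(frozen=True)
class PlaceholderLimits:
    """Illustrative values only; replace with validated hardware limits."""
    joint_min_rad: float = -1.0
    joint_max_rad: float = 1.0
    max_joint_step_rad: float = 0.05  # per-step change, a crude velocity proxy


def validate_chunk(chunk, limits=PlaceholderLimits()):
    """Return a list of violation messages for a chunk of 14-element actions."""
    violations = []
    previous = None
    for step, action in enumerate(chunk):
        if len(action) != ACTION_DIM:
            violations.append(f"step {step}: expected {ACTION_DIM} dims, got {len(action)}")
            previous = None  # cannot compare against a malformed step
            continue
        joints = list(action[LEFT_JOINTS]) + list(action[RIGHT_JOINTS])
        for value in joints:
            if not limits.joint_min_rad <= value <= limits.joint_max_rad:
                violations.append(f"step {step}: joint value {value} outside limits")
        for name, idx in (("left", LEFT_GRIPPER), ("right", RIGHT_GRIPPER)):
            if not 0.0 <= action[idx] <= 1.0:  # assumed normalization range
                violations.append(f"step {step}: {name} gripper not normalized")
        if previous is not None:
            for a, b in zip(previous, joints):
                if abs(b - a) > limits.max_joint_step_rad:
                    violations.append(f"step {step}: joint step {abs(b - a):.3f} too large")
        previous = joints
    return violations
```

A check like this covers only per-joint bounds and step size. Inter-arm collision and workspace limits need the robot's geometry and are outside what this sketch attempts.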
Committed-action training
Asynchronous execution means a robot may already be performing part of an approved action chunk while X1D generates the next one.
Committed-action training allows a batch to identify those movements using:
extra_modalities["committed_action_mask"]
Expected shape:
[batch, action_horizon]
Committed actions remain unchanged and are provided as conditioning context. The training loss is applied only to the uncommitted future portion of the action chunk.
Enable it with:
enable_committed_action_training: true
The feature is disabled by default.
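To make the masking concrete, here is a minimal dependency-free sketch of a loss restricted to uncommitted steps. It rests on several assumptions that are not confirmed by the text above: that the mask is boolean with `True` meaning "committed", that committed steps form a prefix of the chunk, and that the loss is a plain mean squared error. The names `masked_action_loss` and `prefix_mask` are hypothetical, and a real training loop would use tensor operations rather than nested lists.

```python
def masked_action_loss(predicted, target, committed_action_mask):
    """Mean squared error over uncommitted steps only.

    predicted, target: nested lists shaped [batch][action_horizon][action_dim]
    committed_action_mask: nested lists shaped [batch][action_horizon];
        True (assumed convention) marks a step that is already committed.
    """
    total, count = 0.0, 0
    for pred_seq, tgt_seq, mask_seq in zip(predicted, target, committed_action_mask):
        for pred, tgt, committed in zip(pred_seq, tgt_seq, mask_seq):
            if committed:
                continue  # conditioning context only; contributes no loss
            total += sum((p - t) ** 2 for p, t in zip(pred, tgt))
            count += len(pred)
    # With every step committed there is nothing to train on.
    return total / count if count else 0.0


def prefix_mask(batch_size, action_horizon, num_committed):
    """Build a mask whose first `num_committed` steps are committed."""
    return [[step < num_committed for step in range(action_horizon)]
            for _ in range(batch_size)]
```

For example, with an action horizon of 3 and one committed step, `prefix_mask(1, 3, 1)` marks only the first step, so prediction errors on that step do not affect the loss.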
Corrective demonstrations
Xolver provides a structured record for human teaching interventions. A corrective demonstration includes:
- Episode and task identifiers
- Operator identity
- Intervention interval
- Observation references
- Original policy actions
- Human expert actions
- Reason for intervention
- Deployment evidence
- Training approval state
- Reviewer identity
Corrective demonstrations are separate from Safety Shield interventions.
A Safety Shield intervention indicates that an action was blocked or modified. It does not necessarily provide the correct expert action and must not automatically become a training label.
A correction cannot be approved for training without a reviewer.
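The record and its approval rule could be modeled along the following lines. This is a sketch under assumptions: the field names, the `ApprovalState` values, and the `approve` and `is_trainable` methods are hypothetical and are not the Xolver schema. It only illustrates the constraint that approval requires a reviewer.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ApprovalState(Enum):
    # Assumed state names, for illustration only.
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class CorrectiveDemonstration:
    episode_id: str
    task_id: str
    operator_id: str
    interval_start_s: float
    interval_end_s: float
    observation_refs: List[str]
    policy_actions: List[List[float]]   # what the policy proposed
    expert_actions: List[List[float]]   # what the human did instead
    reason: str
    deployment_evidence: List[str] = field(default_factory=list)
    approval_state: ApprovalState = ApprovalState.PENDING
    reviewer_id: Optional[str] = None

    def approve(self, reviewer_id: str) -> None:
        """Mark the correction as approved; a reviewer identity is mandatory."""
        if not reviewer_id:
            raise ValueError("a reviewer is required to approve a correction")
        self.reviewer_id = reviewer_id
        self.approval_state = ApprovalState.APPROVED

    def is_trainable(self) -> bool:
        """Only approved, reviewed corrections may become training labels."""
        return self.approval_state is ApprovalState.APPROVED and bool(self.reviewer_id)
```

Keeping Safety Shield events in a separate type, rather than as a flag on this record, would make it harder for a blocked action to slip into the training set by accident.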
Deformable-task evaluation
The evaluation interface records:
- Full-task success
- Human-intervention rate
- Safety-intervention rate
- Recovery attempts
- Recovery success
- Completion time
- Whether the object left the validated workspace
These metrics evaluate task outcomes and operational supervision, not merely whether the model generated motion.
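One way these per-episode records might be aggregated is sketched below. The `EpisodeResult` fields and the `summarize` function are hypothetical names. The rate definitions are also assumptions: intervention rates are computed as interventions per episode, and recovery success as successes per attempt. The actual evaluation interface may define them differently.

```python
from dataclasses import dataclass


@dataclass
class EpisodeResult:
    success: bool
    human_interventions: int
    safety_interventions: int
    recovery_attempts: int
    recovery_successes: int
    completion_time_s: float
    left_workspace: bool  # object left the validated workspace


def summarize(episodes):
    """Aggregate per-episode results into summary metrics."""
    episodes = list(episodes)
    n = len(episodes)
    if n == 0:
        raise ValueError("no episodes to summarize")
    attempts = sum(e.recovery_attempts for e in episodes)
    recovered = sum(e.recovery_successes for e in episodes)
    return {
        "success_rate": sum(e.success for e in episodes) / n,
        "human_interventions_per_episode": sum(e.human_interventions for e in episodes) / n,
        "safety_interventions_per_episode": sum(e.safety_interventions for e in episodes) / n,
        # Undefined when no recovery was attempted, so report None rather than 0.
        "recovery_success_rate": recovered / attempts if attempts else None,
        "mean_completion_time_s": sum(e.completion_time_s for e in episodes) / n,
        "workspace_exit_rate": sum(e.left_workspace for e in episodes) / n,
    }
```

Reporting human and safety interventions as separate figures mirrors the distinction drawn earlier between corrective demonstrations and Safety Shield interventions.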
Intended development path
The recommended first task is a narrow, measurable activity such as towel spreading and folding.
Simulation demonstrations
→ randomized evaluation
→ supervised robot rollout
→ human correction
→ review
→ retraining
→ comparative evaluation
Training should gradually cover:
- Different starting configurations
- Fabric sizes and materials
- Corner and edge grasping
- Cloth spreading and alignment
- Missed grasps
- Slippage
- Incorrect intermediate folds
- Recovery from partially completed tasks
Safety requirements
Experimental deformable manipulation requires:
- Hardware-specific dual-arm calibration
- Accurate collision geometry
- Inter-arm collision enforcement
- Validated workspace boundaries
- Joint, velocity, and acceleration limits
- Human takeover capability
- Watchdog enforcement
- Replay and intervention recording
- Supervised commissioning
No experimental contract should be treated as production certification.
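The requirements above lend themselves to an explicit gate that blocks actuation until every item has been confirmed. The following is a minimal sketch: the check names and the functions `missing_checks` and `require_all_checks` are hypothetical, and passing such a gate records that someone confirmed each item, not that the item was verified correctly.

```python
# Hypothetical check names, one per requirement in the list above.
REQUIRED_CHECKS = (
    "dual_arm_calibration",
    "collision_geometry",
    "inter_arm_collision_enforcement",
    "workspace_boundaries",
    "joint_velocity_acceleration_limits",
    "human_takeover",
    "watchdog",
    "replay_and_intervention_recording",
    "supervised_commissioning",
)


def missing_checks(confirmed):
    """Return the required checks that are absent or not confirmed."""
    return [name for name in REQUIRED_CHECKS if not confirmed.get(name, False)]


def require_all_checks(confirmed):
    """Raise if any requirement is unconfirmed; call before enabling actuation."""
    missing = missing_checks(confirmed)
    if missing:
        raise RuntimeError("actuation blocked; unconfirmed: " + ", ".join(missing))
```

Treating a missing key the same as an explicit `False` means a newly added requirement blocks actuation by default until someone confirms it.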
Current limitations
The current implementation does not include:
- A trained deformable-manipulation checkpoint
- General garment-folding capability
- Production-certified dual-arm limits
- A complete cloth simulator
- Public performance benchmarks
- Evidence of reliable real-world folding