TacCoRL: Integrating Tactile Feedback into VLA via Simulation

Left: held-out J4 replay, where gravity compensation caused the largest calibration gap; after system identification (SysID), the simulation matches real tracking. Right: aligned tactile distributions from matched rollouts, which support post-training in the simulator.

Across all tasks, visuo-tactile policies consistently achieve higher success rates than vision-only policies during simulator RL. This indicates that tactile histories remain useful during on-policy refinement, rather than serving solely as an additional imitation-learning input.

Tactile Feedback Improves Simulator RL

We vary the co-training ratio α and the real-anchor weight β on the Assembly #2 task and report three metrics: simulator success rate; real-data anchor loss, which measures how far the policy deviates from real demonstrations during simulation; and real-world deployment success rate.

Assembly #2 ablation of co-training ratio and real-data anchor weight, showing simulation success, anchor loss, and real-world deployment success.

Ablation of Co-Training and Real-Data Anchoring
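The roles of α and β in the ablation can be illustrated with a minimal sketch. The page does not give the exact formulas, so everything here is an assumption: α is read as the fraction of simulated samples in each co-training batch, the anchor loss is taken to be a mean squared error between policy actions and real demonstration actions, and β scales that term when it is added to the RL loss. The function names `mix_batch`, `anchor_loss`, and `total_loss` are hypothetical.

```python
import random


def mix_batch(sim_data, real_data, alpha, batch_size, rng):
    """Sample a batch with about a fraction `alpha` of simulated items.

    Assumption: alpha is the share of simulated samples per batch.
    """
    n_sim = round(alpha * batch_size)
    n_real = batch_size - n_sim
    return rng.choices(sim_data, k=n_sim) + rng.choices(real_data, k=n_real)


def anchor_loss(policy_actions, demo_actions):
    """Mean squared deviation of policy actions from real demo actions.

    Assumption: the anchor loss is an MSE over action dimensions.
    """
    sq_errors = [
        (p - d) ** 2
        for pa, da in zip(policy_actions, demo_actions)
        for p, d in zip(pa, da)
    ]
    return sum(sq_errors) / len(sq_errors)


def total_loss(rl_loss, anchor, beta):
    """Assumed combined objective: RL loss plus beta-weighted anchor term."""
    return rl_loss + beta * anchor


rng = random.Random(0)
batch = mix_batch(["sim"], ["real"], alpha=0.75, batch_size=8, rng=rng)
print(batch.count("sim"), batch.count("real"))
```

Under these assumptions, a larger β keeps the policy closer to the real demonstrations during simulator RL, while α controls how much simulated data enters co-training.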

Direct RL from the base VLA model fails on all four tasks (0% success for both policy types). Sim-real co-training provides a strong initialization and substantially improves the success rate of simulator-based RL. Incorporating tactile feedback improves performance further on every task.

Settings	Vision-Only (Tube)	Vision-Only (Puzzle)	Vision-Only (Asm. #1)	Vision-Only (Asm. #2)	Visuo-Tactile (Tube)	Visuo-Tactile (Puzzle)	Visuo-Tactile (Asm. #1)	Visuo-Tactile (Asm. #2)
After Co-Training	42%	12%	39%	39%	41%	16%	47%	58%
RL Start with Exploration Noise	35%	5%	25%	25%	36%	4%	35%	42%
RL from Base VLA	0%	0%	0%	0%	0%	0%	0%	0%
RL with Co-Training	50%	54%	77%	61%	72%	71%	92%	79%

Simulation Success Rates
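As a reading aid, the per-task benefit of tactile feedback can be computed directly from the simulation table. The sketch below copies the reported percentages and subtracts the vision-only rate from the visuo-tactile rate for a chosen row. The dictionary layout and the name `tactile_gain` are illustrative choices, not part of the original work.

```python
TASKS = ["Tube", "Puzzle", "Asm. #1", "Asm. #2"]

# Simulation success rates (%) copied from the table above.
SIM_SUCCESS = {
    "After Co-Training": {
        "vision_only": [42, 12, 39, 39],
        "visuo_tactile": [41, 16, 47, 58],
    },
    "RL Start with Exploration Noise": {
        "vision_only": [35, 5, 25, 25],
        "visuo_tactile": [36, 4, 35, 42],
    },
    "RL from Base VLA": {
        "vision_only": [0, 0, 0, 0],
        "visuo_tactile": [0, 0, 0, 0],
    },
    "RL with Co-Training": {
        "vision_only": [50, 54, 77, 61],
        "visuo_tactile": [72, 71, 92, 79],
    },
}


def tactile_gain(setting):
    """Visuo-tactile minus vision-only success, in percentage points."""
    row = SIM_SUCCESS[setting]
    return {
        task: tactile - vision
        for task, vision, tactile in zip(
            TASKS, row["vision_only"], row["visuo_tactile"]
        )
    }


gains = tactile_gain("RL with Co-Training")
print(gains)  # {'Tube': 22, 'Puzzle': 17, 'Asm. #1': 15, 'Asm. #2': 18}
print(sum(gains.values()) / len(gains))  # 18.0
```

For the "RL with Co-Training" row, the gain is positive on every task and averages 18 percentage points. The unweighted average is only a summary of the table, not a metric reported by the authors.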

Rows compare training stages; columns compare vision-only and visuo-tactile policies. Tactile feedback and RL post-training together achieve the highest real-world success rates.

Training Stage	Vision-Only (Tube)	Vision-Only (Puzzle)	Vision-Only (Asm. #1)	Vision-Only (Asm. #2)	Visuo-Tactile (Tube)	Visuo-Tactile (Puzzle)	Visuo-Tactile (Asm. #1)	Visuo-Tactile (Asm. #2)
Real-Only Fine-Tuning	20%	5%	35%	25%	45%	15%	35%	40%
Sim-Real Co-Training	35%	10%	40%	35%	50%	25%	45%	55%
RL Post-Training	35%	25%	80%	60%	70%	45%	95%	80%

Real-World Success Rates
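To compare training stages at a glance, the real-world results can be averaged across the four tasks. The sketch below copies the reported percentages and computes an unweighted mean per stage and policy type. This average is only a convenience summary and is not a metric reported on the page. The name `mean_success` and the dictionary keys are illustrative.

```python
# Real-world success rates (%) in task order: Tube, Puzzle, Asm. #1, Asm. #2.
REAL_SUCCESS = {
    "Real-Only Fine-Tuning": {
        "vision_only": [20, 5, 35, 25],
        "visuo_tactile": [45, 15, 35, 40],
    },
    "Sim-Real Co-Training": {
        "vision_only": [35, 10, 40, 35],
        "visuo_tactile": [50, 25, 45, 55],
    },
    "RL Post-Training": {
        "vision_only": [35, 25, 80, 60],
        "visuo_tactile": [70, 45, 95, 80],
    },
}


def mean_success(stage, modality):
    """Unweighted mean success rate (%) over the four tasks."""
    rates = REAL_SUCCESS[stage][modality]
    return sum(rates) / len(rates)


for stage in REAL_SUCCESS:
    print(
        stage,
        mean_success(stage, "vision_only"),
        mean_success(stage, "visuo_tactile"),
    )
```

With these numbers, the visuo-tactile mean rises from 33.75% after real-only fine-tuning to 72.5% after RL post-training, while the vision-only mean rises from 21.25% to 50.0%.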