Harbin Institute of Technology
Student. Add your program, department, and dates here.
I am Jie He. My current research interests lie in vision-language-action models, embodied intelligence, robotic manipulation, and efficient model adaptation for real-world decision making.
* Equal contribution. † Corresponding author.
arXiv preprint arXiv:2603.08361, 2026.
ΔVLA studies how world knowledge variation can guide vision-language-action models. The work uses prior knowledge to improve action reasoning and adaptation, aiming to make embodied policies more reliable under changing task and scene conditions.
AAAI 2026 (Oral).
H-GAR introduces a hierarchical interaction framework for robotic manipulation. It refines observations and actions according to task goals, improving the robot's ability to reason over long-horizon interactions and execute manipulation steps more precisely.
NeurIPS 2025.
CogVLA aligns vision-language-action models with cognitive execution patterns through instruction-driven routing and sparsification. It targets efficient, task-aware reasoning so the model can activate the most relevant pathways for embodied decision making.