Chinese electric vehicle maker Li Auto has introduced its next-generation autonomous driving architecture, MindVLA (Visual-Language-Action), which the company claims is a major step toward achieving fully autonomous driving.
Jia Peng, head of autonomous driving technology at Li Auto, unveiled the new system at the Nvidia GTC 2025 event. The model integrates spatial, linguistic, and behavioral intelligence, allowing autonomous vehicles to perceive their surroundings, reason about them, and adapt accordingly.

“MindVLA is a Visual-Language-Action large model, but we prefer to call it a ‘robot large model,’” Li Xiang, Li Auto’s founder and CEO, said in a Weibo post, comparing its impact on autonomous driving to the iPhone 4’s transformation of the smartphone industry.
MindVLA is designed to enhance self-driving capabilities with human-like decision-making and adaptability. Li Auto developed the base model from scratch using a Mixture of Experts (MoE) architecture and a Sparse Attention mechanism, allowing the model to scale in parameter count without a proportional increase in computational cost.
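Li Auto has not published MindVLA's internals, but the general idea behind MoE can be sketched as follows: a gating network routes each token to one of several expert sub-networks, so only a fraction of the model's parameters run per token. All dimensions, the top-1 routing rule, and the ReLU-MLP experts below are illustrative assumptions, not MindVLA's actual design.

```python
import numpy as np

# Minimal top-1 Mixture-of-Experts sketch (illustrative only).
rng = np.random.default_rng(0)
D, H, E = 8, 16, 4  # input dim, expert hidden dim, number of experts

# Each expert is a small two-layer ReLU MLP.
experts = [
    (rng.standard_normal((D, H)) * 0.1, rng.standard_normal((H, D)) * 0.1)
    for _ in range(E)
]
gate_w = rng.standard_normal((D, E)) * 0.1  # gating network weights

def moe_forward(x):
    """Route each token to its top-1 expert; only that expert runs,
    which is how MoE grows parameters without growing per-token compute."""
    logits = x @ gate_w                                   # (T, E) gate scores
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)            # softmax over experts
    choice = probs.argmax(axis=-1)                        # top-1 expert per token
    out = np.zeros_like(x)
    for e, (w1, w2) in enumerate(experts):
        mask = choice == e
        if mask.any():
            h = np.maximum(x[mask] @ w1, 0)               # expert MLP
            out[mask] = (h @ w2) * probs[mask, e:e + 1]   # scale by gate weight
    return out

tokens = rng.standard_normal((5, D))
y = moe_forward(tokens)
print(y.shape)  # (5, 8)
```

The design point is that compute per token stays roughly constant as experts are added, since each token touches only one expert's weights.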

The system employs diffusion-based action token decoding to generate optimized driving trajectories and integrates real-time predictions of surrounding vehicle behavior, improving navigation in complex traffic environments. Li Auto also developed an in-house simulation environment to train the model with real-world accuracy.
MindVLA is expected to debut alongside the Li i8, the company’s first all-electric SUV, which is set for launch in July.