The creator of LLVM and Swift discusses the structural problems in AI compute and how his new language, Mojo, is designed to solve them.
Key Takeaways:
The current state of AI development is defined by a fundamental structural problem. The most advanced work in low-level code generation for new hardware (GPUs, TPUs, ASICs) is done by the hardware companies themselves—NVIDIA, AMD, Google, etc. Their primary incentive is to optimize for their own specific hardware roadmaps, not to create a unified, portable software ecosystem.
This has resulted in a fragmented mess of incompatible software stacks: CUDA, ROCm, XLA, and countless others. High-level frameworks like PyTorch are forced to act as complex abstraction layers, trying to stitch these disparate systems together. The result is a "leaky abstraction": developers are inevitably forced to debug deep, unfamiliar stacks to achieve performance or fix problems.
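The dispatch problem the frameworks face can be sketched in a few lines of Python. This is purely illustrative: the backend names and the `matmul` helper are hypothetical, not any framework's real internals.

```python
# Illustrative sketch of how a high-level framework routes one operator
# to whichever vendor stack happens to be available. These backends and
# functions are hypothetical stand-ins, not PyTorch's actual internals.

BACKENDS = {
    "cuda": lambda a, b: f"cublas_gemm({a}, {b})",   # NVIDIA path
    "rocm": lambda a, b: f"rocblas_gemm({a}, {b})",  # AMD path
    "xla":  lambda a, b: f"xla_dot({a}, {b})",       # TPU path
    "cpu":  lambda a, b: f"naive_gemm({a}, {b})",    # portable fallback
}

def matmul(a, b, available=("cpu",)):
    # Pick the first backend present on this machine; each one has its
    # own bugs, performance cliffs, and debugging story -- the "leaky
    # abstraction" the summary describes.
    for name in ("cuda", "rocm", "xla", "cpu"):
        if name in available:
            return BACKENDS[name](a, b)
    raise RuntimeError("no backend available")

print(matmul("A", "B", available=("rocm", "cpu")))  # rocblas_gemm(A, B)
```

Every branch here is a separate vendor stack with its own semantics, which is why bugs surface so far below the layer the developer actually wrote.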
The core issue is that there is no equivalent of LLVM for the AI accelerator world—no open, portable compiler infrastructure that all vendors can build upon. This lack of a common foundation forces everyone to reinvent the wheel, locking software to specific hardware and stifling innovation.
Given this problem, why create an entirely new language? In Lattner's view, existing options all carry significant drawbacks for this domain.
Furthermore, the hardware itself is evolving at a breakneck pace. New matrix multiplication units, varied warp sizes, and exotic data types (like 4-bit or 1.2-bit floating point) are constantly emerging. Most general-purpose languages aren't even aware these concepts exist, making them a poor fit for expressing these new paradigms.
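To make the "exotic data types" point concrete, here is a small Python sketch of rounding values onto a 4-bit float grid. The value set used corresponds to the E2M1-style FP4 layout (sign bit, two exponent bits, one mantissa bit); the `quantize_fp4` helper is illustrative, not a real library API.

```python
# Sketch: rounding to the representable values of a 4-bit float format.
# The grid matches an E2M1 FP4 layout; general-purpose languages have no
# native type for values like these, which is the mismatch described above.

FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_VALUES = sorted(FP4_GRID + [-v for v in FP4_GRID if v != 0.0])

def quantize_fp4(x: float) -> float:
    """Round x to the nearest representable FP4 (E2M1) value,
    saturating at the format's maximum magnitude of 6."""
    return min(FP4_VALUES, key=lambda v: abs(v - x))

print([quantize_fp4(x) for x in (0.3, 2.4, 5.3, 100.0)])
# 0.3 -> 0.5, 2.4 -> 2.0, 5.3 -> 6.0, 100.0 saturates to 6.0
```

A language that understands such formats natively can generate correct rounding and saturation code for them, rather than forcing this logic into ad-hoc library shims.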
Mojo is designed to bridge this gap. Its first principle is to be a member of the Python family. It embraces Python syntax and ethos to ensure it feels familiar to the vast AI/ML community that already thrives on Python. This drastically reduces the barrier to entry compared to a language with a completely new syntax.
However, it diverges from Python where necessary to provide the performance and control required for systems programming.
Mojo is the core of a larger technology stack being built by Modular to solve the AI infrastructure problem.
The business model isn't to sell Mojo itself, but to offer an enterprise-grade platform that simplifies the immense complexity of managing large-scale AI inference and training workloads across diverse, ever-changing hardware.
Lattner emphasizes a pragmatic, utility-driven approach to language design. The goal isn't academic novelty but solving real engineering problems. Mojo is being built with a keen awareness of the pitfalls of language design, such as featurism and unnecessary complexity.
In the near term, Mojo's primary value is as a systems language for performance-critical code. It is positioned as the best way to extend Python: developers can seamlessly move performance bottlenecks from Python into Mojo files within the same project, gaining massive speedups on CPUs and GPUs without the hassle of C++ bindings or a complex FFI.
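The "move the hot loop out of Python" workflow described above can be sketched with the standard optional-extension pattern. The `hot_kernels` module name is hypothetical; in a real project it would point at a compiled Mojo (or other native) build living alongside the Python sources.

```python
# Sketch of the usual pattern for offloading a bottleneck: keep a
# pure-Python reference implementation and transparently swap in a
# compiled version when one is available. "hot_kernels" is hypothetical.

def dot_py(xs, ys):
    """Pure-Python reference implementation of the hot loop."""
    return sum(x * y for x, y in zip(xs, ys))

try:
    from hot_kernels import dot as dot_fast  # compiled (e.g. Mojo) build
except ImportError:
    dot_fast = dot_py  # fall back to the slow reference path

print(dot_fast([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # 32.0
```

The appeal of the Mojo approach is that the fast path is written in the same Python-family syntax in the same project, rather than in C++ behind a binding layer.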
Looking ahead, the roadmap includes adding more features like classes to make it feel even more Pythonic, with the long-term vision of Mojo evolving into a top-to-bottom applications language and a potential successor to Python for high-performance computing, while always maintaining deep interoperability with the existing Python ecosystem.