Mojo is aimed at solving the dual-language problem that you have with Python plus compiled wheels, which complicates debugging and performance optimization.

Relationship with LLVM

Mojo is based on ideas from Chris Lattner, who is responsible for creating many of the projects that we all rely on today – even though we might not have heard of everything he built! As part of his PhD thesis he started the development of LLVM, which fundamentally changed how compilers are created, and which today forms the foundation of many of the most widely used language ecosystems in the world.

At Apple, Chris Lattner tried to leverage the full power of LLVM by designing a new language, Swift, which has been extremely successful. Unfortunately, Apple’s control of Swift has meant it hasn’t really had its time to shine outside of the cloistered Apple world.

Later, at Google, he tried to move Swift out of its Apple comfort zone and make it a replacement for Python in AI model development. Sadly, it did not receive the support it needed from either Apple or Google, and it was ultimately not successful.

Having said that, whilst at Google Chris did develop another project which became hugely successful: MLIR.

Mojo as syntactic sugar for MLIR

So, if Swift was “syntax sugar for LLVM”, what’s “syntax sugar for MLIR”? The answer is: Mojo! Mojo is a brand new language that’s designed to take full advantage of MLIR. And also Mojo is Python. Maybe it’s better to say Mojo is Python++.

It will be (when complete) a strict superset of the Python language. But it also has additional functionality so we can write high performance code that takes advantage of modern accelerators.

Mojo seems to me like a more pragmatic approach than Swift. Whereas Swift was a brand new language packing in all kinds of cool features based on the latest research in programming language design, Mojo is, at its heart, just Python.

The idea of a fast mode

A key trick in Mojo is that you can opt in at any time to a faster “mode” as a developer, by using “fn” instead of “def” to create your function. In this mode, you have to declare exactly what the type of every variable is, and as a result Mojo can create optimised machine code to implement your function. Furthermore, if you use “struct” instead of “class”, your attributes will be tightly packed into memory, such that they can even be used in data structures without chasing pointers around. These are the kinds of features that allow languages like C to be so fast, and now they’re accessible to Python programmers too – just by learning a tiny bit of new syntax.
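The "tightly packed" idea can be illustrated without any Mojo syntax. The sketch below is only a Python analogy, not Mojo code: it uses the standard-library ctypes module to contrast a C-style struct, whose fields are stored inline, with an ordinary Python class, whose instances are separate heap objects reached through pointers. The names PackedPoint and BoxedPoint are made up for this illustration, and the sizes assume a typical platform where a C double is 8 bytes.

```python
import ctypes


class PackedPoint(ctypes.Structure):
    # Two C doubles stored inline, back to back, like fields of a C struct.
    _fields_ = [("x", ctypes.c_double), ("y", ctypes.c_double)]


class BoxedPoint:
    # An ordinary Python class: each instance is its own heap object,
    # and a list of them stores only pointers to those objects.
    def __init__(self, x, y):
        self.x = x
        self.y = y


N = 1000

# One contiguous buffer holding N structs, with no per-element pointers.
packed = (PackedPoint * N)()

# A list of N references; the point data itself lives elsewhere on the heap.
boxed = [BoxedPoint(0.0, 0.0) for _ in range(N)]

point_size = ctypes.sizeof(PackedPoint)   # bytes per struct
array_size = ctypes.sizeof(packed)        # bytes for the whole array

# Distance in memory between element 0 and element 1 of the packed array.
stride = ctypes.addressof(packed[1]) - ctypes.addressof(packed[0])

print(f"bytes per packed point: {point_size}")
print(f"bytes for {N} packed points: {array_size}")
print(f"stride between neighbours: {stride}")
```

Because the stride equals the struct size, walking the packed array touches memory sequentially, whereas walking the list of BoxedPoint objects has to follow one pointer per element. That contiguous layout is the property the struct keyword is described as providing.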

Alternatives

Julia is the strongest alternative to Mojo, but there are also alternatives within the Python ecosystem, such as JAX, TensorFlow, the PyTorch compiler, Cython and Numba.

However, all these alternatives have some weaknesses compared to Mojo:

  • Solutions that rely on a compiler such as XLA (JAX, TensorFlow and the PyTorch compiler) generate "magic" black-box code that is hard to debug and troubleshoot. They are also specific to machine learning rather than general-purpose
  • Cython cannot easily target accelerators, and neither can Numba, although Numba does at least support CUDA

According to Jeremy Howard, Mojo is the best solution because it provides the familiar Python syntax with C-level performance, without magic layers, and solves the deployment problem, since any program can be bundled into a standalone executable.