Track 05 · ML Compilers & Hardware

From a high-level graph to optimized code on real hardware.

Nobody hand-writes a CUDA kernel for every new model on every new GPU. Compilers generate the kernels — and compiler skill is what turns one engineer into a force multiplier for an entire hardware vendor or model team.

Modules in this track

  • Foundation — LLVM IR, passes, MLIR, dialects, lowering. The substrate.
  • Production — torch.compile, JAX/Pallas, IREE/ExecuTorch, operator fusion. The compilers people actually run.
  • Kernels & Hardware — Triton, CUTLASS, ThunderKittens, the 2026 hardware landscape. Where you drop down when the compiler isn’t enough.
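As a taste of the Kernels & Hardware module, here is a minimal Triton vector-add kernel — a sketch, not track material, and it assumes the `triton` package plus a CUDA GPU:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = x.numel()
    # Launch enough program instances to cover all n elements.
    grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```

The point of the exercise: you write blocked, masked array code in Python, and Triton's compiler handles the registers, vectorization, and memory coalescing a hand-written CUDA kernel would make you manage yourself.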

What you’ll be able to do after

  • Read MLIR dumps and understand what each dialect is doing
  • Write a Triton kernel and a small MLIR pass
  • Recognize when a workload calls for torch.compile vs hand-written kernels
  • Tell which hardware (Blackwell · MI355X · TPU v6 · Cerebras · Groq) is right for which workload