About Kernelize

Kernelize enables vLLM, Ollama, and SGLang to target new NPU, CPU, and GPU hardware devices, making AI inference significantly less expensive to run.

Our Mission

We believe that AI inference should be accessible and cost-effective across all hardware platforms. Our mission is to bridge the gap between popular inference platforms and new hardware devices, enabling developers to run their AI workloads at significantly lower cost without rewriting their entire stack.

Reduce Costs

Enable cost-effective inference with new hardware targets

Leverage Triton

Use Triton to generate optimized kernels for new hardware

Extend Platforms

Enable inference platforms to work with new hardware devices

Simon Waters

Founder & CTO

Simon brings extensive experience delivering optimizing-compiler products such as AMD's Triton backend and Catapult C Synthesis. His passion for innovation and collaboration has been the driving force of his career.

Bryan Bowyer

Head of Product

Bryan Bowyer is a seasoned technology executive with over 20 years of leadership in machine-learning and compiler engineering, previously driving open-source MLIR-based graph compilation for AMD's Neural Processing Units (NPUs).