About us
We are systems nerds with a commercial focus. The team comes from NVIDIA, Stanford, YC, and Apple. We work at every level of the stack by:
- Writing CUDA to push towards speed-of-light performance on GPUs
- Digging into the guts of inference engines like SGLang to maximize efficiency
- Distributing work across providers to maximize robustness and fleet utilization
- Using spot compute when it's available, and safely failing over to more reliable compute when it's not
If all of this sounds exciting and it's hard to pick just one, then we're just like you. Let's talk!
Why
Agents are getting really good at running autonomously. In 2026, they will be good enough to make meaningful, independent progress on hard problems, given sufficient tokens. Our job is to dramatically increase intelligence per dollar and make sure no compute goes to waste.