Soham Govande

CS @ Stanford | OpenAI | Prev. @ NVIDIA

About Me

I'm a junior at Stanford University pursuing a BS in CS (AI) and an MS in CS (systems). I'm interested in the intersection of machine learning and systems. My recent work has included hardware-aware model optimization at NVIDIA and the Stanford AI Lab, and this summer I'll be joining OpenAI's infrastructure team. I'm very thankful to the amazing mentors I've had along the way, including Dan Fu, Ethan He, and Jan-Philipp Fränken.

Projects

GitHub | arXiv:2506.03275

Blog: I, II, & III

ES-FoMo @ ICML 2025, YPS @ MLSys 2025

Introduces hardware-aware dynamic sparsity patterns and optimized attention and GEMM CUDA kernels that selectively recompute rapidly changing activations, accelerating diffusion transformers by up to 3.7x without retraining.

Reverse-engineered the matrix core register layouts on AMD Instinct GPUs and implemented core primitives from the ThunderKittens framework, resulting in a functional 10-line GEMM. In progress and very experimental!

Contact

govande at stanford dot edu