Research

I work on efficient training and inference for language and multimodal models. Focus areas include ZeRO-3 and FSDP training, quantization, speculative decoding, and real-time speech-to-speech systems.

Blog

I keep short notes and deep dives on training tricks, PyTorch implementations, and systems topics.
Entropy (Blog & Notes)

Projects

Selected open-source work and demos.
Projects

Latest Publications