Published on November 23, 2025
How to run custom CUDA kernels with Torch
Tags: torch, cuda, torch
A simple way to run custom CUDA kernels in PyTorch using extensions, and a quick way to benchmark them against native functions.
Published on November 6, 2025
An 'Easy' LeetGPU problem will teach you about GPU memory hierarchy
Tags: cuda, gpu, leetgpu, kernels
Conv 1D is a simple kernel to write; however, if you want to optimize it, you will learn about every layer of the GPU memory hierarchy.
Published on October 27, 2025
How CUDA Kernels are Executed on the GPU?
Tags: cuda, gpu
Beautiful graphics that explain how kernels are executed on the GPU.
Published on October 2, 2025
GPU programming learning resources
Tags: cuda, gpu, learning, triton, mojo
A place to store resources that I have used, use, or will use on my journey to learn GPU programming (and a bunch of other LLM-related stuff).