Homepage

I’m a researcher in the Systems Research Group at Microsoft Research. I received my PhD from the University of Texas at Austin, where I was advised by Dr. Keshav Pingali, and my master’s from the Indian Institute of Science, where I was advised by Dr. Uday Bondhugula.

Research Interests

My research interests are broadly in the field of programming languages and systems, with an emphasis on optimizing compilers and runtime systems for distributed and heterogeneous architectures. My current focus is on building efficient systems for AI. In the past, I have built programming languages, compilers, and runtime systems for distributed, heterogeneous graph processing and privacy-preserving neural network inferencing. This work has been published in PLDI, ASPLOS, VLDB, IPDPS, PPoPP, PACT, and IISWC. A brief summary of my past work can be found below.

Distributed and Heterogeneous Graph Processing: Unstructured datasets are used by applications in areas such as machine learning, data mining, bioinformatics, network science, and security. These datasets can be represented as graphs with billions of nodes and trillions of edges. One way to process such large datasets is to use distributed clusters. For graph analytical applications, I have designed and built programming systems that exploit domain knowledge to partition graphs and optimize communication, while providing application-specific fault tolerance. Existing shared-memory graph analytics frameworks and applications can use our system to scale out to distributed CPUs and GPUs. [ASPLOS 2024, PLDI 2018, ASPLOS 2019, VLDB 2018, PPoPP 2019, PACT 2019, IPDPS 2021, IPDPS 2020, IPDPS 2019, IPDPS 2018, Euro-Par 2018, HPEC GraphChallenge 2019, BigData 2017]

Single Machine Graph Processing: Graph pattern mining applications are used in chemical engineering, bioinformatics, and the social sciences. I have worked on building a high-level programming language and runtime system that can execute such applications on shared-memory CPUs or a GPU. For graph analytical applications, I have worked on improving the performance of graph analytics systems on byte-addressable non-volatile memory such as Intel Optane DC Persistent Memory. I have also compared different language abstractions and runtime systems for graph analytics on shared-memory CPUs and identified their performance bottlenecks. [VLDB 2020, VLDB 2020, ICS 2021, IISWC 2020, IISWC 2020]

Privacy-Preserving Neural Network Inferencing: In many applications, the privacy of the datasets used must be preserved. Fully-Homomorphic Encryption (FHE) enables computation on encrypted data without requiring the secret key. However, cryptographic domain expertise is required to use FHE. In my view, FHE libraries are akin to specialized parallel architectures. I have developed an optimizing compiler that translates tensor programs, such as neural network inferencing, to run efficiently on encrypted data using FHE libraries, while guaranteeing the security and accuracy of the computation. I also designed a new encrypted vector arithmetic language for developing general-purpose FHE applications and built an optimizing compiler that generates correct and secure programs, while hiding all the complexities of the target FHE scheme. [PLDI 2020, PLDI 2019]
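
To make the compilation target concrete, here is a minimal, purely illustrative sketch in C of the slot-wise vector arithmetic that batched FHE schemes expose: element-wise add, element-wise multiply, and cyclic rotation over a fixed number of slots. The data here is not encrypted, and the names (SLOTS, vec_add, vec_mul, vec_rotate) are assumptions for this example, not the API of any real FHE library; the dot product shows the rotate-and-sum pattern a compiler can emit when lowering a tensor reduction to these operations.

```c
/* Unencrypted mock of slot-wise FHE vector arithmetic (illustrative names only). */
#include <stdio.h>

#define SLOTS 8  /* power of two, like the slot count of a batched ciphertext */

static void vec_add(double out[SLOTS], const double a[SLOTS], const double b[SLOTS]) {
    for (int i = 0; i < SLOTS; i++) out[i] = a[i] + b[i];
}

static void vec_mul(double out[SLOTS], const double a[SLOTS], const double b[SLOTS]) {
    for (int i = 0; i < SLOTS; i++) out[i] = a[i] * b[i];
}

static void vec_rotate(double out[SLOTS], const double a[SLOTS], int k) {
    for (int i = 0; i < SLOTS; i++) out[i] = a[(i + k) % SLOTS];
}

int main(void) {
    double x[SLOTS] = {1, 2, 3, 4, 5, 6, 7, 8};
    double y[SLOTS] = {8, 7, 6, 5, 4, 3, 2, 1};
    double prod[SLOTS], rot[SLOTS];

    vec_mul(prod, x, y);                       /* slot-wise multiply */
    for (int k = SLOTS / 2; k >= 1; k /= 2) {  /* log2(SLOTS) rotate-and-add steps */
        vec_rotate(rot, prod, k);
        vec_add(prod, prod, rot);
    }
    printf("dot(x, y) = %f\n", prod[0]);       /* slot 0 now holds the full sum */
    return 0;
}
```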

Affine Loop Nests: Any sequence of arbitrarily nested loops with affine bounds and accesses is known as a regular or affine loop nest. Examples include stencil-style computations, linear algebra kernels, and alternating direction implicit integrations. During my master’s, I developed compiler techniques using the polyhedral model to automatically extract tasks from sequential affine loop nests and dynamically schedule them to run on distributed CPUs and GPUs with efficient data movement code. I also helped generate distributed-memory code for loop nests that contain both affine and irregular accesses. [PACT 2013, PPoPP 2015, TOPC 2016]
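
As a small illustration, the following C program is a toy Jacobi-style stencil: every loop bound and every array subscript is an affine function of the surrounding loop indices and constants, which is what makes such a nest amenable to polyhedral analysis and transformation. The sizes N and T are arbitrary and chosen only for the example.

```c
#include <stdio.h>

#define N 256   /* grid size (illustrative) */
#define T 100   /* time steps (illustrative) */

static double A[N][N], B[N][N];

int main(void) {
    for (int t = 0; t < T; t++) {               /* affine bound: 0 <= t < T   */
        for (int i = 1; i < N - 1; i++)         /* affine bound: 1 <= i < N-1 */
            for (int j = 1; j < N - 1; j++)     /* affine bound: 1 <= j < N-1 */
                /* affine accesses: subscripts are i and j plus constants */
                B[i][j] = 0.25 * (A[i - 1][j] + A[i + 1][j] +
                                  A[i][j - 1] + A[i][j + 1]);
        for (int i = 1; i < N - 1; i++)
            for (int j = 1; j < N - 1; j++)
                A[i][j] = B[i][j];
    }
    printf("%f\n", A[N / 2][N / 2]);
    return 0;
}
```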

News