I received my PhD from the University of Texas at Austin, where I was advised by Dr. Keshav Pingali. I received my master's degree from the Indian Institute of Science, where I was advised by Dr. Uday Bondhugula.
My research interests are broadly in the field of programming languages and systems, with an emphasis on optimizing compilers and runtime systems for distributed and heterogeneous architectures. Currently, my focus is on building large-scale graph databases and systems for machine learning. In recent work, I have built programming languages, compilers, and runtime systems for distributed, heterogeneous graph processing and privacy-preserving neural network inferencing. This work has been published in PLDI, ASPLOS, VLDB, IPDPS, PPoPP, PACT, and IISWC. A brief summary of my past work can be found below.
Distributed and Heterogeneous Graph Processing: Unstructured datasets are used by applications in areas such as machine learning, data mining, bioinformatics, network science, and security. These datasets can be represented as graphs, which may have billions of nodes and trillions of edges. One way to process such large datasets is to use distributed clusters. For graph analytical applications, I have designed and built programming systems that exploit domain knowledge to partition graphs and optimize communication, while providing application-specific fault-tolerance. Existing shared-memory graph analytics frameworks or applications can use our system to scale out to distributed CPUs and GPUs. [PLDI 2018, ASPLOS 2019, VLDB 2018, PPoPP 2019, PACT 2019, IPDPS 2020, IPDPS 2019, IPDPS 2018, Euro-Par 2018, HPEC GraphChallenge 2019, BigData 2017]
Single Machine Graph Processing: Graph pattern mining applications are used in chemical engineering, bioinformatics, and social sciences. I have worked on building a high-level programming language and runtime system that can execute such applications on shared-memory CPUs or a GPU. For graph analytical applications, I have worked on improving the performance of graph analytics systems on byte-addressable memory like Intel Optane DC Persistent Memory. I have also compared different language abstractions and runtime systems for graph analytics on shared-memory CPUs, and identified their performance bottlenecks. [VLDB 2020, VLDB 2020, IISWC 2020, IISWC 2020]
Privacy-Preserving Neural Network Inferencing: In many applications, privacy of the datasets used must be preserved. Fully-Homomorphic Encryption (FHE) enables computation on encrypted data without requiring the secret key. However, cryptographic domain expertise is required to use FHE. In my view, FHE libraries are akin to specialized parallel architectures. I have developed an optimizing compiler for translating tensor programs like neural network inferencing to run on encrypted data using FHE libraries efficiently, while guaranteeing security and accuracy of the computation. [PLDI 2020, PLDI 2019]
Affine Loop Nests: Any sequence of arbitrarily nested loops with affine bounds and accesses is known as a regular or affine loop nest. Examples include stencil-style computations, linear algebra kernels, and alternating direction implicit integrations. During my master's, I developed compiler techniques using the polyhedral model to automatically extract tasks from sequential affine loop nests and dynamically schedule them to run on distributed CPUs and GPUs with efficient data movement code. I also helped generate distributed-memory code for loop nests that contain both affine and irregular accesses. [PACT 2013, PPoPP 2015, TOPC 2016]
- I’m serving on the Program Committee for IPDPS 2021.
- Our study on Matrix-based vs. Graph-based APIs was accepted to IISWC 2020.
- Our study on Performance of Graph Analytics Frameworks was accepted to IISWC 2020.
- I’m serving on the External Review Committee for ASPLOS 2021.
- Our paper on Encrypted Vector Arithmetic (EVA) Language and Compiler for Homomorphic Computation was accepted to PLDI 2020.
- Our paper on analyzing massive graph datasets using Intel Optane DC Persistent Memory was accepted to PVLDB 13(8) 2020 (Read about our work in the SIGARCH blog).
- Our paper on the Pangolin Graph Pattern Mining framework for a CPU or GPU was accepted to PVLDB 13(8) 2020 (Read about our work in the UTCS blog).
- Our work on GraphAny2Vec Distributed Training of Embeddings using Gluon is now on arXiv.
- Our Graph Analytics on Distributed GPUs paper was accepted to IPDPS 2020.
- Our work on an Adaptive Load Balancer on GPUs is now on arXiv.
- Our paper on Distributed Triangle Counting on GPUs (DistTC) was accepted to IEEE HPEC GraphChallenge 2019 and it won the Student Innovation Award.
- Our paper on the Bulk-Asynchronous Gluon system for distributed and heterogeneous graph analytics was accepted to PACT 2019 and was nominated for Best Paper.
- I presented our Gluon substrate work at a mini-symposium in SIAM CSE 2019.
- Our Compiler for Fully-Homomorphic Neural-Network Inferencing (CHET) was accepted to PLDI 2019.
- Our Customizable Streaming Edge Partitioner (CuSP) paper was accepted to IPDPS 2019.
- Our Partitioning Policies paper was accepted to PVLDB 12(4) 2018.
- Our Min-Rounds Betweenness Centrality (MRBC) paper was accepted to PPoPP 2019 (Read about our work in the R&D magazine article).
- Our Phoenix resilience paper was accepted to ASPLOS 2019.
- My “Partitioning Policies for Distributed Graph Analytics” poster won IPDPS 2018 Outstanding Poster Presentation Award, 3rd Place.
- Our Abelian compiler paper was accepted to Euro-Par 2018.
- Our Gluon substrate paper was accepted to PLDI 2018.
- Our Light-weight Communication Interface (LCI) paper was accepted to IPDPS 2018.