Agentic Systems

My research on agentic systems focuses on understanding the highly dynamic nature of these workloads and on designing systems for the broad, rapidly evolving design space they introduce[1].

Retrieval-Augmented Generation

My research on Retrieval-Augmented Generation (RAG) focuses on designing scalable retrieval systems for efficient LLM inference. In my earlier work[1], I characterized the RAG design space and identified key bottlenecks in retrieval latency, throughput, and energy efficiency. Building on this, my recent work[2] investigates retrieval at datacenter and web scale, where optimization techniques designed for smaller datastores fail to address the overheads introduced by trillion-token corpora. We developed a distributed hierarchical search framework that performs targeted, parallelized retrieval across large-scale datastores.

Computer Architecture Simulation

My work on computer architecture simulation has focused on building accurate and efficient simulators for AMD GPU architectures. I began by developing functional and event-driven simulators for AMD's GCN3, RDNA, and CDNA microarchitectures[1]. Building on this work, I later explored GPU hardware support for Fully Homomorphic Encryption, extending the instruction set architecture, compute units, and memory system to accelerate encrypted computation workloads[2].