IITKGP

Research Areas

My research lies in the theoretical foundations of reinforcement learning and multi-armed bandits, with a primary focus on sequential decision-making under uncertainty. Grounded in stochastic inference, information theory, and coding-theoretic methods, my work aims to characterize the fundamental limits of learning and to design provably efficient algorithms with strong statistical guarantees.

A central theme of my research is the analysis of exploration–exploitation trade-offs using tools from probability theory, concentration inequalities, and information-theoretic lower bounds. I study regret minimization and best-arm identification problems in stochastic and structured bandit models, with an emphasis on tight finite-time performance guarantees and matching minimax lower bounds. This perspective highlights how information acquisition, uncertainty quantification, and adaptive sampling jointly govern learning efficiency under partial feedback.

An important direction of my current and future work concerns distributed and decentralized learning, including distributed multi-armed bandits and federated learning setups. Leveraging my background in coding and communication theory, I investigate how communication constraints, information compression, and limited feedback impact learning performance in networked environments. My goal is to develop algorithms that are statistically optimal while being communication-efficient, and to characterize fundamental trade-offs between regret, communication cost, and scalability in large-scale learning systems.

At IIT Kharagpur, my research aims to strengthen the theoretical foundations of bandits and reinforcement learning, mentor students in rigorous mathematical analysis, and foster interdisciplinary collaborations across AI, mathematics, and systems research.

  • Almost Cost-Free Communication in Federated Best Arm Identification by Reddy K. S., P., Tan V. Y. The 37th AAAI Conference on Artificial Intelligence 8378-8385 (2023)
  • Best arm identification in restless Markov multi-armed bandits by P., Reddy K. S., Tan V. Y. IEEE Transactions on Information Theory 69 3240-3262 (2023)
  • Rate-memory trade-off for multi-access coded caching with uncoded placement by Reddy K. S., Karamchandani N. IEEE Transactions on Communications 68 3261-3274 (2020)
  • Structured index coding problems and multi-access coded caching by Reddy K. S., Karamchandani N. IEEE Journal on Selected Areas in Information Theory 2 1266-1281 (2021)
  • Resource pooling in large-scale content delivery systems by Reddy K. S., Moharir S., Karamchandani N. IEEE Transactions on Communications 68 1617-1630 (2020)