Publications

2025

On Learning Informative Trajectory Embeddings for Imitation, Classification and Regression. Zichang Ge, Changyu Chen, Arunesh Sinha, Pradeep Varakantham. In Proceedings of International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2025.
EduQate: Generating Adaptive Curricula through RMABs in Education Settings. Sidney Tio, Dexun Li, Pradeep Varakantham. In Proceedings of International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2025.
Offline Safe Reinforcement Learning Using Trajectory Classification. Ze Gong, Akshat Kumar, Pradeep Varakantham. In Proceedings of AAAI Conference on Artificial Intelligence (AAAI), 2025. Oral Presentation.
Marginal Benefit Driven RL Teacher for Unsupervised Environment Design. Dexun Li, Wenjun Li, Pradeep Varakantham. In Proceedings of AAAI Conference on Artificial Intelligence (AAAI), 2025. Oral Presentation.
Unlocking the Planning Capabilities of LLMs Through Maximum Diversity Fine-Tuning. Wenjun Li, Changyu Chen, Pradeep Varakantham. In Proceedings of North American Chapter of the Association for Computational Linguistics (NAACL), 2025.
Bootstrapping Language Models with DPO Implicit Rewards. Changyu Chen, Zichen Liu, Chao Du, Tianyu Pang, Qian Liu, Arunesh Sinha, Pradeep Varakantham, Min Lin. In Proceedings of International Conference on Learning Representations (ICLR), 2025.
On Semantic Loss-Guided Data-Efficient Supervised Fine-Tuning for Safe Responses in LLMs. Yuxiao Lu, Arunesh Sinha, Pradeep Varakantham. In Proceedings of International Conference on Learning Representations (ICLR), 2025.
On Minimizing Adversarial Counterfactual Error in Adversarial Reinforcement Learning. Roman Belaire, Arunesh Sinha, Pradeep Varakantham. In Proceedings of International Conference on Learning Representations (ICLR), 2025.
On Generalization Within Multi-Objective Reinforcement Learning Algorithms. Jayden Teoh, Pradeep Varakantham, Peter Vamplew. In Proceedings of International Conference on Learning Representations (ICLR), 2025.

2024

Unsupervised Training Sequence Design: Efficient and Generalizable Agent Training. Wenjun Li, Pradeep Varakantham. In Proceedings of AAAI Conference on Artificial Intelligence (AAAI), Feb 2024.
Imitate the Good and Avoid the Bad: An Incremental Approach to Safe Reinforcement Learning. Huy Hoang, Tien Mai, Pradeep Varakantham. In Proceedings of AAAI Conference on Artificial Intelligence (AAAI), Feb 2024.
Handling Long and Richly Constrained Tasks through Constrained Hierarchical Reinforcement Learning. Yuxiao Lu, Arunesh Sinha, Pradeep Varakantham. In Proceedings of AAAI Conference on Artificial Intelligence (AAAI), Feb 2024.
Reward Penalties on Augmented States for Solving Richly Constrained RL Effectively. Hao Jiang, Tien Mai, Pradeep Varakantham, Huy Hoang. In Proceedings of AAAI Conference on Artificial Intelligence (AAAI), Feb 2024.
Regret-based Defense in Adversarial Reinforcement Learning. Roman Belaire, Pradeep Varakantham, Thanh Nguyen, David Lo. In Proceedings of International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2024.
Imitating Cost-Constrained Behaviors in Reinforcement Learning. Qian Shao, Pradeep Varakantham, Shih-Fen Cheng. In 34th International Conference on Automated Planning and Scheduling, ICAPS 2024.
Preserving the Privacy of Reward Functions in MDPs through Deception. Shashank Reddy Chirra, Pradeep Varakantham, Praveen Paruchuri. In Proceedings of European Conference on Artificial Intelligence (ECAI), Oct 2024.
Improving Environment Novelty Quantification for Effective Unsupervised Environment Design. Jayden Teoh, Wenjun Li, Pradeep Varakantham. In Proceedings of Conference on Neural Information Processing Systems (NeurIPS), 2024. Oral Presentation.
Safety through feedback in Constrained RL. Shashank Chirra, Pradeep Varakantham, Praveen Paruchuri. In Proceedings of Conference on Neural Information Processing Systems (NeurIPS), 2024.
SPRINQL: Sub-optimal Demonstrations driven Offline Imitation Learning. Hoang Minh Huy, Mai Anh Tien, Pradeep Varakantham. In Proceedings of Conference on Neural Information Processing Systems (NeurIPS), 2024.
IRL for Restless Multi-armed Bandits with Applications in Maternal and Child Health. Gauri Jain, Pradeep Varakantham, Aparna Taneja, Haifeng Xu, Prashant Doshi, Milind Tambe. In Proceedings of Pacific Rim International Conference on Artificial Intelligence (PRICAI), 2024. Best Paper Runner-up.

2023

A Fair Incentive Scheme for Community Health Workers. Bose Avinandan, Li Tracey, Sinha Arunesh, Mai Tien. In Proceedings of AAAI Conference on Artificial Intelligence (AAAI), Feb 2023.
Constrained Reinforcement Learning in Hard Exploration Problems. Pankayaraj Pathmanathan and Pradeep Varakantham. In Proceedings of AAAI Conference on Artificial Intelligence (AAAI), Feb 2023.
Future Aware Pricing and Matching for Sustainable On-demand Ride Pooling. Xianjie Zhang, Pradeep Varakantham and Hao Jiang. In Proceedings of AAAI Conference on Artificial Intelligence (AAAI), Feb 2023.
Planning and Learning for Non-Markovian Negative Side Effects Using Finite State Controllers. Aishwarya Srivastava, Sandhya Saisubramanian, Praveen Paruchuri, Akshat Kumar, Shlomo Zilberstein. In Proceedings of AAAI Conference on Artificial Intelligence (AAAI), Feb 2023.
Safe Delivery of Critical Services in Areas with Volatile Security Situation via a Stackelberg Game Approach. Mai Tien, Sinha Arunesh. In Proceedings of AAAI Conference on Artificial Intelligence (AAAI), Feb 2023.
Knowledge Compilation for Constrained Combinatorial Action Spaces in Reinforcement Learning. Jiajing Ling, Moritz Lukas Schuler, Akshat Kumar and Pradeep Varakantham. In Proceedings of International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2023.
Avoiding Starvation of Arms in Restless Multi-Armed Bandits. Dexun Li and Pradeep Varakantham. In Proceedings of International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2023.
Strategic Planning for Flexible Agent Availability in Large Taxi Fleets. Rajiv Ranjan Kumar, Pradeep Varakantham and Shih-fen Cheng. In Proceedings of International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2023.
Imitating Opponent to Win: Adversarial Policy Imitation Learning in Two-player Competitive Games. The Viet Bui, Tien Mai, Thanh H. Nguyen. In Proceedings of International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2023.
Imitation Improvement Learning for Large-scale Capacitated Vehicle Routing Problems. The Viet Bui, Tien Mai. In Proceedings of International Conference on Automated Planning and Scheduling, ICAPS 2023.
Safe MDP Planning by Learning Temporal Patterns of Undesirable Trajectories and Averting Negative Side Effects. Siow Meng Low, Akshat Kumar and Scott Sanner. In Proceedings of International Conference on Automated Planning and Scheduling, ICAPS 2023.
Generalization through Diversity: Improving Unsupervised Environment Design. Wenjun Li, Pradeep Varakantham, Dexun Li. In Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI 2023.
Transferable Curricula through Difficulty-Conditioned Generators. Sidney Tio, Pradeep Varakantham. In Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI 2023.
On Sustainable Ride Pooling through Conditional Expected Value Decomposition. Avinandan Bose, Hao Jiang, Pradeep Varakantham, Ge Zichang. In Proceedings of European Conference on Artificial Intelligence (ECAI), 2023.
FlowPG: Action-constrained Policy Gradient with Normalizing Flows. Janaka Chathuranga Brahmanage, Jiajing Ling, Akshat Kumar. In the 37th Conference on Neural Information Processing Systems (NeurIPS), Dec 2023.
Learning to Search Feasible and Infeasible Regions of Routing Problems with Flexible Neural k-Opt. Yining MA, Zhiguang CAO, and Yeow Meng CHEE. In the 37th Conference on Neural Information Processing Systems (NeurIPS), Dec 2023.
Generative Modelling of Stochastic Actions with Arbitrary Constraints in Reinforcement Learning. Changyu Chen, Ramesha Karunasena, Thanh Hong Nguyen, Arunesh Sinha, Pradeep Varakantham. In the 37th Conference on Neural Information Processing Systems (NeurIPS), Dec 2023.

2022

Field Study in Deploying Restless Multi-Armed Bandits: Assisting Non-Profits in Improving Maternal and Child Health. Aditya Mate, Lovish Madan, Aparna Taneja, Neha Madhiwalla, Shresth Verma, Gargi Singh, Aparna Hegde, Pradeep Varakantham and Milind Tambe. In Proceedings of AAAI Conference on Artificial Intelligence (AAAI), Feb 2022.
Sample-Efficient Iterative Lower Bound Optimization of Deep Reactive Policies for Planning in Continuous MDPs. Siow Meng Low, Akshat Kumar, Scott Sanner. In Proceedings of AAAI Conference on Artificial Intelligence (AAAI), Feb 2022.
Multiscale Generative Models: Improving Performance of a Generative Model Using Feedback from Other Dependent Generative Models. C. Chen, A. Bose, S. Cheng, A. Sinha. In Proceedings of AAAI Conference on Artificial Intelligence (AAAI), Feb 2022.
Choices are not independent: Stackelberg Security Games with Nested Quantal Response Models. Tien Mai Anh and Arunesh Sinha. In Proceedings of AAAI Conference on Artificial Intelligence (AAAI), Feb 2022.
Hierarchical Value Decomposition for Effective On-Demand Ride Pooling. Jiang Hao and Pradeep Varakantham. In Proceedings of International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2022.
Joint Pricing and Matching for City-Scale Ride Pooling. Sanket Shah, Meghna Lowalekar and Pradeep Varakantham. In Proceedings of International Conference on Automated Planning and Scheduling, ICAPS 2022.
Efficient Resource Allocation with Fairness Constraints in Restless Multi-Armed Bandits. Dexun Li and Pradeep Varakantham. In Proceedings of Uncertainty in Artificial Intelligence, UAI 2022.
Undiscounted Recursive Path Choice Models: Convergence Properties and Algorithms. Mai Anh Tien and E. Frejinger. Transportation Science, 2022.
Scalable Distributional Robustness in a Class of Non-Convex Optimization with Guarantees. Bose Avinandan, Sinha Arunesh, Mai Tien. In the 36th Conference on Neural Information Processing Systems (NeurIPS), Dec 2022.