My research interests span sequential decision making and fairness in machine learning.
Scopus Publications: 13
Scholar Citations: 1000
Scholar h-index: 12
Scholar i10-index: 12
Scopus Publications
Multi-armed Bandits with Generalized Temporally-Partitioned Rewards. Ronald C. van den Broek, Rik Litjens, Tobias Sagis, Nina Verbeeke, Pratik Gajane. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2024.
WeHeart: A Personalized Recommendation Device for Physical Activity Encouragement in Cardiac Rehabilitation. Rosa Van Tuijn, Tianqin Lu, Emma Driesse, Koen Franken, Pratik Gajane, et al. Frontiers in Artificial Intelligence and Applications, 2023. Abstract: We introduce WeHeart, a personalized recommendation device that aims to gradually increase physical activity levels in cardiac rehabilitation. The importance of physical activity in cardiac rehabilitation as a means of reducing associated morbidity and mortality rates is well-established. However, forming physical activity habits is a challenge, and the approach varies depending on individual preferences. Our solution employs a Random Forest classification model that combines both measured and self-reported data to provide personalized recommendations. We also propose to make use of Explainable AI to improve transparency and foster trust.
Autonomous Exploration for Navigating in MDPs Using Blackbox RL Algorithms. Pratik Gajane, Peter Auer, Ronald Ortner. IJCAI International Joint Conference on Artificial Intelligence, 2023. Abstract: We consider the problem of navigating in a Markov decision process where extrinsic rewards are either absent or ignored. In this setting, the objective is to learn policies to reach all the states that are reachable within a given number of steps (in expectation) from a starting state. We introduce a novel meta-algorithm which can use any online reinforcement learning algorithm (with appropriate regret guarantees) as a black-box. Our algorithm demonstrates a method for transforming the output of online algorithms to a batch setting. We prove an upper bound on the sample complexity of our algorithm in terms of the regret bound of the used black-box RL algorithm. Furthermore, we provide experimental results to validate the effectiveness of our algorithm and correctness of our theoretical results.
LEMON: Alternative Sampling for More Faithful Explanation Through Local Surrogate Models. Dennis Collaris, Pratik Gajane, Joost Jorritsma, Jarke J. van Wijk, Mykola Pechenizkiy. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2023. Abstract: Local surrogate learning is a popular and successful method for machine learning explanation. It uses synthetic transfer data to approximate a complex reference model. The sampling technique used for this transfer data has a significant impact on the provided explanation, but remains relatively unexplored in literature. In this work, we explore alternative sampling techniques in pursuit of more faithful and robust explanations, and present LEMON: a sampling technique that samples directly from the desired distribution instead of reweighting samples as done in other explanation techniques (e.g., LIME). Next, we evaluate our technique in a synthetic and UCI dataset-based experiment, and show that our sampling technique yields more faithful explanations compared to current state-of-the-art explainers.
The Impact of Batch Learning in Stochastic Linear Bandits. Danil Provodin, Pratik Gajane, Mykola Pechenizkiy, Maurits Kaptein. Proceedings of the IEEE International Conference on Data Mining (ICDM), 2022. Abstract: We consider a special case of bandit problems, named batched bandits, in which an agent observes batches of responses over a certain time period. Unlike previous work, we consider a more practically relevant batch-centric scenario of batch learning. That is to say, we provide a policy-agnostic regret analysis and demonstrate upper and lower bounds for the regret of a candidate policy. Our main theoretical results show that the impact of batch learning is a multiplicative factor of batch size relative to the regret of online behavior. Primarily, we study two settings of the stochastic linear bandits: bandits with finitely and infinitely many arms. While the regret bounds are the same for both settings, the former setting results hold under milder assumptions. Also, we provide a more robust result for the 2-armed bandit problem as an important insight. Finally, we demonstrate the consistency of theoretical results by conducting empirical experiments and reflect on optimal batch size choice.
Gambler bandits and the regret of being ruined. Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2021.
Variational regret bounds for reinforcement learning. 35th Conference on Uncertainty in Artificial Intelligence (UAI 2019), 2019.
Adaptively Tracking the Best Bandit Arm with an Unknown Number of Distribution Changes. Proceedings of Machine Learning Research, 2019.
Variational Regret Bounds for Reinforcement Learning. Proceedings of Machine Learning Research, 2019.
Achieving Optimal Dynamic Regret for Non-stationary Bandits without Prior Information. Proceedings of Machine Learning Research, 2019.
Corrupt Bandits for Preserving Local Privacy. Proceedings of Machine Learning Research, 2018.
A relative exponential weighing algorithm for adversarial utility-based dueling bandits. 32nd International Conference on Machine Learning (ICML 2015), 2015.
RECENT SCHOLAR PUBLICATIONS
Best-of-Both-Worlds Multi-Dueling Bandits: Unified Algorithms for Stochastic and Adversarial Preferences under Condorcet and Borda Objectives. S Akash, P Gajane, J Singh. arXiv preprint arXiv:2603.18972, 2026.
Evaluating Causal Discovery Algorithms for Path-Specific Fairness and Utility in Healthcare. N Nagesh, E Khatibi, T Hughes, M Bagheri, P Gajane, AM Rahmani. arXiv preprint arXiv:2603.15926, 2026.
Multi-Armed Bandits with Generalized Temporally-Partitioned Rewards. RC van den Broek, R Litjens, T Sagis, L Siecker, N Verbeeke, P Gajane. 22nd Symposium on Intelligent Data Analysis (IDA), 2024.
Investigating Gender Fairness in Machine Learning-driven Personalized Care for Chronic Pain. P Gajane, S Newman, JD Piette. https://arxiv.org/abs/2402.19226 , 2024. Citations: 4
Provably Efficient Exploration in Constrained Reinforcement Learning: Posterior Sampling Is All You Need. D Provodin, P Gajane, M Pechenizkiy, M Kaptein. arXiv preprint arXiv:2309.15737, 2023.
University Teaching Qualification Basiskwalificatie Onderwijs (BKO) Teaching Portfolio. P Gajane. 2023.
Autonomous Exploration for Navigating in MDPs Using Blackbox RL Algorithms. P Gajane, P Auer, R Ortner. IJCAI, 3714-3722, 2023. Citations: 1
WeHeart: A Personalized Recommendation Device for Physical Activity Encouragement and Preventing “Cold Start” in Cardiac Rehabilitation. PGEB Rosa van Tuijn, Tianqin Lu, Emma Driesse, Koen Franken. Human-Computer Interaction – INTERACT 2023. 14144 (Lecture Notes in Computer …, 2023. Citations: 3
LEMON: Alternative sampling for more faithful explanation through local surrogate models. D Collaris, P Gajane, J Jorritsma, JJ van Wijk, M Pechenizkiy. International Symposium on Intelligent Data Analysis, 77-90, 2023. Citations: 13
Multi-Armed Bandits with Generalized Temporally-Partitioned Rewards. RC van den Broek, R Litjens, T Sagis, L Siecker, N Verbeeke, P Gajane. 16th European Workshop on Reinforcement Learning (EWRL), arXiv: 2303.00620, 2023. Citations: 1
Curiosity-driven Exploration in Sparse-reward Multi-agent Reinforcement Learning. J Li, P Gajane. 16th European Workshop on Reinforcement Learning (EWRL), 2023. Citations: 15
Local Differential Privacy for Sequential Decision Making in a Changing Environment. P Gajane. Fourth AAAI Workshop on Privacy-Preserving Artificial Intelligence, 2023. Citations: 1
Industrializing Deep Reinforcement Learning for ASML’s Service Network. JFJ van der Haar, IRJIR Basten, WW van Jaarsveld, PP Gajane, ... Master’s thesis, Eindhoven University of Technology, Eindhoven, The Netherlands, 2023. Citations: 1
The impact of batch learning in stochastic linear bandits. D Provodin, P Gajane, M Pechenizkiy, M Kaptein. 2022 IEEE International Conference on Data Mining (ICDM), 1149-1154, 2022. Citations: 6
Generalizing distribution of partial rewards for multi-armed bandits with temporally-partitioned rewards. RC Broek, R Litjens, T Sagis, L Siecker, N Verbeeke, P Gajane. arXiv preprint arXiv:2211.06883, 2022.
An Empirical Evaluation of Posterior Sampling for Constrained Reinforcement Learning. D Provodin, P Gajane, M Pechenizkiy, M Kaptein. Reinforcement Learning for Real Life Workshop, 2022.
Survey on fair reinforcement learning: Theory and practice. P Gajane, A Saxena, M Tavakol, G Fletcher, M Pechenizkiy. arXiv preprint arXiv:2205.10032, 2022. Citations: 38
The impact of batch learning in stochastic bandits. D Provodin, P Gajane, M Pechenizkiy, M Kaptein. Workshop on Ecological Theory of Reinforcement Learning, 2021. Citations: 2
Gambler bandits and the regret of being ruined. FS Perotto, S Vakili, P Gajane, Y Faghan, M Bourgais. 20th International Conference on Autonomous Agents and Multiagent Systems …, 2021. Citations: 7
MOST CITED SCHOLAR PUBLICATIONS
On formalizing fairness in prediction with machine learning. P Gajane, M Pechenizkiy. the 5th Workshop on Fairness, Accountability, and Transparency in Machine …, 2018. Citations: 350
Adaptively tracking the best bandit arm with an unknown number of distribution changes. P Auer, P Gajane, R Ortner. Conference on learning theory, 138-158, 2019. Citations: 185
Variational regret bounds for reinforcement learning. R Ortner, P Gajane, P Auer. Uncertainty in Artificial Intelligence, 81-90, 2020. Citations: 80
A sliding-window algorithm for Markov decision processes with arbitrarily changing rewards and transitions. P Gajane, R Ortner, P Auer. Lifelong Learning: A Reinforcement Learning Approach Workshop at FAIM, 2018. Citations: 69
A relative exponential weighing algorithm for adversarial utility-based dueling bandits. P Gajane, T Urvoy, F Clérot. International Conference on Machine Learning, 218-227, 2015. Citations: 64
Corrupt bandits for preserving local privacy. P Gajane, T Urvoy, E Kaufmann. Algorithmic Learning Theory, 387-412, 2018. Citations: 47
Achieving optimal dynamic regret for non-stationary bandits without prior information. P Auer, Y Chen, P Gajane, CW Lee, H Luo, R Ortner, CY Wei. Conference on Learning Theory, 159-163, 2019. Citations: 39
Survey on fair reinforcement learning: Theory and practice. P Gajane, A Saxena, M Tavakol, G Fletcher, M Pechenizkiy. arXiv preprint arXiv:2205.10032, 2022. Citations: 38
Adaptively tracking the best arm with an unknown number of distribution changes. P Auer, P Gajane, R Ortner. European Workshop on Reinforcement Learning 14, 375, 2018. Citations: 34
Corrupt bandits. P Gajane, T Urvoy, E Kaufmann. EWRL, 2016. Citations: 17
Curiosity-driven Exploration in Sparse-reward Multi-agent Reinforcement Learning. J Li, P Gajane. 16th European Workshop on Reinforcement Learning (EWRL), 2023. Citations: 15
LEMON: Alternative sampling for more faithful explanation through local surrogate models. D Collaris, P Gajane, J Jorritsma, JJ van Wijk, M Pechenizkiy. International Symposium on Intelligent Data Analysis, 77-90, 2023. Citations: 13
Utility-based dueling bandits as a partial monitoring game. P Gajane, T Urvoy. In the 12th European Workshop on Reinforcement Learning (EWRL), 2015. Citations: 8
Gambler bandits and the regret of being ruined. FS Perotto, S Vakili, P Gajane, Y Faghan, M Bourgais. 20th International Conference on Autonomous Agents and Multiagent Systems …, 2021. Citations: 7
The impact of batch learning in stochastic linear bandits. D Provodin, P Gajane, M Pechenizkiy, M Kaptein. 2022 IEEE International Conference on Data Mining (ICDM), 1149-1154, 2022. Citations: 6
Investigating Gender Fairness in Machine Learning-driven Personalized Care for Chronic Pain. P Gajane, S Newman, JD Piette. https://arxiv.org/abs/2402.19226 , 2024. Citations: 4
Autonomous exploration for navigating in non-stationary CMPs. P Gajane, R Ortner, P Auer, C Szepesvari. arXiv preprint arXiv:1910.08446, 2019. Citations: 4
Counterfactual learning for machine translation: Degeneracies and solutions. C Lawrence, P Gajane, S Riezler. arXiv preprint arXiv:1711.08621, 2017. Citations: 4
WeHeart: A Personalized Recommendation Device for Physical Activity Encouragement and Preventing “Cold Start” in Cardiac Rehabilitation. PGEB Rosa van Tuijn, Tianqin Lu, Emma Driesse, Koen Franken. Human-Computer Interaction – INTERACT 2023. 14144 (Lecture Notes in Computer …, 2023. Citations: 3
Corrupt bandits for privacy preserving input. P Gajane, T Urvoy, E Kaufmann. arXiv preprint arXiv:1708.05033, 2017. Citations: 3