Q-Learning in Repeated Price Competition: Analyzing Convergence & Equilibrium Outcomes

Author: 
Matthew Byung Hyun Nam
Adviser(s): 
Philipp Strack
Abstract: 

In recent years, autonomous pricing algorithms have seen growing adoption across markets, from the hospitality industry to digital e-commerce, raising concerns about whether reinforcement learning-based algorithms can learn to collude. I analyze Q-learning, a model-free reinforcement learning approach, in repeated games of Bertrand oligopoly competition and assess its ability to learn collusive behavior. Building on previous work, I study the convergence properties and equilibrium behavior that result from particular Q-learning parametrizations. I also present new findings on how alternative Q-learning specifications behave in single-agent and multi-agent environments, testing how theoretical requirements for Q-learning convergence, derived in the stationary setting, perform in the more complex, nonstationary settings that model real-world economic environments. While previous work has shown that multiple independent Q-learners converge to strategies resembling reward-punishment schemes, and can thus be argued to learn to collude, my analysis finds that Q-learning convergence depends heavily on agents' parametrizations and that convergence arises mainly from a failure to learn rather than a strong ability to learn. I also find that Q-learning agents satisfying the theoretical requirements for convergence tend to perform poorly as pricing algorithms in practice, failing to converge and learn equilibrium play within realistic time frames in both single-agent and multi-agent environments; this is particularly true for more complex games and multi-agent settings. Overall, my work outlines the limitations Q-learners face in converging to equilibrium play in games of price competition. Because current legislation does not comprehensively address tacit collusion by autonomous pricing algorithms, these results have implications for competition policy and the development of effective regulation in the context of algorithmic tacit collusion.
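The setting described above can be illustrated with a minimal sketch: two independent tabular Q-learners repeatedly set prices on a discrete grid in a Bertrand game, each treating the other as part of its environment, which makes the learning problem nonstationary from either agent's perspective. All specifics below (the price grid, the sharp Bertrand demand, the learning parameters) are hypothetical choices for exposition, not the thesis's actual parametrization.

```python
import random

# Hypothetical illustration parameters (not the thesis's specification).
PRICES = [1.0, 1.5, 2.0]     # discrete price grid
COST = 1.0                   # common marginal cost
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1  # learning rate, discount, exploration

def profit(p_own, p_rival):
    """Sharp Bertrand demand: the lower price takes the market; ties split it."""
    if p_own < p_rival:
        return p_own - COST
    if p_own == p_rival:
        return (p_own - COST) / 2
    return 0.0

def epsilon_greedy(q_row, eps):
    """Explore with probability eps, otherwise pick the greedy action."""
    if random.random() < eps:
        return random.randrange(len(PRICES))
    return max(range(len(PRICES)), key=lambda a: q_row[a])

def train(episodes=5000, seed=0):
    random.seed(seed)
    # State: the pair of last-period price indices (memory-one strategies).
    q = {}  # q[(agent, state)] -> list of action values over PRICES

    def row(agent, state):
        return q.setdefault((agent, state), [0.0] * len(PRICES))

    state = (0, 0)
    for _ in range(episodes):
        a0 = epsilon_greedy(row(0, state), EPS)
        a1 = epsilon_greedy(row(1, state), EPS)
        r0 = profit(PRICES[a0], PRICES[a1])
        r1 = profit(PRICES[a1], PRICES[a0])
        next_state = (a0, a1)
        # Standard Q-learning update for each agent; because the rival is
        # also learning, each agent faces a nonstationary environment.
        for agent, a, r in ((0, a0, r0), (1, a1, r1)):
            best_next = max(row(agent, next_state))
            row(agent, state)[a] += ALPHA * (r + GAMMA * best_next - row(agent, state)[a])
        state = next_state
    return q, state

q_tables, final_state = train()
```

The sketch shows why convergence guarantees from the single-agent, stationary theory need not carry over: each agent's update targets a value function defined against a rival policy that is itself changing every period.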

Term: 
Spring 2023