Scalable cooperative multiagent reinforcement learning in the context of an organization

Abstract

Reinforcement learning techniques have been used successfully to solve single-agent optimization problems, but many real problems involve multiple agents, that is, multi-agent systems. This explains the growing interest in multi-agent reinforcement learning (MARL) algorithms. To be applicable in large real domains, MARL algorithms need to be both stable and scalable. A scalable MARL algorithm performs adequately as the number of agents increases. A MARL algorithm is stable if all agents (eventually) converge to a stable joint policy. Unfortunately, most previous approaches lack at least one of these two crucial properties. This dissertation proposes a scalable and stable MARL framework using a network of mediator agents. The network connections restrict the space of valid policies, which reduces the search time and achieves scalability. Optimizing performance in such a system consists of optimizing two subproblems: optimizing mediators' local policies and optimizing the

Keywords

Reinforcement learning; Scalability; Computer science; Distributed computing; Context (archaeology); Multi-agent system; Artificial intelligence