Setting Population Payoff Via Transition Function in Stochastic Game

Biheng Zhou, Jing Zhang, Chuang Deng, Zhihai Rong

2024 4th International Conference on Control Theory and Applications (ICoCTA) (2024)

College of Information Science and Technology | Shanghai Aerospace Electronic Technology Institute

Abstract

Stochastic games are a key framework for studying strategic interaction under environmental uncertainty, where payoffs change as the environment varies. Understanding them requires exploring how payoffs relate to the environmental transition function. This paper explores a method for setting a population's long-term payoff by designing the transition function in a stochastic game with two environments. Based on zero-determinant theory, we analyze the feasible region for the equalizer property of the transition function, which can set the opponent's payoff by switching between a favorable and an unfavorable environment. Evolutionary dynamics in finite populations with Fermi dynamics validate that the population's payoff is pinned at an expected value. Under the established transition function, the population's payoff is unaffected by the selection intensity, although the selection intensity can alter the frequency of cooperation in the population. Our results extend zero-determinant theory to stochastic games and further reveal that payoff setting can anchor the population's average payoff at a theoretical value independently of selection intensity. This deepens our understanding of payoff control in stochastic games and offers a new perspective on the evolution of population payoff.
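To make the equalizer property concrete, the sketch below uses the classic repeated prisoner's dilemma from zero-determinant theory, not the two-environment transition-function design studied in the paper. The payoff values (R, S, T, P) = (3, 0, 5, 1), the function names, and the `scale` parameter are all assumptions chosen for illustration. It builds a memory-one strategy for player X and checks numerically that player Y's long-run payoff stays at the chosen target whatever memory-one strategy Y plays.

```python
import numpy as np

# Assumed, conventional prisoner's dilemma payoffs (not taken from the paper).
R, S, T, P = 3.0, 0.0, 5.0, 1.0

def equalizer_strategy(target, scale):
    """Hypothetical helper: X's cooperation probabilities after the outcomes
    CC, CD, DC, DD (X's move first) that pin Y's payoff at `target`."""
    y_payoffs = np.array([R, T, S, P])  # Y's payoff in each outcome
    # Zero-determinant condition: p - (1, 1, 0, 0) = -scale * (S_Y - target).
    p = np.array([1.0, 1.0, 0.0, 0.0]) - scale * (y_payoffs - target)
    if np.any(p < 0.0) or np.any(p > 1.0):
        raise ValueError("target/scale do not give valid probabilities")
    return p

def long_run_payoff_of_y(p, q):
    """Y's average payoff when X plays p and Y plays the memory-one strategy q.
    q is indexed from Y's own viewpoint (Y's move first)."""
    q_seen = np.array([q[0], q[2], q[1], q[3]])  # Y sees CD and DC swapped
    # Row i holds the probabilities of moving from outcome i to CC, CD, DC, DD.
    M = np.column_stack([p * q_seen, p * (1 - q_seen),
                         (1 - p) * q_seen, (1 - p) * (1 - q_seen)])
    # Stationary distribution: v M = v together with sum(v) = 1.
    A = np.vstack([M.T - np.eye(4), np.ones(4)])
    b = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
    v, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(v @ np.array([R, T, S, P]))

p = equalizer_strategy(target=2.0, scale=0.2)
rng = np.random.default_rng(0)
for _ in range(3):
    q = rng.uniform(0.05, 0.95, size=4)  # arbitrary opponent strategy
    print(q.round(2), long_run_payoff_of_y(p, q))
```

The paper's contribution, as the abstract describes it, is to obtain this kind of payoff pinning through the design of the environmental transition function rather than through one player's action strategy; the sketch only illustrates what "equalizer" means.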

Key words

Prisoner’s dilemma game, stochastic game, evolutionary dynamics

Chat Paper

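[Key points]: This paper proposes a method for setting a population's long-term payoff in a stochastic game by designing the transition function. It extends zero-determinant theory and shows that the population's average payoff can be anchored at a theoretical value independently of selection intensity.

[Method]: Based on zero-determinant theory, the paper analyzes the feasible region for the equalizer property of the transition function, which sets the opponent's payoff by switching between favorable and unfavorable environments.

[Experiments]: Validation in the evolutionary dynamics of finite populations with Fermi dynamics shows that the population's payoff is pinned at the expected value. Under the established transition function, the population's payoff is unaffected by selection intensity, but selection intensity can change the frequency of cooperation in the population. The dataset used in the experiments is not explicitly mentioned in the paper.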
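For readers unfamiliar with Fermi dynamics, the following is a minimal sketch of the standard Fermi imitation rule, in which a focal player copies a randomly chosen model player with a probability that depends on their payoff difference and on the selection intensity. The function and parameter names are assumptions for illustration; the paper's exact update scheme and parameter values are not given on this page.

```python
import math

def fermi_imitation_probability(payoff_self, payoff_model, beta):
    """Probability that the focal player adopts the model player's strategy.

    beta is the selection intensity: beta = 0 gives a coin flip (neutral
    drift), and larger beta makes imitation of the better-paid player
    increasingly certain. For very large beta * difference, math.exp can
    overflow; the moderate values used here are safe.
    """
    return 1.0 / (1.0 + math.exp(-beta * (payoff_model - payoff_self)))

# The same payoff gap under increasing selection intensity.
for beta in (0.0, 0.1, 1.0, 10.0):
    print(beta, fermi_imitation_probability(1.0, 2.0, beta))
```

When every strategy earns the same long-run payoff, the payoff difference is zero and this probability equals 1/2 for any selection intensity.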