Showing posts with label game-theory. Show all posts
Wednesday, 3 March 2010
Signalling and Signalling Games
Just did a bit of reading on the class of Game Theory games termed 'Signalling Games' and the Economics and Evolutionary Biology theories of 'Signalling'.
Tuesday, 2 March 2010
68-69, Coloured Trails (CT) game
Looked through a couple of papers describing and running experiments using the Coloured Trails (CT) game for multi-agent resource negotiation:
68, The Influence of Social Dependencies on Decision-Making: Initial Investigations with a New Game, by Barbara J. Grosz, Sarit Kraus & Shavit Talman, 2004.
69, The Effects of Goal Revelation on Computer-Mediated Negotiation, by Ya'akov Gal et al., 2009.
Implementation referenced in [69] can be found here:
http://www.eecs.harvard.edu/ai/ct
Thursday, 8 May 2008
43.2, MAS: Rational Decision Making and Negotiation
Snippets taken from slides prepared and used by Ulle Endriss to teach a "Multiagent Systems: Rational Decision Making and Negotiation" course at Imperial College London in 2005
Game Theory: Given the rules of the "game" (the negotiation mechanism, the protocol), what strategy should a rational agent adopt?
Dominant Strategies: A strategy is called dominant iff, independently of what any of the other agents do, following that strategy will result in a larger payoff than any other strategy.
Nash Equilibria: A Nash equilibrium is a set of strategies, one for each agent, such that no agent could improve its payoff by unilaterally deviating from their assigned strategy.
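The two solution concepts above can be checked mechanically for small games. A minimal sketch (my own illustration, not from the slides) that enumerates the pure-strategy Nash equilibria of a two-player game given as a pair of payoff matrices:

```python
def pure_nash_equilibria(payoff_a, payoff_b):
    """Return (row, col) pairs where neither player gains by deviating alone."""
    equilibria = []
    rows, cols = len(payoff_a), len(payoff_a[0])
    for r in range(rows):
        for c in range(cols):
            # Row player cannot improve by switching rows, column player by switching columns.
            best_row = all(payoff_a[r][c] >= payoff_a[r2][c] for r2 in range(rows))
            best_col = all(payoff_b[r][c] >= payoff_b[r][c2] for c2 in range(cols))
            if best_row and best_col:
                equilibria.append((r, c))
    return equilibria

# Prisoner's dilemma payoffs (index 0 = cooperate, 1 = defect).
A = [[3, 0], [5, 1]]  # row player's payoffs
B = [[3, 5], [0, 1]]  # column player's payoffs
print(pure_nash_equilibria(A, B))  # [(1, 1)]: mutual defection is the only equilibrium
```

Note that for the prisoner's dilemma the equilibrium coincides with the dominant strategy (defect), but in general a game can have Nash equilibria without any dominant strategy.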
Monday, 26 November 2007
Intelligent Design
Source: The Economist, October 20th 2007
Section: Economics Focus
Title: Intelligent Design
Subtitle: A theory of an intelligently guided invisible hand wins the Nobel prize
... despite its dreary name, mechanism design is a hugely important area of economics, and underpins much of what dismal scientists do today. It goes to the heart of one of the biggest challenges in economics: how to arrange our economic interactions so that, when everyone behaves in a self-interested manner, the result is something we all like. The word "mechanism" refers to the institutions and the rules of the game that govern our economic activities...
Mechanism-design theory aims to give the invisible hand a helping hand, in particular by focusing on how to minimise the economic cost of "asymmetric information" - the problem of dealing with someone who knows more than you do...
His [Mr Hurwicz's] big idea was "incentive compatibility". The way to get as close as possible to the most efficient outcome is to design mechanisms in which everybody does best for themselves by sharing truthfully whatever private information they have that is asked for...
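The canonical illustration of incentive compatibility (my own addition, not from the article) is the Vickrey second-price auction: because the winner pays the second-highest bid rather than their own, reporting your true valuation is a dominant strategy. A quick sketch:

```python
def vickrey_outcome(bids):
    """Return (winner index, price paid) for a sealed-bid second-price auction."""
    ranked = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    winner = ranked[0]
    price = bids[ranked[1]]  # winner pays the second-highest bid
    return winner, price

def utility(valuation, my_bid, other_bids):
    """Utility of bidder 0 given their private valuation and everyone's bids."""
    winner, price = vickrey_outcome([my_bid] + other_bids)
    return valuation - price if winner == 0 else 0

# A bidder who values the item at 10 does no better by shading their bid:
truthful = utility(10, 10, [8, 6])  # wins at price 8, utility 2
shaded   = utility(10, 7,  [8, 6])  # loses the auction, utility 0
print(truthful, shaded)
```

Shading below the second-highest bid forfeits a profitable win, and inflating above one's valuation only risks winning at a loss, which is exactly the "everybody does best by sharing truthfully" property the article describes.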
Saturday, 13 October 2007
Breaking through the Kyoto impasse
Below is a selection of passages from an article found in the September 29th 2007 issue of 'The Economist', followed by some thoughts.
Section: Economics focus
Title: Playing games with the planet
Subtitle: A version of the "prisoner's dilemma" may suggest ways to break through the Kyoto impasse
"... all countries will enjoy the benefits of a stable climate whether they have helped to bring it about or not. So a government that can persuade others to cut their greenhouse-gas emissions without doing so itself gets the best of both worlds: it avoids all the expense and self-denial involved, and yet still escapes catastrophe...
The problem, of course, is that if everyone is counting on others to act, no one will, and the consequences could be much worse than if everyone had simply done their bit to begin with. Game theorists call a simplified version of this scenario the 'prisoner's dilemma'...
Pessimistic souls assume that the international response to climate change will go the way of the prisoner's dilemma. Rational leaders will always neglect the problem, on the grounds that others will either solve it, allowing their country to become a free-rider, or let it fester, making it a doomed cause anyway. So the world is condemned to a slow roasting, even though global warming could be averted if everyone co-operated.
Yet in a recent paper, Michael Liebreich, of New Energy Finance, a research firm, draws on game theory to reach the opposite conclusion. The dynamics of the prisoner's dilemma, he points out, change dramatically if participants know that they will be playing the game more than once. In that case, they have an incentive to co-operate, in order to avoid being punished for their misconduct by their opponent in subsequent rounds.
The paper cites a study on the subject by an American academic, Robert Axelrod, which argues that the most successful strategy when the game is repeated has three elements: first, players should start out by co-operating; second, they should deter betrayals by punishing the transgressor in the next round; and third, they should not bear grudges but instead should start co-operating with treacherous players again after meting out the appropriate punishment. The result of this strategy can be sustained co-operation rather than a cycle of recrimination.
Mr Liebreich believes that all this holds lessons for the world's climate negotiators. Treaties on climate change, after all, are not one-offs. Indeed, the United Nations is even now trying to get its members to negotiate a successor to its existing treaty, the Kyoto Protocol, which expires in 2012. Many fear that the effort will collapse unless the laggards can be persuaded to join in. But the paper argues that rational countries will not be deterred by free-riders. They will continue to curb their emissions, while devising sanctions for those who do not.
Under lock and Kyoto
The Kyoto Protocol already embodies some of these elements. Countries that do not meet their commitments, for example, are supposed to be punished with a requirement to cut their emissions more sharply the next time around. But Mr Liebreich argues that there should also be sanctions for rich countries that refuse to participate, and stronger incentives for poor countries (which are exempted from any mandatory cuts) to join in...
The global regime on climate change, Mr Liebreich believes, should also be revised more frequently, to allow the game to play itself out more quickly. So instead of stipulating big reductions in emissions, to be implemented over five years, as in Kyoto, negotiators might consider adopting annual targets. That way, co-operative governments know that they cannot be taken advantage of for long, whereas free-riders can be punished and penitents brought back into the fold more quickly.
There are flaws in the analogy, of course. In the real world, governments can communicate and form alliances, which makes the dynamics of the game much more complicated. And governments may not act consistently or rationally... most countries' willingness to act is presumably linked to the severity of global warming's ill effects. If things get bad enough, then with any luck everyone will play the game."
Why is it important to me and my research? It makes for an interesting agent problem: a scenario of cyclic dependency, where each agent requires something from another, there is a long-term benefit to mutual collaboration, but collaborating requires short-term loss and carries the risk of deceit.
Friday, 23 March 2007
12, The Evolution of Cooperation
Notes taken from ‘The Evolution of Co-operation’, by Robert Axelrod (1984)
The Prisoner’s Dilemma – a two-player game of choice; the choice to cooperate or defect at each move, with individual player payoffs depending on what both players (independently) choose, as follows:
(R’s move, C’s move, R’s payoff, C’s payoff)
Cooperate, Cooperate, R=3, R=3 (reward for mutual cooperation)
Cooperate, Defect, S=0, T=5 (sucker’s payoff and temptation to defect)
Defect, Cooperate, T=5, S=0 (temptation to defect and sucker’s payoff)
Defect, Defect, P=1, P=1 (punishment for mutual defection)
Strategy (or decision rule): A specification of what to do in any situation that might arise.
TIT FOR TAT, the strategy of starting with co-operation, and thereafter doing what the other player did on the previous move.
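The iterated game is easy to sketch in code. Below is a minimal simulation (my own illustration, using the payoff values T=5, R=3, P=1, S=0 from the table above) pitting TIT FOR TAT against itself and against ALL D:

```python
# Payoffs for (my move, their move): 'C' = cooperate, 'D' = defect.
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tit_for_tat(history):
    """Cooperate first, then copy the other player's previous move."""
    return 'C' if not history else history[-1][1]

def all_d(history):
    """Always defect."""
    return 'D'

def play(strategy_a, strategy_b, rounds):
    history_a, history_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a = strategy_a(history_a)
        move_b = strategy_b(history_b)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        # Each player's history records (own move, opponent's move).
        history_a.append((move_a, move_b))
        history_b.append((move_b, move_a))
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat, 10))  # (30, 30): sustained mutual cooperation
print(play(tit_for_tat, all_d, 10))        # (9, 14): one sucker's payoff, then mutual defection
```

Note that TIT FOR TAT never beats its opponent in a single pairing; its strength in Axelrod's tournaments came from accumulating high scores across many pairings.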
w is the ‘weight’ (or importance) of the next move relative to the current move. It is a ‘discount parameter’ that represents the degree to which the payoff of each move is discounted relative to the previous move.
1 + w + (w^2) + (w^3)… The sum of this infinite series for any w greater than zero and less than one is simply 1/(1-w).
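The closed form is easy to verify numerically, e.g. for w = 0.9:

```python
# Truncated discounted sum 1 + w + w^2 + ... versus the closed form 1/(1-w).
w = 0.9
partial = sum(w ** k for k in range(1000))
print(partial, 1 / (1 - w))  # both approximately 10
```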
(Proposition 1) If the discount parameter, w, is sufficiently high, there is no best strategy independent of the strategy used by the other player.
A strategy is ‘collectively stable’ if no strategy can invade it.
(Proposition 2) TIT FOR TAT is collectively stable iff w is large enough; the critical value of w is a function of the four payoff parameters T, R, P and S.
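If I recall the book's statement correctly, the threshold is w ≥ max((T−R)/(T−P), (T−R)/(R−S)): the first term deters alternating defection, the second deters pure defection. With the payoffs from the table above, it works out as follows:

```python
# Critical discount parameter for TIT FOR TAT's collective stability,
# per (my reading of) Axelrod's Proposition 2.
T, R, P, S = 5, 3, 1, 0
critical_w = max((T - R) / (T - P), (T - R) / (R - S))
print(critical_w)  # 2/3 for these payoffs
```

So with these payoffs the future must be worth at least two-thirds as much as the present for TIT FOR TAT to be stable.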
(Proposition 3) Any strategy which may be the first to cooperate can be collectively stable only when w is sufficiently large.
A ‘nice’ strategy is one, such as TIT FOR TAT, which will never be the first to defect.
(Proposition 4) For a ‘nice’ strategy to be collectively stable, it must be ‘provoked’ by the first defection of the other player.
(Proposition 5) ALL D (i.e. always defect) is always collectively stable.
A strategy is ‘maximally discriminating’ if it will eventually cooperate even if the other has never cooperated yet, and once it cooperates will never cooperate again with ALL D but will always cooperate with another player using the same strategy as it uses.
p is the proportion of interactions by someone using the new strategy with another individual using the new strategy.
(Proposition 6) The strategies which can invade ALL D in a cluster with the smallest value of p are those which are maximally discriminating, such as TIT FOR TAT.
(Proposition 7) If a nice strategy cannot be invaded by a single individual, it cannot be invaded by any cluster of individuals.
How to do well in a durable iterated Prisoner’s Dilemma:
1. Don’t be envious.
2. Don’t be the first to defect.
3. Reciprocate both cooperation and defection. (Extracting more than one defection for each defection of the other risks escalation. On the other hand, extracting less than one-for-one risks exploitation.)
4. Don’t be too clever.
How to promote cooperation:
1. Enlarge the shadow of the future.
2. Change the payoffs.
3. Teach people to care about each other.
4. Teach reciprocity.
5. Improve recognition abilities.
Four factors which can give rise to interesting types of social structure:
- Labels, a fixed characteristic of a player, such as sex or skin colour, which can be observed by the other player.
- Reputation, malleable and comes into being when another player has information about the strategy that the first one has employed with other players.
- Regulation, a relationship between a government and the governed… Gives rise to the problems of just how stringent the rules and enforcement procedures should be.
- Territoriality, occurs when players interact with their neighbours rather than with just anyone.
A new strategy is introduced into one of the neighbourhoods of a population where everyone else is using a native strategy. The new strategy territorially invades the neighbourhood if every location in the territory will eventually convert to the new strategy.
A native strategy is territorially stable if no strategy can territorially invade it.
(Proposition 8) If a rule is collectively stable, it is territorially stable.
TIT FOR TAT’s robust success is due to being nice, provocable, forgiving and clear:
- Nice – it is never the first to defect, preventing it from getting into unnecessary trouble.
- Retaliation – discourages the other side from persisting whenever defection is tried.
- Forgiveness – helps restore mutual cooperation.
- Clarity – makes its behavioural pattern easy to recognise, and once recognised, it is easy to perceive that the best way of dealing with TIT FOR TAT is to cooperate with it.
Labels:
computing,
game-theory,
multiagent systems,
philosophy,
trust
Thursday, 1 February 2007
6.2, Approaches To Automated Negotiation
Notes taken from Argumentation-Based Negotiation (ABN) (2003), by Iyad Rahwan et al.
1, Game-theoretic Approaches to Negotiation
Game theory is a branch of economics that studies the strategic interactions between self-interested economic agents (and recently, self-interested computational agents).
In game-theoretic analysis, researchers usually attempt to determine the optimal strategy by analysing the interaction as a game between identical participants, seeking its equilibrium…
However, classical game-theoretic approaches have some significant limitations from the computational perspective. Specifically, most of these approaches assume that agents have unbounded computational resources and that the space of outcomes is completely known…
2, Heuristic-based Approaches to Negotiation
To address some of the aforementioned limitations of game-theoretic approaches, a number of heuristics have emerged. Heuristics are rules of thumb that produce good enough (rather than optimal) outcomes and are often produced in contexts with more relaxed assumptions about agents’ rationality and resources…
Disadvantages: Firstly, the models often lead to outcomes that are sub-optimal, because they adopt an approximate notion of rationality and do not examine the full space of possible outcomes. Secondly, it is very difficult to predict precisely how the system and its constituent agents will behave…
3, Argumentation-based Approaches to Negotiation
Limitations of conventional negotiation approaches:
- Agents are not allowed to exchange any additional information other than what is expressed in the proposal itself. This can be problematic, for example, in situations where agents have limited information about the environment, or where their rational choices depend on those of other agents.
- Agents’ utilities or preferences are usually assumed to be completely characterised prior to the interaction. Thus an agent is assumed to have a mechanism by which it can assess and compare any two proposals. This is not always the case…
- Agents’ preferences over proposals are often assumed to be proper, in the sense that they reflect the true benefit the agent receives from satisfying these preferences.
- Agents’ utilities or preferences are assumed to be fixed. One agent cannot directly influence another agent’s preference model, or any of its internal mental attitudes (e.g. beliefs, desires, goals, etc.) that generate its preference model…
Argumentation-based approaches to negotiation attempt to overcome the above limitations by allowing agents to exchange additional information, or to “argue” about their beliefs and other mental attitudes during the negotiation process.
In the context of negotiation, we view an argument as a piece of information that may allow an agent to (a) justify its negotiation stance; or (b) influence another agent’s negotiation stance.
Thus, in addition to accepting or rejecting a proposal, an agent can offer a critique of it. This can help make negotiations more efficient. By understanding why its counterpart cannot accept a particular deal, an agent may be in a better position to make an alternative offer that has a higher chance of being acceptable…
Another type of information that can be exchanged is a justification of a proposal, stating why an agent made such a proposal or why the counterpart should accept it. This may make it possible to change the other agent’s region of acceptability or the nature of the negotiation space itself (by introducing new attributes/dimensions to the negotiation object)…
An agent might also make a threat or promise a reward in order to exert some pressure on its counterpart to accept a proposal…
4, Summary
There is no universal approach to automated negotiation that suits every problem domain. Rather, there is a set of approaches, each based on different assumptions about the environment and the agents involved in the interaction…
ABN frameworks are gaining popularity for their potential to overcome the limitations of more conventional approaches to automated negotiation. However, such models are typically more complex than their game-theoretic and heuristic counterparts.
Labels:
argumentation,
computing,
game-theory,
multiagent systems,
negotiation
Tuesday, 23 January 2007
5, Automated Negotiation
Notes taken from ‘Automated Negotiation: Prospects, Methods and Challenges’ (2001), by N. R. Jennings et al.
“… This paper is not meant as a survey of the field of automated negotiation. Rather, the descriptions and assessments of the various approaches are generally undertaken with particular reference to work in which the authors have been involved…”
The major contributions of this paper are to:
- Examine the space of negotiation opportunities for autonomous agents;
- Identify and evaluate some of the key techniques;
- Highlight some of the major challenges for future automated negotiation research;
- Lay the foundations for building flexible (persuasive) negotiators;
- Argue that automated negotiation is a central concern for multi-agent systems research;
- Develop a generic framework for classifying and viewing automated negotiations, and then use it to discuss and analyse the three main approaches that have been adopted to automated negotiation.
1, Introduction
Agent interactions can vary from simple information interchanges, to requests for particular actions to be performed and on to cooperation and coordination. However, perhaps the most fundamental and powerful mechanism for managing inter-agent dependencies at run-time is negotiation – the process by which a group of agents come to a mutually acceptable agreement on some matter…
Automated negotiation research can be considered to deal with three broad topics:
- Negotiation Protocols: the set of rules that govern the interaction…
- Negotiation Objects: the range of issues over which agreement must be reached…
- Agents’ Decision Making Models: the decision making apparatus the participants employ to act in line with the negotiation protocol in order to achieve their objectives…
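The three components can be sketched as a minimal data model. All names here (`Proposal`, `DecisionModel`, the dict-based issue representation) are hypothetical illustrations, not an API from the paper:

```python
from dataclasses import dataclass

@dataclass
class Proposal:
    """A negotiation object: the issues under negotiation and the
    values proposed for them, e.g. {"price": 100.0, "delivery_days": 7}."""
    sender: str
    issues: dict

class DecisionModel:
    """A deliberately simple decision-making apparatus: accept a
    proposal only if every issue meets this agent's reservation value."""
    def __init__(self, reservation: dict):
        self.reservation = reservation  # worst acceptable value per issue

    def acceptable(self, proposal: Proposal) -> bool:
        return all(value <= self.reservation.get(issue, float("inf"))
                   for issue, value in proposal.issues.items())

# A negotiation protocol would then govern which agent may send which
# Proposal, and when.
buyer = DecisionModel(reservation={"price": 90.0, "delivery_days": 10})
```

In this sketch the protocol, object and decision model stay cleanly separated, which is exactly the decomposition the three bullet points describe.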
2, A generic framework for automated negotiation
Negotiation can be viewed as a distributed search through a space of potential agreements… For a given negotiation, the participants are the active components that determine the direction of the search…
… The minimum requirement of a negotiating agent is the ability to make and respond to proposals… However, this can be very time-consuming and inefficient since the proposer has no means of ascertaining why the proposal is unacceptable, nor whether the agents are close to an agreement, nor in which dimension/direction of the agreement space it should move next…
… The recipient needs to be able to provide more useful feedback on the proposals it receives, in the form of a critique (comments on which parts of the proposal the agent likes or dislikes) or a counter-proposal (an alternative proposal generated in response to a proposal)…
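A critique can be as simple as naming the issues the recipient dislikes, which already tells the proposer which dimensions of the agreement space to move in. A minimal sketch, assuming a dict-based issue representation chosen purely for illustration:

```python
def critique(proposal: dict, reservation: dict) -> list:
    """Return the issues whose proposed values exceed the recipient's
    reservation values; an empty list means no objections."""
    return [issue for issue, value in proposal.items()
            if value > reservation.get(issue, float("inf"))]

def counter_proposal(proposal: dict, reservation: dict) -> dict:
    """Replace each disliked value with the recipient's reservation
    value, leaving acceptable values untouched."""
    return {issue: min(value, reservation.get(issue, value))
            for issue, value in proposal.items()}

offer = {"price": 120.0, "delivery_days": 5}
limits = {"price": 90.0, "delivery_days": 10}
# critique(offer, limits) names "price" as the sticking point;
# counter_proposal(offer, limits) lowers only that issue.
```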
With proposals, critiques and counter-proposals alone, agents cannot, for example, justify their negotiation stance or persuade one another to change theirs. In argumentation-based negotiation, by contrast, the negotiator seeks to make a proposal more attractive (acceptable) by providing additional meta-level information in the form of arguments for its position.
Arguments have the potential to increase the likelihood (by persuading agents to accept deals they may previously have rejected) and/or the speed (by convincing agents to accept their opponent’s position on a given issue) of agreements being reached. Common categories of arguments include:
- Threats (failure to accept this proposal means something negative will happen to you);
- Rewards (acceptance of this proposal means something positive will happen to you);
- Appeals (you should prefer this option over that alternative for some reason).
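These three categories can be represented as typed messages attached to a proposal. The enum and field names below are illustrative assumptions, not notation from the paper:

```python
from enum import Enum
from dataclasses import dataclass

class ArgumentKind(Enum):
    THREAT = "threat"  # failure to accept -> something negative happens to you
    REWARD = "reward"  # acceptance -> something positive happens to you
    APPEAL = "appeal"  # prefer this option over the alternative for a reason

@dataclass
class Argument:
    kind: ArgumentKind
    content: str  # the natural-language or formal body of the argument

arg = Argument(ArgumentKind.REWARD,
               "Accept this price and we will shorten delivery to 3 days.")
```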
3, Game theoretic models
Game theory is a branch of economics that studies (analyses and formalises) interactions between self-interested agents… In order for an agent to make the choice that optimises its outcome, it must reason strategically (i.e. take into account the decisions that other agents may make, and assume that they will act so as to optimise their own outcome).
… It turns out that the search space of strategies and interactions that needs to be considered has exponential growth, which means that the problem of finding an optimal strategy is in general computationally intractable…
Game theoretic techniques can be applied to two key problems:
1. The design of an appropriate protocol that will govern the interactions between the negotiation participants. [1]
2. The design of a particular (individual-welfare maximising) strategy (the agents’ decision making models) that individual agents can use while negotiating.
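To make the strategic-reasoning point concrete, here is a brute-force search for pure-strategy Nash equilibria of a two-player matrix game: at an equilibrium, neither agent can improve its payoff by unilaterally deviating. This is a standard textbook construction, not code from the paper:

```python
def pure_nash(row_payoffs, col_payoffs):
    """Return all (i, j) strategy pairs where each strategy is a best
    response to the other, i.e. the pure-strategy Nash equilibria."""
    n, m = len(row_payoffs), len(row_payoffs[0])
    equilibria = []
    for i in range(n):
        for j in range(m):
            row_best = all(row_payoffs[i][j] >= row_payoffs[k][j] for k in range(n))
            col_best = all(col_payoffs[i][j] >= col_payoffs[i][k] for k in range(m))
            if row_best and col_best:
                equilibria.append((i, j))
    return equilibria

# Prisoner's dilemma (strategy 0 = cooperate, 1 = defect): defection is
# a dominant strategy for both players, so (1, 1) is the unique equilibrium.
R = [[3, 0], [5, 1]]  # row player's payoffs
C = [[3, 5], [0, 1]]  # column player's payoffs
```

Enumerating every strategy pair like this is exactly what blows up as games grow, which is the intractability point made above.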
Despite the advantages, there are a number of problems associated with the use of game theory when applied to automated negotiation:
- Game theory assumes that it is possible to characterise an agent’s preferences with respect to possible outcomes… With more complex (multi-issue) preferences, it can be hard to use game theoretic techniques.
- The theory has failed to generate a general model governing rational choice in interdependent situations…
- Game theory models often assume perfect computational rationality, meaning that no computation is required to find mutually acceptable solutions within a feasible range of outcomes. Furthermore, this space of possible deals (which includes the opponents’ information spaces) is often assumed to be fully known by the agents, as are the values of the potential outcomes… Even if the joint space is known, knowing that a solution exists is entirely different to knowing what the solution actually is.
4, Heuristic approaches
Heuristic approaches are the major means of overcoming the aforementioned limitations of game theoretic models. Such methods acknowledge that there is a cost associated with computation and decision making, and so search the negotiation space in a non-exhaustive fashion, aiming to produce good, rather than optimal, solutions. The key advantages can be stated as follows:
- the models are based on realistic assumptions; hence they provide a more suitable basis for automation and they can, therefore, be used in a wider variety of application domains;
- the designers of agents, who are not wedded to game theory, can use alternative, and less constrained, models of rationality to develop different agent architectures.
The comparative disadvantages are:
- the models often select outcomes (deals) that are sub-optimal; this is because they adopt an approximate notion of rationality and because they do not examine the full space of possible outcomes;
- the models need extensive evaluation, typically through simulations and empirical analysis, since it is usually impossible to predict precisely how the system and the constituent agents will behave in a wide variety of circumstances.
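One well-known family of such heuristics is the time-dependent concession tactic: rather than searching the full outcome space, an agent simply concedes from its best to its worst acceptable value as a deadline approaches. A sketch under illustrative parameter names (the tactic family comes from the heuristic negotiation literature, not from this paper):

```python
def concession_offer(t: float, deadline: float, best: float,
                     worst: float, beta: float = 1.0) -> float:
    """Offer at time t, conceding from `best` toward `worst` by the
    deadline. beta < 1 concedes late (tough), beta > 1 concedes early
    (conciliatory); beta = 1 concedes linearly."""
    fraction = min(t / deadline, 1.0) ** (1.0 / beta)
    return best + fraction * (worst - best)

# A seller starting at 100 and willing to go down to 60 by time 10:
# it offers 100 at t = 0 and 60 at t = 10, conceding in between.
```

Note the disadvantage listed above in action: nothing guarantees the concession schedule lands on an optimal deal, so such tactics are evaluated empirically.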
5, Argumentation-based approaches
The basic idea is to allow additional information to be exchanged, over and above proposals. This information can take a number of different forms, all of them arguments that explicitly explain the opinion of the agent advancing them.
In addition to rejecting a proposal, an agent can:
- Offer a critique of the proposal, explaining why it is unacceptable (thus identifying an entire area of the negotiation space as being not worth exploring by the other agent);
- Accompany a proposal with an argument which says why the other agent should accept it (thus changing the other agent’s region of acceptability).
… Agents may not be truthful in the arguments that they generate. Thus, when evaluating an argument, the recipient needs to assess the argument on its own merits and then modify this by its own perception of the argument’s degree of credibility in order to work out how to respond.
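This evaluate-then-discount step can be sketched numerically. The 0-to-1 merit and credibility scales and the decision threshold are assumptions made purely for illustration, not a model from the paper:

```python
def respond_to_argument(merit: float, credibility: float,
                        threshold: float = 0.5) -> str:
    """Assess an argument on its own merit (0..1), discount it by the
    perceived credibility of its sender (0..1), and concede only if
    the discounted weight clears the agent's threshold."""
    if not (0.0 <= merit <= 1.0 and 0.0 <= credibility <= 1.0):
        raise ValueError("merit and credibility must be in [0, 1]")
    return "concede" if merit * credibility >= threshold else "hold"
```

A strong argument from a sender believed to be untruthful is thus treated the same as a weak argument from a trusted one, which captures the point about non-truthful agents.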
…Using argumentation in real agents means handling the complexities of the agents’ mental attitudes, communication between agents, and the integration of the argumentation mechanisms into a complex agent architecture [3].
For the future, two main areas of work remain:
1. The definition of suitable argumentation protocols, that is, sets of rules that specify how agents generate and respond to arguments based upon what they know. [6, 7]
2. The transition between the underlying negotiation protocol and the argumentation protocol: when is the right time to make this transition, i.e. when is it right to start an argument?
… the problem with such methods is that they add considerable overheads to the negotiation process, not least in the construction and evaluation of arguments…
6, Conclusions
Much research still needs to be performed in the area of automated negotiation, including:
- Extending and developing the specific approaches that have been discussed herein and even developing new methods…
- Development of a best practice repository for negotiation techniques. That is, a coherent resource that describes which negotiation techniques are best suited to a given type of problem or domain (much like the way that design patterns function in object-oriented analysis and design)…
- Advancing work on knowledge elicitation and acquisition for negotiation behaviour. At present, there is virtually no work on how a user can instruct an agent to negotiate on their behalf…
- Developing work on producing predictable negotiation behaviour…
Labels:
argumentation,
computing,
game-theory,
multiagent systems,
negotiation