The self-interest theory

First we distinguish between formal and substantive aims. Formal aims are “act morally” and “act rationally”, and are essentially meta-ethical principles; substantive aims are the concrete realizations of the formal aims according to particular theories of morality or rationality.

It strikes me as somewhat suspect to put “act rationally” in parallel with “act morally”. It ought to be sufficient to say “act as morally as possible”; the “as possible” already does all the work that “act rationally” would do. If I have no moral aims at all, there is no sense in which I can act irrationally; rational behavior is defined only with respect to some objective.

One possible substantive aim is that people should act in their self-interest. This might mean various things (the details don’t matter). In short, we say simply that each person’s “supremely rational ultimate aim” is that their life go as well as possible (for some definition of “well”).

I don’t understand why the modifier “rational” is necessary to describe this ultimate aim. It seems like it should be sufficient to say that preferences over world states are totally ordered, and that there is some maximal (most-preferred) set of world states which a person hopes to bring about.

How S can be indirectly self-defeating

A theory T is indirectly self-defeating if, when someone tries to achieve their T-given aims, those aims are worse achieved (compared to a world in which they made no effort at all). Self-defeat is defined with respect to an individual (more properly an individual and an environment), and is not a universal property of moral theories. Indirect self-defeat might happen because an actor is incompetent (and unable to effectively achieve their own aims)—this case is uninteresting (here DP asserts that the self-interest theory is “not too difficult to follow”).

The claim that choosing the self-interest-maximizing action is easy seems outrageous—often, such choices are PSPACE-complete! Am I missing something? We can design optimization problems that are arbitrarily hard; recognizing a good course of action is easier than constructing one (though not always easy in itself).

The more interesting case is where the actor comes to a worse outcome by effectively pursuing a moral theory. A couple of examples are given, but the prototype here is the prisoner’s dilemma. In particular, the self-interest theory is indirectly self-defeating for an agent who always defects, and who advertises to all partners that he will defect.
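To see how the aims come out worse, here is a minimal sketch (my own construction, not Parfit’s) with standard prisoner’s-dilemma payoffs, assuming the partner is a transparent conditional cooperator who defects against anyone advertised as an unconditional defector and cooperates otherwise:

```python
# Standard one-shot prisoner's dilemma payoffs, T > R > P > S (5 > 3 > 1 > 0).
# (my move, partner's move) -> (my payoff, partner's payoff)
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def partner_move(my_disposition):
    """A conditional cooperator: defect against an advertised unconditional defector."""
    return "D" if my_disposition == "always defect" else "C"

def my_payoff(my_disposition):
    """My payoff when my disposition is public and honestly advertised."""
    theirs = partner_move(my_disposition)
    mine = "D" if my_disposition == "always defect" else theirs
    return PAYOFF[(mine, theirs)][0]

print(my_payoff("always defect"))            # 1: everyone defects against me
print(my_payoff("conditional cooperation"))  # 3: mutual cooperation
```

Measured by his own aims (his payoff), the advertised defector does worse than he would with a different disposition, which is exactly the sense in which S is indirectly self-defeating for him.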

W/r/t the prisoner’s dilemma, it is possible (though quite strange) to imagine an agent that is constitutionally a dominant-strategy player—we have to assert that all kinds of pre-commitment mechanisms (like hiring someone to murder them if they ever defect) are totally off-limits. So to the extent that we are ultimately concerned with human morals, this example seems unhelpful. I don’t think there are any healthy humans that are totally incapable of cooperating under any circumstances.

Does S tell us to be never self-denying?

Under the self-interest theory, rationality is precisely the condition of always acting in one’s self-interest. A rational agent ought to maintain only those beliefs and goals that further their self-interest. These beliefs might be irrational! If for some reason I am happier believing in Russell’s teapot than not, I should do whatever is necessary to believe in the teapot. Likewise, an agent for whom (rationally) always following self-interest leads to worse outcomes ought not to behave rationally.

It seems that we’ve misread DP w/r/t the prisoner’s dilemma discussion. Previously we imagined an obligate dominant-strategy player, and observed that it achieved a worse outcome than it would have if it were not so obligated. Now this agent is apparently choosing between (“rationally”) playing the dominant strategy and (“irrationally”) cooperating. Obviously, if cooperating leads to better outcomes, cooperating is rational. A few explanations of what might be going on here:

  1. DP is wrong about game theory
  2. DP imagines a (more-and-more strangely-constructed) agent who wrongly understands what it means to be rational, and thinks that choosing to be rational requires defection. (We previously dismissed such agents as uninteresting.)

More generally, we seem to be making a distinction between “meta-level” rationality (being rational w/r/t choice of decision-making procedures) and “ordinary” rationality (being rational w/r/t non-meta decisions). This distinction seems arbitrary and unhelpful, but also necessary to explain why it might be “rational to make myself irrational”.

Recall that rationality is a formal, not substantive aim; it is a means to achieve the goals of self-interest, but not (necessarily) itself part of those goals. S, coupled with a particular theory of rationality, says that the supremely rational disposition is to be never self-denying, but that the aims of S are better achieved by not holding this disposition.

If this is the only point of the whole preceding discussion, then maybe the problems we’ve raised don’t much matter. The confusion is that DP means by “rational” something other than the standard economic definition. From now on I will denote DP’s rationality “P-rationality” to distinguish it from the ordinary kind.

Why S does not fail in its own terms

So does S, by being indirectly self-defeating, fail in its own terms? No; only directly self-defeating theories fail in their own terms. Because rationality is a formal aim, achieving the aims of S does not require that we behave P-rationally—in fact, for the agents described above, it requires that we don’t!

But can we actually choose whether or not to behave P-rationally? There are actually two pieces to this:

  1. Do I believe that “rational” means “P-rational”?
  2. Must I act in a way that I believe to be rational? (Can I change my disposition?)

Suppose I cannot change my disposition. Then my disposition tells me to act rationally (whatever I take that to mean), so I should simply change my belief about rationality to something other than P-rationality, and there’s no problem. Suppose instead that I can change my disposition without changing my belief about rationality. Then I should simply change my disposition, and there’s no problem. The final possibility is that I can change neither my belief nor my disposition; we will return to this case.

My difficulty up to here has been the notion of a belief about rationality: rationality should come before belief, and is a framework for producing true beliefs. But of course this is not true in the real world! Otherwise we wouldn’t have whole internet communities devoted to changing people’s beliefs about rationality. The fact that DP takes P-rationality to be the default belief is a little strange, but at this point it’s a sociological claim rather than a philosophical one, so there’s no trouble yet.

Could it be rational to cause oneself to act irrationally?

Now we get an example of a third belief about rationality. Suppose a criminal breaks into my house, and threatens to harm me if I don’t give him my gold. It is rational to give him my gold, but even better if I can make myself immune to his demands. One way to do this is to temporarily render myself completely irrational. Here rationality tells me that I should (at least temporarily) render myself non-rational rather than P-rational or rational.

Note that this is equivalent to a pre-commitment scheme, in which I pre-commit to not changing my behavior in response to torture. It is nullified if the criminal says something like “I will harm you if you fail to give me your gold or you adopt any pre-commitment scheme.” Are all changes in one’s belief about rationality equivalent to pre-commitment schemes?

Here’s an equivalent perspective: given a fixed objective function (a “theory” to DP), rationality tells me how best to maximize that objective function. In the robber case, it is useful for me to temporarily force myself to maximize a different objective function (we can take “irrational behavior” to correspond to rationality with respect to a constant objective). (A pre-commitment scheme is a special case of this, where I assign negative utility to some outcomes.) Is it the case that for any strategy, there is always some objective that will cause me to exhibit that strategy when behaving rationally? This seems like the sort of thing that economists have already proved; if it’s true, then I can always behave rationally and rely on pre-commitment / objective changes to obtain the same result that DP discusses.
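To make the objective-swapping move concrete, here is a toy sketch (my own construction, with made-up utilities, not DP’s) of the robber case, assuming the robber can see which objective I am disposed to maximize and will not carry out a threat that gains him nothing:

```python
# My true preferences over the three possible outcomes.
TRUE_UTILITY = {
    "keep gold, unharmed": 10,   # the robber gives up
    "lose gold, unharmed":  0,   # I comply with the threat
    "keep gold, harmed":   -50,  # I refuse and the threat is carried out
}

# A "warped" objective that attaches a huge penalty to ever handing over the
# gold: one way to encode a pre-commitment as a change of objective.
WARPED_UTILITY = {**TRUE_UTILITY, "lose gold, unharmed": -1000}

def my_policy(objective):
    """Given the objective I am disposed to maximize, comply only if complying beats being harmed."""
    return "comply" if objective["lose gold, unharmed"] > objective["keep gold, harmed"] else "refuse"

def outcome(objective):
    """The robber only carries out the threat if threatening actually gets him the gold."""
    if my_policy(objective) == "comply":
        return "lose gold, unharmed"
    return "keep gold, unharmed"  # the threat is useless, so the robber gives up

print(TRUE_UTILITY[outcome(TRUE_UTILITY)])    # 0: maximizing my true objective, I hand over the gold
print(TRUE_UTILITY[outcome(WARPED_UTILITY)])  # 10: the warped objective deters the threat entirely
```

Judged by my true objective, rationally maximizing the temporarily warped objective does better than rationally maximizing the true one; this is the same structure as temporarily rendering myself irrational, recast as a change of objective.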

All of this assumes that I am totally free to adopt pre-commitment schemes, which may not be true in practice.

How S implies that we cannot avoid acting irrationally