
Deborah Mayo’s “error statistics” sounds very interesting, and looks like it could be a superior alternative to both frequentism and Bayesianism.  Sort of a version of frequentism that explicitly incorporates the process of data generation and experimental design, and thus (reportedly) avoids some of the ugly oddities of frequentism.  But when I try to read her book on it, it’s just … so … boring.

I wonder whether this is why it has not gotten very much attention.  Error statistics needs an advocate with flair – a Jaynes.

Hey @raginrayguns, any thoughts on this?  (It’s mega-harsh on Jaynes)

raginrayguns:

shlevy:

Why is it taken as obvious that EV is the right way to go for decision-making? Even taking as a given that you can come up with a meaningful calculation of probability and utility of the different outcomes, it’s not clear looking at a single case why I should care what the weighted average of utility across outcomes would be. If I’m only ever going to do one EV calculation in my life, and choice A has a 25% chance of 100 utils and a 75% chance of -10 utils, it seems reasonable to me to say “I should generally expect to lose on this path, not win”. It’s only when you hit the large-number regime that you can use the EV of a single case as a sort of shorthand for “if I hold to this strategy over time, in most outcomes the wins will overcome the losses”, but how often do people actually do EV?
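The tension in the example is easy to see with a quick check of the numbers above (this is just the weighted average, nothing more):

```python
# shlevy's hypothetical choice A:
# 25% chance of +100 utils, 75% chance of -10 utils.
outcomes = [(0.25, 100), (0.75, -10)]

ev = sum(p * u for p, u in outcomes)           # 0.25*100 + 0.75*(-10) = 17.5
p_loss = sum(p for p, u in outcomes if u < 0)  # probability of a negative outcome

print(ev)      # 17.5: EV says "take choice A"
print(p_loss)  # 0.75: yet you lose three times out of four
```

So the EV is comfortably positive even though losing is the most likely single outcome, which is exactly the single-case worry being raised.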

In science and business it’s used as a tool to make your computer calculate a plan with outcomes that you want, which does not require perfect agreement between the EV choices and the best choices.

(re: how often: it’s used in some bayesian clinical trials to assign patients to treatments, and I met a guy whose PhD thesis is on algorithms to prioritize power lines to fix after a hurricane, which I understood to be EV calculations)

it’s also used as a model for actual decision makers, though I think it’s acknowledged as not always the best one, for example sometimes people use prospect theory

(re: how common, I’ve heard very common in economics)

so, it’s used, but these uses are not because the doctors, economists etc. think EV defines the “right way to go.” But there’s arguments for that too.

My preferred way to define “the right way to go” here, is “rationality upon reflection”. Like, you reflect on your preferences, and maybe you decide that they’re awful, which means they’re not rational upon reflection. There are arguments that EV choices are the rational-upon-reflection ones.

This is actually not very useful if true because

  • in science and business, when calculating plans, we know there’s going to be mismatch between the EV choices and the best choices, because we oversimplify the utility function and the probability distributions. This is true whether or not some other utility and probability would work perfectly
  • in any individual case, we can decide if the EV conclusion is rational on reflection by calculating it, and reflecting. If this always works, it’d be nice to know, but you can always try it
  • and obv as a model for people, people don’t have the rational-upon-reflection preferences

Someone who has a lot of reason to care, though, is Eliezer @yudkowsky, since he wants models of self-modifying decision-making programs. He thinks that such programs are accurately modeled, after their initial stages, only by theories which produce rational-upon-reflection conclusions. Because, I guess, the program is doing a lot of reflecting? I recall him saying he had to develop timeless decision theory because an AI foom would be described by existing decision theories for at most a few seconds.

rational-upon-reflection choices are sort of normative, because you can use them as a guide for your own actions, but I don’t like to put it that way. And sometimes the use is not normative, for example EY is using them as a model to predict reflective decision-makers

Anyway, I say there are “arguments” for it but I only know a single argument, which is based on Savage’s theorem. Savage’s theorem is something like: “for any set of preferences that obeys constraints R, there exist a utility function U and a probability distribution P such that those preferences are exactly the EV-maximizing ones under U and P. Also, given U there’s just one P, and vice versa.” So, such an argument can’t establish the right utility function or the right probability distribution, though constraints on one give info about the other. By a set of “preferences” I really mean a set of hypothetical choices you’d make, like “when given a choice between doing A and doing B, I choose doing B.” This is a little different from what we usually mean: usually by “preferences” people mean preferences between outcomes, but these are preferences between acts.
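Stated a bit more formally (my paraphrase in standard notation, sketched for a finite state space; this is not Savage’s own wording): if a preference relation ≽ over acts f : S → X satisfies the postulates, then there is a probability P on states (unique) and a utility U on outcomes (unique up to positive affine rescaling) with

```latex
f \succsim g
\quad\Longleftrightarrow\quad
\sum_{s \in S} P(s)\, U\!\bigl(f(s)\bigr)
\;\ge\;
\sum_{s \in S} P(s)\, U\!\bigl(g(s)\bigr)
```

i.e., preferring one act to another is equivalent to it having higher expected utility under the represented P and U.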

So, the argument based on this theorem is something like

  1. One of two things is true by Savage’s theorem:
    1. the correct-upon-reflection preferences are EV
    2. the correct-upon-reflection preferences violate some constraint in R
  2. But surely, if we were reflecting and we noticed the violation of that constraint, we would alter our preferences to repair it, since violating that constraint is really irrational seeming
  3. Therefore the correct-upon-reflection preferences are EV

A description of the theorem: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.117.541&rep=rep1&type=pdf

Like Jaynes’s Cox’s theorem argument, there’s some tricky infinities stuff going on. The theorem only works if you have this sort of continuous space of possible acts, with your preference partial order defined on all of them. But of course there’s only a finite set of acts that we actually care about.

This is analogous to how Jaynes’s argument requires a sort of continuum of “plausibilities”, all of which are possessed by some proposition, in order to derive via Cox’s theorem that the plausibilities are isomorphic to probabilities. When there are only finitely many propositions, Halpern has constructed a counterexample to Cox’s theorem. @scientiststhesis argues that if you have a coin and beliefs about how any series of flips might come out, there’s your continuum of plausibilities (or not really a continuum, but there’s a plausibility between every pair of plausibilities, which might be enough for Cox’s theorem – I’m not sure). I would counterargue that we don’t really care if we violate our rationality constraints for those, and the set of propositions we actually care about getting our beliefs right for is finite. But he may be right if we’re talking about rational-upon-reflection beliefs for all propositions, rather than just the ones we care about.

In any case, to quote Halpern’s paper: “Another possibly interesting line of research is that of characterizing the functions that satisfy Cox’s assumptions. As the example given here shows, the class of such functions includes functions that are not isomorphic to any probability function. I conjecture that in fact it includes only functions that are in some sense ‘close’ to a function isomorphic to a probability distribution, although it is not clear exactly how ‘close’ should be defined (nor how interesting this class really is in practice)”.
Well maybe not in practice, but in philosophy it’d be interesting, if having beliefs about lots of propositions means that the rational-upon-reflection beliefs must be close to obeying probability theory. So, something similar might be true for Savage’s theorem: maybe even with finite numbers of possible acts we need preferences between, having enough of these acts means that our preferences are… close, somehow, to EV?

Certainly, though, if you’re only going to make 1 choice, the argument doesn’t apply. So maybe the Savage’s theorem argument isn’t any more general than the averaging-out argument. Though perhaps it applies to cases where 1 choice outweighs all the rest in consequences? But perhaps not. One advantage it has, though, is that it applies when you’d only make 1 choice, but you don’t know which choice it’s going to be, so you have a rich set of preferences

Anyway, some of these constraints R imply these infinities, but what about the other constraints? Does violating them really make preferences irrational upon reflection? I don’t know, because I just haven’t examined them that closely.

AFAIK, Savage’s Theorem is only necessary if you want to infer EV from preferences alone, without even assuming the agent knows about or believes in any probabilities.  If you already know the probabilities of various outcomes – which seems like the case @shlevy is interested in? – then all you need is the VNM theorem, which works with finitely many acts / outcomes.

The VNM axioms only say that there is some function whose expectation your preferences maximize.  In @shlevy‘s example, preferring “don’t do A” to “do A” is consistent with maximizing the expectation of a function, just one that isn’t the same as the function the example refers to as “utils.”
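To make that concrete, here is a toy sketch reusing shlevy’s 25%/100-util numbers. The alternative function v is made up purely for illustration; the point is just that “don’t do A” maximizes the expectation of *some* function, even though it doesn’t maximize the stated utils:

```python
# shlevy's gamble: 25% chance of "win" (+100 utils), 75% chance of "lose"
# (-10 utils).  Refusing the gamble ("status quo") yields 0 utils for sure.
p = {"win": 0.25, "lose": 0.75}

utils = {"win": 100, "lose": -10, "status_quo": 0}

# A different (hypothetical, more loss-averse) function over the same outcomes:
v = {"win": 1, "lose": -1, "status_quo": 0}

def ev(f, dist):
    """Expectation of f under the distribution dist."""
    return sum(dist[o] * f[o] for o in dist)

print(ev(utils, p))  # 17.5  -> maximizing "utils" says: do A
print(ev(v, p))      # -0.5  -> maximizing v says: don't do A (0 beats -0.5)
```

Both agents are expectation-maximizers; they just aren’t maximizing the expectation of the same function.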

(via raginrayguns)

metagorgon:

nostalgebraist:

Solomonoff’s induction problem answers all three of the above questions in a simplified setting: The set of world models is any computable environment (e.g., any Turing machine). In reality, the simplest hypothesis that predicts the data is generally correct, so agents are evaluated against a simplicity distribution. Agents are scored according to their ability to predict their next observation. These answers were insightful, and led to the development of many useful tools, including algorithmic probability and Kolmogorov complexity.

(MIRI technical agenda)

oh, so that’s why the solomonoff prior is the ultimate correct perfect one, because it’s … based on simplicity.  and simple hypotheses are good, “in reality” (??).  thanks,……….

It’s a formalization of Occam’s Razor. This is really uncharitable. Please read about Solomonoff induction directly instead of a summary by MIRI if you’re so confused.

Like, this isn’t even a MIRI thing.

I have read about Solomonoff induction elsewhere, including Solomonoff’s original papers.  (I admit I didn’t read them straight through checking every step, but I got the flavor of them.  Solomonoff emphasizes the fact that his methods seem to agree with intuition on workable test problems, rather than the mere fact that they formalize Occam’s Razor, which in some sense is true of any method that penalizes model complexity.)

What I am specifically sniping at is the fact that Yudkowsky tends to take “Solomonoff induction is the perfect model of induction” as a given, without justifying it or referring to other justifications.  I have long been confused by this, as have some others (see e.g. this good post by @raginrayguns).

(via metagorgon-deactivated20181211)

Responses to this post

@princess-stargirl

I agree with @jadagul ‘s post. This is why I included the line: “also it is, of course, wrong to assume any individual person will assign consistent probabilities to events. By “consistent” I mean consistent with what the person knows not what some aliens know.“

If you ask someone for a bunch of related probability estimates, those estimates are almost certainly not going to be self-consistent (with respect to the knowledge the person actually has). If the list is sufficiently long, it is almost impossible to avoid inconsistent statements even if you are trying very hard to avoid being inconsistent. (Caveat: of course you could just say 0% to everything, if this is allowed. But I am assuming one is at least somewhat trying to say reasonable things.)

However like @slatestarscratchpad I don’t really see why this is a fundamental problem.

@speakertoyesterday

I feel I’m losing sight of the argument. I agree humans are bad at dealing reasonably with conjunctions when assigning probabilities/bad with dealing with really small numbers/really big ones. But this doesn’t mean that we can’t present conditionals in decibans instead of asking about massive conjunctions.

Just because my intuition is wrong about the probability of conjunctions doesn’t mean probability is the wrong measure for degrees of belief. It might just mean that my intuition fails in predictable ways, and I think we both agree that that is the case.

But it seems to me you might be arguing for something more, namely that this calls into question Jaynes’s argument that probability is a degree of belief.

Ah, yeah, I do think it’s a fundamental problem, and I do think it’s a problem for “probability is a degree of belief” in general, not just for humans.

Recall the definition of a probability space.  It has three things.  First, a sample space Ω, which is like a set of points, each of which is an “outcome.”  Second, an event space F, which contains various sets of outcomes, called “events.”  Third, a probability measure P, which assigns probabilities to events from F.  (Then P is supposed to satisfy some axioms.)

Usually when we talk about “using probability,” we talk about cases where we are at least given Ω and F.  (Maybe we have to determine P ourselves.)  In really simple cases, like coin flips or (literal) bets, it’s pretty easy to write down exactly what Ω and F are, so all we need to worry about is P.
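For a really simple case like the fair coin, all three pieces can be written out directly. A minimal sketch (assuming nothing beyond the definition above, with F taken to be all subsets of Ω):

```python
from itertools import chain, combinations

# A fair-coin probability space: sample space, event space, measure.
omega = {"H", "T"}                                 # sample space Ω

def powerset(s):
    s = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

F = powerset(omega)                                # event space (here: all of 2^Ω)
P = {e: len(e) / len(omega) for e in F}            # uniform probability measure

# The axioms, checked exhaustively on this finite space:
assert P[frozenset(omega)] == 1                    # P(Ω) = 1
assert all(P[e] >= 0 for e in F)                   # non-negativity
for a in F:                                        # additivity for disjoint events
    for b in F:
        if not (a & b):
            assert P[a | b] == P[a] + P[b]
```

The point of spelling it out is how much structure is already in place before P even enters: Ω and F are given exhaustively, which is exactly what the next paragraph says our beliefs lack.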

However, the issue I am raising here is about Ω and F.  If you reflect on your beliefs, generally you don’t have anything that actually looks like these sets.  You already know about some possible outcomes, and you have grouped them into some sets.  But your mental picture of Ω and F is not complete, and you know this: you know, for instance, that you will sometimes hear about new possible outcomes you had never thought of.  And when you are presented with an event (as with @jadagul‘s example), often you can think of some of the outcomes that would be inside or outside it, but you have no complete picture of Ω on which you could “draw” the set corresponding to the event.

What you do in practice (or at least what I do) is, when thinking about an event, try to come up with some other events that are necessary or sufficient conditions for it, and think about those – that is, I make a “made-to-order” ad hoc picture of a relevant part of Ω, and think about some events defined on it.  Generally these ad hoc pictures aren’t all going to agree with one another.  In any case, this sort of reasoning is a very different thing from standard “probabilistic reasoning” where you know Ω and F.

One consequence of this is that it will be hard to assign probabilities that obey the probability axioms.  This itself isn’t the real problem, though.  The real problem is that you wouldn’t be able to do this even if you were perfect at probability.  What I mean is: if you were some Jaynes-bot who always applied the probability axioms perfectly, but you still had the same incomplete knowledge of Ω and F, you would still make the same mistakes.  The Jaynes-bot could have various rules for dealing with its incomplete knowledge, which might be different from a human’s rules, but it can’t know more about Ω and F than a human does, if we are using it as an ideal for human reasoning.

(This is completely different from the “conjunction fallacy” of the Linda the Bankteller example, where it’s absolutely clear that one event is a subset of another, but people don’t take that to its logical conclusion.  A Jaynes-bot would have no problem with that one.)

The relevance to “should degrees of belief be represented as probabilities?” is that in practice we generally do not have an actual probability space.  So we have to ask if “probability-like numbers” on an incomplete probability space are correct for our purpose.

jadagul:

slatestarscratchpad:

nostalgebraist:

slatestarscratchpad:

nostalgebraist:

Replies to this post

@princess-stargirl

The questioner is just talking nonsense. He is assuming, seemingly for no reason, that you know that AC=>BC. If he does not make this false, completely unwarranted assumption, nothing strange happens at all.

(also it is, of course, wrong to assume any individual person will assign consistent probabilities to events. By “consistent” I mean consistent with what the person knows not what some aliens know. )

@unknought

I’m not really a Bayesian, but the obvious response is that learning that AC implies BC should cause you to update the probability of AC downwards and the probability of BC upwards.

Knowing that a statement is a named conjecture tells you something about how likely it is to be true. Knowing that it’s a named conjecture with a stronger form which is also a named conjecture tells you more.
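The update @unknought suggests can be sketched with a toy joint prior. The 50/50 independence prior here is my illustrative assumption, not anything from the post; the point is just that conditioning on the theorem AC ⇒ BC (ruling out the world where AC holds and BC fails) mechanically lowers P(AC) and raises P(BC):

```python
from fractions import Fraction

half = Fraction(1, 2)
# Toy prior: AC and BC independent, each 50% (purely illustrative).
prior = {(ac, bc): half * half for ac in (True, False) for bc in (True, False)}

# Learn the theorem AC => BC: the world (AC=True, BC=False) is impossible.
allowed = {w: p for w, p in prior.items() if not (w[0] and not w[1])}
z = sum(allowed.values())
post = {w: p / z for w, p in allowed.items()}   # renormalize

p_ac = sum(p for (ac, bc), p in post.items() if ac)  # 1/3 < 1/2: AC went down
p_bc = sum(p for (ac, bc), p in post.items() if bc)  # 2/3 > 1/2: BC went up
print(p_ac, p_bc)
```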

The “questioner” framework was just a story.  The real situations I am thinking about here are ones in which there is no one around to tell you that AC=>BC.  Either you simply don’t know it, or you could figure it out but may not currently realize it, or know it but just don’t have it in mind.

I think I was actually clearer in my earlier post on this subject, so maybe I should have just found and linked that.  Or maybe I should rewrite that one.

I just tried to type a clear explanation of my point and failed and deleted it, which is frustrating because it’s clear in my head.  I think I’m just too tired.

But my point is something like: if you only have a hazy sense of the area you’re talking about, you’ll be tempted to give most events a sort of “reasonable, conservative” probability estimate, if someone asks you about them or you happen to ask yourself.  But since there is all this implication structure (or just non-independence) among the various events, you’re actually implying a whole bunch of nontrivial things by doing this.  You may not know about this structure, or only know about some of it, or know about a lot of it but not be able to hold it all in your head at the same time (quick, name every necessary condition for “cupcakes are still being sold in 2050″).

This seems wrong to me. Maybe I am misunderstanding it. Let me try to get my head around it and explain where I’m coming from.

Suppose that Omega comes to Alice and says: “Here is a coin that I have biased to either always come up heads or always come up tails. I will not tell you which. But I will tell you this. I know every truth about human history, and I have arranged for this coin to be biased heads if 9-11 was truly an inside job, and tails otherwise.”

Maybe Alice believes there was only a 0.1% chance that 9-11 was an inside job, so she believes 99.9% chance a flip will land tails, and 0.1% chance it will land heads.

Now Alice goes to Bob, who knows nothing about any of this, and says “What’s your probability distribution over how this coin lands when I flip it in a second?” Bob says “Fifty-fifty heads/tails, obviously.”

Alice says “Aha, so you think there’s an equal chance that 9-11 was or wasn’t an inside job? You’re pretty dumb.”

Obviously this is unfair to Bob. But equally obviously, Bob did exactly the right thing, from his position, to say that the coin flip odds were 50-50.

But I feel like you’re doing the same thing. Confronting the human for having beliefs that implied surprising facts about the relationship between the alien conjectures, makes no more sense than confronting Bob for having beliefs that imply surprising facts about the US government’s relationship with terrorism. But that means your argument proves too much - it suggests we can’t even assign a probability of 50% to a coin toss.
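Alice’s and Bob’s numbers above both come from the same rule (total probability over the coin’s construction); a minimal sketch, where Bob’s 50/50 weighting is the symmetry assumption built into the story:

```python
# Alice's credence: P(heads) = P(inside job), since the coin is deterministic
# given the answer.  She assigns 0.1% to "inside job".
p_inside = 0.001
alice_heads = p_inside * 1.0 + (1 - p_inside) * 0.0

# Bob knows nothing about the coin's construction; with a symmetric 50/50
# weighting over "always heads" vs "always tails" (all he has to go on):
bob_heads = 0.5 * 1.0 + 0.5 * 0.0

print(alice_heads)  # 0.001
print(bob_heads)    # 0.5
# Same rule, different information; neither number is a mistake.
```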

It seems like the example I gave was really bad for communicating what I actually wanted to communicate.

The alien conjectures are meant to be an extreme case of “a situation where we have incomplete information about the dependence relationships between different events.”  (In the example, we had no information.)  The example is meant to distill a phenomenon that happens in other, less contrived situations.

I’m too tired right now to come up with a good single example, but the prototype case I have in mind is relatively ordinary statements about the future, like “cupcakes are still being sold in 2050.”  This seems pretty likely, and at first glance I’d just give it a probability I associate with “pretty likely.”  But then you can make various more specific statements, involving people in 2050 doing things with cupcakes they’ve bought, which also seem “pretty likely,” but technically require the first statement and should have lower probability than it, unless they absolutely must happen if cupcakes are still sold.

All these statements are sort of “hidden conjunctions,” which depend on all sorts of prerequisites, many of which may not come to mind directly when thinking about the statement.  When everything’s a conjunction of things you’re very unsure about, which are themselves conjunctions of things you’re very unsure about, etc., it becomes hard to keep the probabilities ordered in a way that respects this structure.

Are you just saying most people are bad and inconsistent in their probability assignments and commit the conjunction fallacy? Such that someone might assign 20% to “Linda is a feminist bank teller” but only 10% to “Linda is a bank teller” which would be absurd? If so I agree but I think that’s an argument that humans are bad at this, not that it’s not a good theoretical framework.

I think the point is that without a fairly detailed grasp of the situation, there’s no way your credence assignments aren’t going to lead to all sorts of conjunction fallacies. Nostalgebraist is trying to give some examples where this is really clear, but that winds up making the examples not very convincing.

I think a better example is the statement: “California will (still) be a US state in 2100.” Where if you make me give a probability I’ll say something like “Almost definitely! But I guess it’s possible it won’t. So I dunno, 98%?”

But if you’d asked me to rate the statement “The US will still exist in 2100”, I’d probably say something like “Almost definitely! But I guess it’s possible it won’t. So I dunno, 98%?”

And of course that precludes the possibility that the US will exist but not include California in 2100.

And for any one example you could point to this as an example of “humans being bad at this”. But the point is that if you don’t have a good sense of the list of possibilities, there’s no way you’ll avoid systematically making those sorts of errors.

Consider the following list of statements:

  1. In 2100, the US will exist.
  2. In 2100, the US will contain states.
  3. In 2100, the US will contain states west of the Mississippi.
  4. In 2100, the US will contain states west of the Rockies.
  5. In 2100, the US will contain California.

In my judgment, all of those statements are “almost certainly true.” And there’s content to that, as a matter of “giving credence to propositions about the future.” But if you want me to assign “probabilities” then you want me to assign numbers to all of those statements in a way that’s consistent across all those statements. And there’s no possible way to do that unless you have a list of all the possible propositions.

Try it. And then ask what you think the probability is that in 2100, the US contains any states bordering the Pacific.
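The constraint jadagul is describing is just monotonicity under implication: each statement in the list is implied by the next, so the probabilities must be non-increasing down the chain. A quick sketch (the flat 98% figures are the illustrative “almost definitely!” answers from above, not anyone’s considered credences):

```python
# Statements (1)-(5) above, ordered general -> specific; each statement
# is implied by the one after it, so P must be non-increasing.
labels = ["US exists", "has states", "states W of Mississippi",
          "states W of Rockies", "contains California"]

naive = [0.98, 0.98, 0.98, 0.98, 0.98]   # "almost definitely!" for each

def coherent(ps):
    # P(more specific) must not exceed P(more general).
    return all(ps[i] >= ps[i + 1] for i in range(len(ps) - 1))

print(coherent(naive))   # True -- but only because every gap is exactly 0
gap = naive[0] - naive[-1]
print(gap)               # 0.0: the probability left over for
                         # "US exists but doesn't contain California"
```

So the flat answers are technically coherent, but only by implying with certainty that the US can never exist without California, a commitment nobody intended to make. Spread the answers out instead and the gaps must come from somewhere, which is the point of the next paragraph.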

Thank you – this is what I meant, and your examples are better.  (When I’m less busy/tired I’ll see whether I can come up with more examples.)

(There’s another point I want to make, but I’m still not able to make it clear to my satisfaction.  To sketch it: if you try to be “clever” and avoid this by listing lots of propositions like this and inserting little probability gaps, these little gaps may add up to a non-trivial gap between the most general statement and the least general, which you have no justification for except “I wanted to avoid making this mistake.”  The only real cure is having an actual picture of the different 2100s that are possible and querying them each in turn, i.e. “having a detailed knowledge of the situation”)

(via jadagul)

slatestarscratchpad:

chroniclesofrettek:

…… Everyone gets that when rationalists/bayesians say “bet” they mean take an action with some expected value, right? You’re making a “bet” by donating to MIRI/AMF/Whatever, you’re making a “bet” when you start a retirement account. The idea of “don’t bet” maps pretty well to the precautionary principle and has the same problems.

@slatestarscratchpad is the person who makes this point the best (with his “which unprediction should I make” type of discussion), but is there any blog post or something which spells this out more explicitly?

Is http://slatestarcodex.com/2015/08/24/probabilities-without-models/ what you’re looking for?

I’m aware of this usage but I don’t like it, because the Dutch book argument has less force (if any) in cases that aren’t literal bets, since in a literal betting situation the other person wants to get money from you and will tend to make Dutch books if they exist.  Eliding the difference between literal bets and other decisions under uncertainty makes the Dutch book argument look stronger than it is.

Re: Scott’s post (tangentially), the important distinction for me is not whether we have a model, but whether we have some representation of the outcome space and the events defined on it.

I guess since I keep muttering “outcome space” mysteriously, I should recap what I’m thinking of.  So, suppose you’re being asked about some domain of mathematics you know nothing about.  Maybe it’s done by aliens or something.

The first question you’re asked is, “what’s the probability that Arghzlargh’s Conjecture is true?”  You have no clue what that is, but you give some answer between 0 and 1, let’s call it “p.”

Then: “what’s the probability that Bargorgh’s Conjecture is true?”  No idea what that one is either.  Since there is nothing distinguishing the two, it’d be absurd to answer anything but “p” here.

Your questioner is astonished.  “You mean you have a proof that Arghzlargh’s Conjecture and Bargorgh’s Conjecture are equivalent?  Amazing!”

You are confused, so the questioner explains.  In fact, it’s a known theorem that Arghzlargh’s Conjecture (AC) implies Bargorgh’s Conjecture (BC).  So P(AC & BC) = P(AC).  But you said P(AC) = P(BC) = p, so therefore P(AC & BC) = P(BC), meaning that BC implies AC as well: the two are equivalent.

Your questioner goes on to say that this disproves Cormlearm’s Conjecture (CC), which was precisely (BC & ~AC).  That is, you have implied that P(CC) = 0.

But your amazing “result” might easily have gone the other way.  For imagine running a different version of the experiment.  The first question you’re asked is “what’s the probability that Cormlearm’s Conjecture is true?”

Would you answer zero?  Of course not – you would answer “p,” because at this point you know nothing about “Cormlearm’s Conjecture.”  Innocuous enough, for now.
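The questioner’s algebra can be checked mechanically; a tiny sketch with a concrete (arbitrary) value of p:

```python
from fractions import Fraction

p = Fraction(2, 5)        # your answer "p" -- any value gives the same moral

# AC => BC is a theorem, so the world (AC true, BC false) has probability 0,
# and any coherent joint with P(AC) = P(BC) = p must satisfy:
p_ac_and_bc = p                 # P(AC & BC) = P(AC), since AC => BC
p_cc = p - p_ac_and_bc          # P(CC) = P(BC & ~AC) = P(BC) - P(AC & BC)

print(p_cc)   # 0: answering "p" twice "disproved" Cormlearm's Conjecture
```

The zero falls out of the two equal answers alone, which is exactly why answering “p” a third time, when asked about CC first, is incompatible with the other two answers.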

(via slatestarscratchpad)