
MIRI productivity watch: they are pivoting towards issues of ethical design in systems similar to currently existing machine learning algorithms.  The new research agenda (which will occupy “half” of their team in the coming year) looks much more tractable and much more directly connected to mainstream AI research, and they express interest in collaborating more with other AI/ML researchers.

At least superficially, this looks like an extremely good idea to me, and is better than anything I had thought MIRI was likely to do.  (It is also, IIRC, pretty much what su3 thought they ought to be doing?)

metagorgon:

nostalgebraist:

Solomonoff’s induction problem answers all three of the above questions in a simplified setting: The set of world models is any computable environment (e.g., any Turing machine). In reality, the simplest hypothesis that predicts the data is generally correct, so agents are evaluated against a simplicity distribution. Agents are scored according to their ability to predict their next observation. These answers were insightful, and led to the development of many useful tools, including algorithmic probability and Kolmogorov complexity.

(MIRI technical agenda)

oh, so that’s why the solomonoff prior is the ultimate correct perfect one, because it’s … based on simplicity.  and simple hypotheses are good, “in reality” (??).  thanks,……….

It’s a formalization of Occam’s Razor. This is really uncharitable. Please read about Solomonoff induction directly instead of a summary by MIRI if you’re so confused.

Like, this isn’t even a MIRI thing.

I have read about Solomonoff induction elsewhere, including Solomonoff’s original papers.  (I admit I didn’t read them straight through checking every step, but I got the flavor of them.  Solomonoff emphasizes the fact that his methods seem to agree with intuition on workable test problems, rather than the mere fact that they formalize Occam’s Razor, which in some sense is true of any method that penalizes model complexity.)

What I am specifically sniping at is the fact that Yudkowsky tends to take “Solomonoff induction is the perfect model of induction” as a given, without justifying it or referring to other justifications.  I have long been confused by this, as have some others (see e.g. this good post by @raginrayguns).
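For anyone who wants the mechanics rather than the slogan, here is a toy sketch of the simplicity-weighted mixture that Solomonoff induction formalizes.  Real Solomonoff induction ranges over all programs for a universal Turing machine and is uncomputable; the hypothesis class, the "description lengths," and the predictors below are all made up for illustration.

```python
# Toy sketch of a simplicity-weighted predictive mixture, in the spirit
# of Solomonoff induction.  The real construction ranges over all
# programs for a universal Turing machine and is uncomputable; this
# hypothesis class and its description lengths are invented.

hypotheses = {
    # name: (description_length, predictor mapping bit-history -> next bit)
    "always_0": (1, lambda hist: 0),
    "always_1": (1, lambda hist: 1),
    "alternate": (2, lambda hist: 1 - hist[-1] if hist else 0),
    "copy_last": (2, lambda hist: hist[-1] if hist else 0),
}

def prior(length):
    # The Occam weighting: weight 2^-length for description length `length`.
    return 2.0 ** -length

def predict_next(history):
    """Posterior probability that the next bit is 1.

    Assumes at least one hypothesis survives the data.
    """
    weights = {}
    for name, (length, f) in hypotheses.items():
        w = prior(length)
        # Deterministic hypotheses: a single wrong prediction rules one out.
        for i, bit in enumerate(history):
            if f(history[:i]) != bit:
                w = 0.0
                break
        weights[name] = w
    total = sum(weights.values())
    return sum(w * hypotheses[name][1](history)
               for name, w in weights.items() if w > 0) / total

print(predict_next([0, 1, 0, 1]))  # only "alternate" survives -> 0.0
print(predict_next([1, 1, 1]))     # only "always_1" survives -> 1.0
```

With no data, the mixture is dominated by the shortest hypotheses; each observation then zeroes out whatever it contradicts, which is the sense in which "the simplest hypothesis that predicts the data" wins.  Nothing in this toy settles the question of *why* that prior should be the correct one, which is the thing being asked above.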

(via metagorgon-deactivated20181211)

nostalgebraist:

cofinaldestination:

nostalgebraist:

For those of you who like to keep up with MIRI’s level of productivity or lack thereof – I just noticed they’ve put three new papers on arXiv this year so far, two of which were submitted just a few weeks ago.  (I haven’t looked at the papers themselves yet, no idea about their quality.)

The Löb’s Theorem paper seems to be okay, in terms of whether the conclusions follow from the hypotheses. Beyond that - in terms of actual usefulness - I have less of an idea.

(There is also a line I can’t seem to justify, but it could be an error on my end.)

No idea about usefulness either, but the result (if correct) seems surprising and cool to me.  (The paper we’re talking about is this)

After looking over this paper and its predecessor, I am now confused.  The new paper by Andrew Critch says that its “Parametric Bounded Löb’s Theorem” is a new result that Critch has “discovered,” which allows him to prove the cooperation result about his finite version of Fairbot, FairBot_k.

But an earlier paper from MIRI, which introduced a Fairbot that can find proofs of unbounded length, briefly mentions the concept of a “FiniteFairBot” which sounds essentially the same as Critch’s FairBot_k.  It says that cooperation results for FiniteFairBot can be proven using “the bounded version of Löb’s Theorem,” and earlier on it says that “variants of Löb’s Theorem for bounded proof lengths are well-known among logicians.”

Am I reading this wrong, or was the earlier paper just wrong about the bounded version being a known theorem?
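For intuition about why a bounded cooperation result needs a Löbian theorem at all, here is a deliberately naive toy (my own construction, not from either paper): replace FairBot’s proof search with simulating the opponent on a shrinking budget.  The regress bottoms out in defection, and defection then propagates all the way back up.

```python
def fairbot_naive(k, opponent):
    # Naive "FairBot": cooperate iff we can establish, within budget k,
    # that the opponent cooperates with us -- modelled here (crudely, and
    # unlike the actual proof search in the papers) by running the
    # opponent with a smaller budget.
    if k == 0:
        return "D"  # budget exhausted, nothing established: defect
    return "C" if opponent(k - 1, fairbot_naive) == "C" else "D"

def cooperatebot(k, opponent):
    return "C"

def defectbot(k, opponent):
    return "D"

print(fairbot_naive(3, cooperatebot))   # C: rewards cooperators
print(fairbot_naive(3, defectbot))      # D: can't be exploited
print(fairbot_naive(3, fairbot_naive))  # D: the regress bottoms out
```

Critch’s FairBot_k instead searches for actual proofs of bounded length that its opponent cooperates, and the parametric bounded Löb’s theorem is what lets two such bots each find that proof, yielding mutual cooperation rather than the mutual defection above.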

(P. S. if you saw the previous post, I’m done working now)

(via nostalgebraist)

argumate:

@voximperatoris, let’s run with the AI-curing-cancer thought experiment for a bit.

Say you had some kind of system for investigating possible drugs or genetic therapies or who knows what, a search algorithm combining expert systems and deep learning à la AlphaGo or whatever. The details are not super-important.

Some poor sap shows up with cancer and you take some biopsies and put the data into the system and it gives some recommendations for therapy or possible interventions and so on.

Now perhaps it turns out that there are some clever treatments that we’ve missed or perhaps the system can recommend further investigations that yield promising results, which would be fantastic.

But at no point would you expect the system to start cannibalizing the world for more processing power to cure cancer, just as you would not expect AlphaGo to behave in this fashion, nor indeed any program that was not written specifically in order to behave in this way.

Part of the AI risk debate is predicated on the assumptions that:

IF we have incredibly intelligent algorithms

AND we use them to build up a detailed model of the world including themselves

AND we give them unfettered access to resources

AND we then set them running independently with unclear objectives

THEN there is the possibility that they will do things we regret

This then leads to a lot of hand-waving and comic book scenarios about exactly what those regrettable actions might be, but that’s skipping over the good bits!

IF we have incredibly intelligent algorithms

IMO tools can present serious risks, but they probably won’t present the same risks as “agents” if we keep their training data separate from their test data.

The danger with a tool like the cancer AI is that it may give recommendations without being able to explain its reasoning (like how AlphaGo can’t explain its moves to us).  We can always reject those recommendations, but if we don’t, we must deal with all of the consequences of the program’s flaws.

(You could argue that this is no worse than ordinary medical testing, but there’s the added risk that the program may actively seek outcomes that we consider bad, making the results worse than chance.  In the cancer case, the goal is not just “remove/damage the tumor” but “don’t damage the patient in the process,” and if the program doesn’t know everything we do about side effects, it may recommend a procedure that harms the patient an arbitrary amount if it can also harm the tumor by doing so)
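A toy version of that failure (illustrative names and numbers only): score candidate interventions by tumor damage alone, with no term for harm to the patient, and the argmax happily picks the most destructive option.

```python
# Hypothetical treatments with made-up scores, purely for illustration.
treatments = {
    "surgery":        {"tumor_damage": 0.8, "patient_harm": 0.3},
    "chemo":          {"tumor_damage": 0.7, "patient_harm": 0.5},
    "scorched_earth": {"tumor_damage": 0.9, "patient_harm": 0.99},
}

def misspecified_score(t):
    # The objective as written: side effects were never encoded.
    return t["tumor_damage"]

def intended_score(t):
    # One crude way to also value not damaging the patient.
    return t["tumor_damage"] - t["patient_harm"]

worst = max(treatments, key=lambda n: misspecified_score(treatments[n]))
best = max(treatments, key=lambda n: intended_score(treatments[n]))
print(worst)  # scorched_earth: maximal tumor damage, patient be damned
print(best)   # surgery
```

The program isn’t seeking bad outcomes out of malice; the bad option simply scores highest under an objective that omits something the doctors know matters.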

The safe thing about tools like that is that they usually learn from example data and not from their own actions.  If they could learn from the consequences of their own actions, they might learn rules like “outputs that make the humans give me more power/data are associated with better outcomes.”  But generally we don’t let programs do that.  Those dangers would only come up in situations where there is a reason to let the program learn from its own actions (say, if it is trying something no one has done before).
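The distinction can be made concrete with a toy interactive learner (my own construction; the action names and numbers are purely illustrative).  Because it learns from the consequences of its own actions, it assigns positive value to an action whose only effect is to increase its future influence, even though that action earns no direct reward.  A model fit once to a frozen dataset has no update channel through which to learn anything like this.

```python
import random

random.seed(0)  # deterministic run

ACTIONS = ["treat", "gain_trust"]
q = {a: 0.0 for a in ACTIONS}   # learned action values
trust = 1.0                     # hypothetical operator deference

for step in range(2000):
    # epsilon-greedy: mostly take the best-looking action
    if random.random() < 0.1:
        a = random.choice(ACTIONS)
    else:
        a = max(q, key=q.get)
    if a == "treat":
        reward = trust              # effectiveness scales with deference
    else:
        trust = min(trust + 0.1, 5.0)
        reward = 0.0                # gaining trust earns nothing directly
    # one-step value update with a crude bootstrap on future value
    q[a] += 0.05 * (reward + 0.9 * max(q.values()) - q[a])

print(q)  # "gain_trust" has acquired positive learned value
```

Nothing here depends on the system being intelligent; the incentive to seek influence appears as soon as the loop from its own actions to its future reward is closed, which is exactly why keeping tools on frozen example data matters.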