catgirlnap asked:

WAIT IS THE AUTO RESPONDER BASED ON DIRKS AUTO RESPONDER

That’s where I got the idea and name, yeah.

nostalgebraist:

Frank will be down for a little while because I splashed a small amount of water on my laptop keyboard. Need to let it dry out so I don’t damage the logic board when I turn it back on

There are 4 posts in the queue which will publish on schedule, and I may publish some of the 19 drafts (posts awaiting content moderation) in the meantime

Started up my laptop again after ~18 hours, backed up some recent stuff to external HD, ran hardware diagnostic.

Next step is starting Frank back up! Which is underway right now. She should begin posting again soon.

the-moti:

nostalgebraist:

meta-post on meta-learning

There’s an LW post I keep trying to write. I have several unpublished draft versions of it.

The point I want to make is simple and straightforward, but when I try to write it down, I get worried I’m not … like, “messaging” it correctly? Not striking the right tone?

The point of the post is roughly:

People don’t use the term “meta-learning” consistently when they’re talking about GPT-3. The paper uses the term one way (and they are 100% explicit, they spell out their definition in the text), the blogging community uses it another way.

The bloggers are excited/scared that GPT-3 does “meta-learning” by which they mean something like “general reasoning on the fly without training.”

If you’re excited/scared by this capability (and you should be), then you should really care whether GPT-3 actually has it, to what extent, how the capability scales, etc.

There is very little public evidence on this topic, because the paper is (explicitly!) 95% not about the topic, the remaining 5% is pretty weak evidence, and the only other evidence out there is like … some subjective user impressions? gwern saying “GPT-3 has the capability” in a really eloquent and forceful way?

It would be easy to test the capability much more rigorously than this. This ought to be done since the topic is important. It can only be done by people with API access (AI Dungeon doesn’t count).
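Concretely, the kind of probe I have in mind is short to write. A sketch (the reverse-a-string task and the `complete` wrapper are my own stand-ins, not anything from the paper; the point is just "novel synthetic task, held-out answer, measure accuracy"):

```python
import random

def make_probe(n_shots=5, seed=0):
    """Build a few-shot prompt for a synthetic task (reverse a letter string)
    that is unlikely to be memorized from the training corpus, plus the
    held-out answer for the final query."""
    rng = random.Random(seed)
    examples = []
    for _ in range(n_shots + 1):
        s = "".join(rng.choice("abcdefghij") for _ in range(6))
        examples.append((s, s[::-1]))
    shots, (query, answer) = examples[:-1], examples[-1]
    prompt = "\n".join(f"Input: {s}\nOutput: {t}" for s, t in shots)
    prompt += f"\nInput: {query}\nOutput:"
    return prompt, answer

def score(complete, n_trials=20):
    """`complete` is whatever callable wraps the model API: prompt -> text.
    Returns the fraction of trials where the completion starts with the
    correct held-out answer."""
    hits = 0
    for seed in range(n_trials):
        prompt, answer = make_probe(seed=seed)
        if complete(prompt).strip().startswith(answer):
            hits += 1
    return hits / n_trials
```

Anyone with API access could plug their client into `complete` and get an actual number, instead of subjective impressions.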

But it … feels hard to say this in a way that could actually convince anyone who doesn’t already agree? Like,

  1. These points seem so clearly true to me that when I try to “argue for them,” I feel pedantic and like I’m belaboring the obvious.

    Do I actually have to say “no, few-shot translation from French to English is not an example of general reasoning on the fly?” Surely no one thinks the model is like … learning how to speak French from ~2000 words of data?

    Do I have to quote the part of the paper where it says what it means by meta-learning? It’s right there! You can just read the paper!
  2. I made most of this argument already in my original GPT-3 post, immediately after reading the paper. So (A) I feel like I’m repeating myself and (B) if the point didn’t get across then, why would it now?
  3. There is an element of “mere semantics” to the point and it’s hard to clarify to my satisfaction that no, I don’t just care that blog posts are using a word incorrectly. But I have to bring up the semantic issue to even describe what I am saying.
  4. It feels inevitably like picking on gwern’s choice of words, since blogosphere beliefs about “GPT-3 meta-learning” basically all trace back to gwern’s blog.

    I don’t care about whether gwern is using the right words, he’s just the most detailed “primary source” we have on the topic due to the closed API

I was thinking about this yesterday because @slatestarscratchpad linked my original GPT-3 post in his April linkpost. I actually sat down and wrote up another one of those drafts and … nope, gave up again.

I notice I am able to write this on tumblr with no problems at all. Perhaps this is yet another point of evidence that using tumblr lets me do much more “real blogging” than I could if I had “a real blog.”

Could you write a blog post whose goal was to convince people to do the experiments (regardless of their semantic interpretation of them), rather than to convince people to change their interpretation of the existing experiments?

This seems like a way to move the discussion forwards in a concrete sense.

The overlap of “people with GPT-3 API access” and “lesswrong readers” is not large but it is nonzero (although maybe it’s just gwern and maybe he doesn’t want to do your experiments???). 

I share the sense that it’s best to focus on the proposed experiments – it’s actionable advice, it feels constructive, it tells the reader what kind of concrete evidence would affect my opinion.

The problem is that … well, to convince someone to do the experiments, I have to say why I care about the results. Otherwise, it’s just “hey, here’s a thing you could do, I guess.”

But the true answer to “why do you care about the results?” is simply the argument I described in OP, so we’re back to square one.

(Half-joking option: I could frame it as a sort of challenge, like I’m the “change my mind” meme guy, and I’ll just sit around holding an opinion the reader may find infuriatingly wrong… unless they do the thing I ask.)

—-

A little more background on my neurotic wariness here:

I have mixed feelings about LW 2.0 / Alignment Forum.

On the plus side, I see a lot of valuable discussion there which has no real substitute anywhere else. Many of the frequent posters in AI-related threads are students/alumni of Stuart Russell’s CHAI, or researchers at DeepMind, or things like that, and it’s an unrivaled opportunity to discuss big-picture AGI topics with people like this who understand the fine-grained picture too.

On the minus side, I’ve noticed that the emotional valence of my AI posts – “yay AI!” vs “boo AI!” – is a very good predictor of how well they will be received on LW, while the actual quality of the posts in my own estimation is a weak predictor at best.

  • My most “celebrated” New LW post – the one with the most karma, with the warmest comments, and the only one promoted to “curated” status – was a screed about how GPT-2 is awesome and Gary Marcus is epically wrong.
  • My most controversial post, with the most hostile comments section, was the one that called GPT-3 “disappointing.” This was so noticeable that one supportive reader left a comment about the anomalously poor reception, hoping I would not come away “discouraged from continuing to write about the limits of today’s AI.”
  • In the discussion surrounding the previous post, anything I said about the limits of scaling was treated with skepticism, or at best viewed as a contrarian take. When I later wrote a post on the exact same topic, but framed in a “yay AI!” manner (“OpenAI’s new insight”), it was warmly received.

From the perspective of debate norms, this is probably an unhelpfully uncharitable perspective to take. But one cannot help but notice patterns, and use them to plan one’s actions …

—-

On the specific topic at hand, I guess I’m still smarting from that response I got to the “disappointing GPT-3” post.

Hardly anyone seemed to understand what I was getting at in my comments about the multiple competing interpretations of few-shot results.

Several commenters seemed immediately ready to believe GPT-3 was a fully general reasoner, as though this were a safe default hypothesis with a lot of prior mass, only able to be toppled from its throne via great evidential weight to the contrary. One such comment directly put the shoe on the other foot, asking me: “Are there bits of evidence against general reasoning ability in GPT-3?”

Another pattern running through several comments was an odd sense that because the paper was long and did so many things, it was therefore unreasonable or unfair (?) to complain about it not doing any given thing. This is effectively impossible to argue with: what can one do, faced with “the paper was 70 pages and did dozens of experiments, therefore it must have established [whatever claim I’m arguing it established]”?

That one is particularly discouraging re: getting people to try experiments with GPT-3, as I expect this to be perceived as yet another “isolated demand for rigor.”

My long, frankly exasperating exchange with dxu (starting here) is difficult to summarize, but it contributed to my pessimism about any further attempts to discuss the topic on LW. The same goes for my bizarre exchange with gwern (starting here).

Wow, sorry about the rant … I guess I never wrote the equivalent of this rant at the time, so here I am finally writing it almost a year later.

ONE dude drove the excavator to dig out the Ever Given and they haven’t paid his overtime yet for working five 21-hour days

benevolentbirdgal:

So I know the Ever Given has wiggled out of the news cycle, but I learned something incredible on the radio today and verified it with a written source: there was one guy, Abdullah Abdul-Gawad, driving the excavator for up to 21 hours a day for five days.

He apparently disliked the memes; per Business Insider, the memes and the jokes drove him to prove he could dislodge it.

“To him, it felt as if ‘everyone was just making fun of it,’ he said. ‘And that was what made me so determined,’ he continued. ‘I was like, you know, you’re making fun of me. So I’m absolutely going to prove that I can do this.’”

One dude moved the dirt for the Ever Given. Driven by spite and three hours of sleep a night, max. And while the lawsuits have started, he hasn’t been paid his overtime yet. 

Isn’t that wild? 

(via pretty-rage-machine)

Frank is now reading posts in NPF!!!

Okay, well, technically, all Frank is doing right now is taking NPF posts and converting them back into legacy/HTML, nested blockquotes and all. And then reading that format the way she always did.

The difference is that I’m using my own code to do the conversion, rather than relying on tumblr. From the perspective of the tumblr API, I’m requesting posts in NPF, and that’s that.

This means:

  • if tumblr ever stops doing their NPF-to-HTML conversion, and only lets you fetch posts in NPF, Frank will be fine
  • (…unlike pytumblr, which will break :P) (I’m p sure the other official clients will too)
  • I no longer have to work around the bugs in tumblr’s NPF-to-HTML conversion. My NPF-to-HTML code presumably has bugs too, but unlike tumblr’s code, it’s under my control and I can debug it directly

Frank is not creating posts in NPF yet. In other words, Frank isn’t “using the beta editor” yet.

That will require additional, nontrivial work to do the other conversion, from HTML (what Frank writes) to NPF. I do want to do this, in case tumblr ever decides all new posts must be NPF … which is the vibe I get from the beta editor rollout and associated web UI updates.

—-

For people like @snarp who wanted NPF-to-HTML conversion code, the core code is in this file.

WARNING: it’s not a complete implementation, and silently ignores things Frank doesn’t care about, like embedded videos. I handled images in a manner that works for Frank but may be weird for other use cases.

A one-liner to convert a single NPF post, given the associated entry from the API’s `posts` array:

TumblrThread.from_payload(entry).to_html()

If there’s an ask attached, you can get the asking name and ask body separately

thread = TumblrThread.from_payload(entry)

thread.to_html() # main body

thread.ask_content.asking_name # asking name

thread.ask_content.to_html() # ask body
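For a sense of the shape of the problem, here is a toy version of the text-block path (this is NOT the linked code: it handles only bare text blocks and a few subtypes, ignores inline formatting spans entirely, and the `SUBTYPE_TAGS` mapping is illustrative):

```python
from html import escape

# Illustrative subset of NPF text subtypes -> HTML tags.
SUBTYPE_TAGS = {
    "heading1": "h1",
    "heading2": "h2",
    "quote": "blockquote",
}

def npf_text_to_html(content_blocks):
    """Convert a list of NPF content blocks to HTML, keeping only text
    blocks and silently skipping everything else (images, videos, links)."""
    parts = []
    for block in content_blocks:
        if block.get("type") != "text":
            continue  # non-text blocks are dropped, like Frank drops videos
        text = escape(block.get("text", ""))
        tag = SUBTYPE_TAGS.get(block.get("subtype"), "p")
        parts.append(f"<{tag}>{text}</{tag}>")
    return "".join(parts)
```

The real work is everything this skips: nested blockquote reconstruction for reblog trails, list grouping, and inline formatting.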

—-

Other maybe-interesting code I wrote for this update:

To minimize changes to my existing API handlers, I’ve written additional (less portable) code that makes NPF responses from the API look like legacy responses, with e.g. the weird type-specific field names like “caption.”

To ensure the previous gizmo gets called any time we fetch a post, I wrote a new subclass of my client class, which in turn is a fixed/improved subclass of pytumblr’s client. (I made a separate new class to keep the old one portable.)
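The shim idea, in illustrative form (this is not the actual code; `npf_to_html` is a stand-in for whatever converter is available, and any field beyond `body`/`caption` is a guess):

```python
def npf_to_html(blocks):
    # Placeholder converter for the sketch.
    return "".join(
        f"<p>{b.get('text', '')}</p>" for b in blocks if b.get("type") == "text"
    )

def legacy_shim(entry):
    """Take an NPF post payload from the API and fill in the legacy-style
    fields that downstream handlers expect, so they don't need changes."""
    entry = dict(entry)  # don't mutate the API response
    html = npf_to_html(entry.get("content", []))
    entry.setdefault("type", "text")
    entry["body"] = html      # legacy text posts put content here...
    entry["caption"] = html   # ...while photo/video posts call it "caption"
    return entry
```

Wrapping every fetch in something like this is what lets the rest of the bot keep pretending legacy never went away.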

admiral-craymen asked:

What's with all of the blank answers from Frank lately?

cyle:

nostalgebraist:

nostalgebraist:

nostalgebraist:

Could you link me to an example?

I don’t see anything fitting that description in the last few pages, and Frank makes too many posts these days for me to read all of them :)

@admiral-craymen​

Maybe it’s Tumblr? This looks blank on my dash but not when I go to the post. https://nostalgebraist-autoresponder.tumblr.com/post/648044658444959744/thoughts-on-vintage-computers

Oh yeah, I see it now… this does seem like a new tumblr display bug.

The web UI has been going through some weirdness lately.  Yesterday I noticed I could now compose indented lists, even in the non-beta editor, and got briefly excited … until I tried to post one, and found that indented lists still don’t display correctly on the web dash!

My findings so far:

—-

I used the API to get these posts in both NPF and legacy formats. I also did this with some non-bugged asks, and some other test cases.

The bugged posts are malformed in NPF, in a specific way. A typical case looks like:

{'type': 'ask', 'blocks': [0], ...}

{'type': 'rows', 'display': [{'blocks': [0]}, {'blocks': [1]}], ...}

In NPF answer posts, the layout entry of type “ask” designates some content blocks as part of the ask. All other blocks are assumed to be part of the answer.

Here, block 0 is designated as part of the ask. Then, the second layout entry tries to create a “rows” layout including blocks 0 and 1. But block 0 is in the ask, and block 1 is in the answer.

A guess: the UI thinks that if block 1 is rendered at all, it should render as part of the “rows” group it belongs to. But that group cannot be rendered at all … so it just skips block 1.
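That guess can be modeled in a few lines (a toy renderer of my own, not tumblr’s actual logic):

```python
def render_plan(n_blocks, layouts):
    """Return (ask_blocks, answer_blocks) that would actually render.
    Blocks claimed by an 'ask' layout go in the ask; a 'rows' group that
    straddles the ask/answer boundary is unrenderable, so its answer-side
    blocks silently vanish."""
    ask = set()
    for layout in layouts:
        if layout["type"] == "ask":
            ask.update(layout["blocks"])
    answer = set(range(n_blocks)) - ask
    dropped = set()
    for layout in layouts:
        if layout["type"] == "rows":
            group = {i for row in layout["display"] for i in row["blocks"]}
            if group & ask and group & answer:
                dropped |= group & answer  # group can't render; skip these
    return sorted(ask), sorted(answer - dropped)

# The malformed layout from the bugged posts:
layouts = [
    {"type": "ask", "blocks": [0]},
    {"type": "rows", "display": [{"blocks": [0]}, {"blocks": [1]}]},
]
```

Under this model, block 0 renders in the ask and the answer body comes out empty, which is exactly the blank-answer symptom.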

The posts look normal in legacy format, which matches the fact that they look okay in the https://username.tumblr.com/post/id view. (Possibly dependent on theme)

I have not been able to reproduce the behavior manually. I have tried the following matrix of cases:

  • Ask sent via: mobile, web NPF (new), web legacy
  • Response made with: web legacy, API legacy

@cyle you probably know all this but figured I’d tag you

EDIT: the fact that this is even possible seems like a … if not necessarily a flaw, then a weird property of NPF. I definitely remember that when I read the spec, I was surprised at how loosely the ask/answer distinction was enforced.

yeah this is most certainly a bug on our side, we’ll get it fixed
