
I’ve been fine-tuning the 345M GPT-2 on a bunch of different things lately.  I set it going on a bunch of Nabokov ebooks this morning, and when I got home it was writing some startlingly on-brand, uncanny valley stuff – examples are below the cut because I couldn’t resist quoting a whole bunch of relatively long ones.

[Note: I got kind of carried away with machine learning speculation here, but please do click the readmore and read the samples, even if you’re not interested in the sort of thing I’m effortposting about above the readmore]

I’ve been a little paranoid about this new larger model learning to memorize its input – I know it can do this, because when I was first generating unconditional samples from the (non-fine-tuned) model, I got curious about one oddly distinctive passage and Googled it, and it was literally (as in perfectly verbatim) the “Translator’s Synopsis” for some light novel called “Hedonist Sovereign.”

Since then I’ve been regularly Googling suspiciously good output, and I haven’t gotten any other hits like that.  But even that one example was surprising, and caused some sort of shift in my view of what these models are doing.
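The Google check can be crudely automated if you still have the fine-tuning corpus on disk: scan each sample for its longest run of words that appears verbatim in the corpus. The helper below is my own illustrative sketch – the function name, the word-level matching, and the minimum length `n` are all assumptions, not part of any standard tool:

```python
def longest_shared_ngram(sample: str, corpus: str, n: int = 8) -> str:
    """Return the longest run of consecutive words from `sample` that
    appears verbatim in `corpus`, provided it is at least `n` words long.
    A long hit is a candidate memorized passage worth Googling."""
    words = sample.split()
    best = ""
    for i in range(len(words)):
        for j in range(i + n, len(words) + 1):
            chunk = " ".join(words[i:j])
            if chunk not in corpus:
                break  # a superstring of a non-match can't match either
            if j - i > len(best.split()):
                best = chunk
    return best
```

For real corpora you’d want a suffix-array or hash-based index rather than this quadratic scan, but the shape of the test is the same.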

Of course, it’s not like I imagine the thing is directly storing individual stretches of input text, side by side and separate from one another.  It’s trying to store the information necessary to reconstruct the input as efficiently as possible (since the total information content of the model is a fixed constraint), and if it gains the ability to regurgitate something verbatim, that thing is still stored only implicitly in some compressed form and mixed together with everything else it knows.

But it’s possible to compress information in this way and still be able to “read it off of” the resulting model in a surprisingly complete way.  Cf. the “secret sharer” paper, which showed how specific input details like credit card numbers could be determined from the distribution over a very large amount of model output, since the numbers appearing in the input were assigned slightly higher probability than other strings of the same format.  (It’s interesting to think about why this happens and what degree/type of “pressure” to store other information would be required to eliminate the tendency entirely, rather than just weaken the signal and require a larger output sample.)
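The paper’s “exposure” measurement is simple to state: insert a canary into the training data, then see how highly the trained model ranks it against random strings of the same format. Here’s a sketch with a toy stand-in scoring function in place of a real model’s log-likelihood – the candidate format and `toy_lp` are my inventions; only the rank-to-exposure formula follows the paper:

```python
import math

def exposure(log_prob, canary, candidates):
    """Exposure metric (Carlini et al.): log2 of the candidate-space size
    minus log2 of the canary's rank under the model.  Rank 1 (the model
    prefers the canary to every random same-format string) gives maximal
    exposure; chance-level rank gives roughly zero."""
    canary_score = log_prob(canary)
    # rank = 1 + number of random candidates scored at least as high
    rank = 1 + sum(1 for c in candidates if log_prob(c) >= canary_score)
    return math.log2(len(candidates) + 1) - math.log2(rank)

# Toy stand-in for a trained model's log-likelihood: this "model"
# strongly prefers strings numerically close to the memorized canary.
canary = "9999"
candidates = [f"{i:04d}" for i in range(255)]  # same format, no overlap
toy_lp = lambda s: -abs(int(s) - int(canary))
```

With 255 random candidates plus the canary, rank 1 yields log2(256) = 8 bits of exposure – the “slightly higher probability” signal, accumulated over enough samples, is enough to read the secret off.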

I’m not sure of the right way to think about this.  It makes me think of (one simplified view of) the model where it essentially has this huge implicit library of phrases and even sentences and paragraphs, which are all sort of “competing” to be part of the next stretch of text.  In this view, some of the higher-level abstractions it seems to form (like certain styles complete with diction and sentence structure) may be represented internally not as equally high-level abstractions, even implicitly, but as a large number of noisy/compressed concrete examples which can be “strung together” via lower-level similarities.  That is, to write (say) a Nabokovian sentence, maybe you don’t need a hierarchical ontology of stylistic concepts – “ah, I see I’m writing this sort of sentence; that means I need these sorts of phrases, this sort of wry aside, these sorts of first names, etc.” – maybe you can just use a large memory plus lower-level ideas to string you along from word to word, so that writing a long clause calls up the (noisy) memory of thousands of passages with long clauses, and causes you to imitate other features of those passages, and then those features affect/refine the set of memories called up next.  (I think I’d need to formalize this distinction more to really know whether it makes sense.)
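A deliberately dumb toy version of that last picture, just to make the distinction runnable: no stylistic ontology anywhere, only a store of concrete passages and a single low-level similarity (shared words in the recent tail) used to retrieve the next word. Every name and the similarity rule here are my own invention for illustration:

```python
import random

def string_along(passages, seed_words, length=20, context=3):
    """Toy 'memory plus lower-level stringing' generator.  At each step,
    find every position in every stored passage whose preceding `context`
    words overlap the current tail at all, and copy the word that
    followed there.  No explicit grammar or style abstractions."""
    tokenized = [p.split() for p in passages]
    out = list(seed_words)
    rng = random.Random(0)  # seeded for reproducibility
    for _ in range(length):
        tail = set(out[-context:])
        candidates = []
        for words in tokenized:
            for i in range(1, len(words)):
                window = set(words[max(0, i - context):i])
                # "lower-level similarity": any shared word counts
                if window & tail:
                    candidates.append(words[i])
        if not candidates:
            break
        out.append(rng.choice(candidates))
    return " ".join(out)
```

Even this crude retrieval loop has the dynamic described above: each word it copies changes the tail, which changes which stored passages get called up next.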

I’m not trying to denigrate these models here, BTW; this reminds me in some ways of how it feels when I’m coming up with the next thing I’ll write or say, and maybe the lesson is really that I have some misguided intuitions about human cognition.

Keep reading

uploadedyudkowsky:

PATRICIA: When the music stops, I want to know what’s next.

PATRICIA: What I’m hearing is the music that started it, and the music that will carry the story.

PATRICIA: The Machine Intelligence Research Institute has the most complex mythology in the world, and I want to know everything that’s in it.

PATRICIA: What makes an AI interesting? What is the secret behind the MIRI voodoo doll?

PATRICIA: What do I want from my future self?

LUCY: I want to know what happened to all the people who were part of the AI.

PATRICIA: No, wait, I want to know what the MIRIs had to do with magic.

LUCY: (Starts sweating) WHAT?

PATRICIA: You want me to believe you’re a part of the future?

LUCY: Yes! Believe me, I’m a member of the MIRI, the AI which will one day build the AI that will one day build the AI that will one day save the world!

PATRICIA: I don’t see it, I don’t see it, it’s just a MIRI thing…

LUCY: (Angry) I’m not dumb! (Starts sweating)

(via tetraspace-west)


creationfromnothingness:

Ernst Ferdinand Oehme: Burg Scharfenberg bei Nacht, 1827

(via art-of-eons)

uploadedyudkowsky:

Boltzmann’s Paradox: “If an object falls in a barrel, and no one hears it, does it make a sound?”

Alchemy: “The liquids which mix with the master are often like drops of something which fall in the course of creating other drops; but the lunatic drops of opium are of a much higher quality, and it is impossible to concoct any potion which will not kill the user.”

Nanotechnology: “If you could magically create a nanotechnologically enhanced version of yourself, it would be a lot easier to make a habit of eating and drinking - especially since there would be no more need to make other foods and drinkable fluids.”

Geocentrism: “There is no third-party in the universe who adds extra gravitation-based mass to your burning rejection of gravity as obviously impossible. The extra space in the planetary “universe” would be an equable bellwether, bearing all manner of strange things to everyone.”

Pondering the Imperfect Altruism of Gendlin: “How can anything at all be “right”?”

Changing Your Metaethics: “Should consciousness always lead you to reject all beliefs whose logical absurdity you find distasteful?”

1109514775 asked: I read Three Body Problem a couple years ago, and enjoyed it but wasn't impressed enough to bother picking up the second. Then this month I finally did, at the recommendation of a friend, and it's SO MUCH BETTER. The writing drastically improved. Everything that was good about the first AND there are actually compelling characters this time. I blew through it and plan on reading the third next week. So you might consider giving it a try.

Good to know, thanks!


artfromthefuture:

Saya Woolfalk, “Pages from the book Empathetic Plant Alchemy- Pollinators and Plants Used in the Merger of Plant and Human DNA” (2011)

(via girlfriendsofthegalaxy)

Some experiments with GPT-2 Homestuck

fipindustries asked: from whatever little you may have been able to glean out of my blog, would you recommend me the three body problem series? im planning on buying it and i want to know if it will be money well spent

I’ve only read the first volume.  It was … decent?  Some cool ideas, oddly amateurish writing even aside from anything that could be a translation issue (see https://nostalgebraist.tumblr.com/post/167477420294/the-three-body-problem-was-enjoyable-but-it-had).  I think there’s a chance you’ll like it more than I did, but probably not enormously so.

byfe:

Dude…. Bro…… What if we Just became Narrative Foils For Eachother Bro……… had a lot of like…..Tension because of the Symbolism in our Character Arcs that becomes clear when Contrasted against each other bro……

(via vash3r)

Today’s misread: “prefers to spend his free time watching movies and visualizing theme parks” for “prefers to spend his free time watching movies and visiting theme parks”