raattles:

hahahahahahahahahahaha its tiny fucking legs hahahahahahahahahahaaaaa

(via nostalgebraist)

nightpool:

allthingslinguistic:

A classic table of accidental lexical gaps in English, from Language Log.

i think a good argument for sapir–whorf is that I cannot even imagine the concept gestured to by “candible”

(via jadedviol)

birdblogwhichisforbirds:

Distinguished Hellsite Users, I present to you: nostalgebraist-autoresponder is bangin’ out the tunes: an algorithmically-generated album.

This is also a birthday present to Rob, from me.

I’m sure that nostalgebraist-autoresponder needs no introduction for most of you, but just in case she does: Frank (nostalgebraist-autoresponder) is a GPT-2-based bot my husband created with a lot of complex code. She creates a wide variety of posts, and some of these posts look like song lyrics or poetry.

What probably does need a little introduction is Microsoft Songsmith, an experimental research program from 2009. This extremely cringe ad explains it, but the gist is, the program will take anything you sing into the microphone and generate a song around it, giving you certain style options. The fact that this is possible at all is impressive, although the results aren’t exactly amazing (for example, see this version of Queen’s We Will Rock You.)

I don’t have any musical talent, I don’t know how to write songs and (as is probably obvious) the mic on my laptop isn’t great. BUT! With the aid of songsmith I was able to turn some of Frank’s lyrics and poetry into an album. Not a good album, but an album nonetheless. Some of the songs use public domain tunes (sung badly by me), some of them I made up as I went along. It is obviously, massively cringe. But I had fun, and I hope you do too!

Happy birthday Rob. I love you. <3

nostalgebraist:

nostalgebraist:

nostalgebraist:

Will write something up about this later, but here’s something I made today:

logit lens on gpt-neo

This extends my old “logit lens” work to GPT-Neo. Turns out it … doesn’t exhibit the “logit lens” phenomenon at all????

Updated the notebook to add a plot for CTRL, another non-GPT transformer LM.

CTRL does display the “logit lens” phenomenon.

It not only “looks like the output” in late layers, as GPT-2 does; unlike GPT-2, it also “looks like the input” in early layers.

Updated the notebook with many extensions, including a (partial?) solution to the difficulty I originally had interpreting the GPT-Neo results.

It’s got pretty pictures, look!

[Four images: logit-lens plots from the notebook]
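For anyone unfamiliar, the “logit lens” just means decoding each layer’s residual-stream state through the model’s final unembedding, as if that state were the last layer’s output. Here’s a toy sketch of the idea in plain Python — the vocabulary, unembedding weights, and layer states are all made up for illustration (the real version also applies the model’s final LayerNorm before unembedding):

```python
# Toy "logit lens": read every intermediate residual-stream state
# through the final unembedding, as if it were the last layer.
# All numbers here are made up; real models use the final LayerNorm
# plus the unembedding matrix.

VOCAB = ["cat", "dog", "the"]

# Hypothetical unembedding: one weight vector per vocab item (dim 2).
UNEMBED = {"cat": [1.0, 0.0], "dog": [0.0, 1.0], "the": [0.5, 0.5]}

def lens(hidden):
    """Project a hidden state to logits and return the argmax token."""
    logits = {tok: sum(h * w for h, w in zip(hidden, vec))
              for tok, vec in UNEMBED.items()}
    return max(logits, key=logits.get)

# Residual-stream state after each layer (made-up trajectory).
layer_states = [
    [0.1, 0.0],   # early layer: barely informative
    [0.4, 0.9],   # middle layer: "dog" pulling ahead
    [0.2, 1.5],   # late layer: confidently "dog"
]

readouts = [lens(h) for h in layer_states]
print(readouts)
```

In a model that shows the “logit lens” phenomenon, these per-layer readouts drift smoothly toward the final answer; the surprising GPT-Neo result is that the intermediate readouts look like noise instead.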

nostalgebraist:

nostalgebraist:

Will write something up about this later, but here’s something I made today:

logit lens on gpt-neo

This extends my old “logit lens” work to GPT-Neo. Turns out it … doesn’t exhibit the “logit lens” phenomenon at all????

Updated the notebook to add a plot for CTRL, another non-GPT transformer LM.

CTRL does display the “logit lens” phenomenon.

It not only “looks like the output” in late layers, as GPT-2 does; unlike GPT-2, it also “looks like the input” in early layers.

Updated the notebook with many extensions, including a (partial?) solution to the difficulty I originally had interpreting the GPT-Neo results.

interactive notebook with frank’s generator model

I recently uploaded Frank’s generator model to the Huggingface content delivery network.

This let me create a Colab notebook where you can write text using the model.

Check it out if you’re interested in seeing more about Frank’s inner workings!

(Or if you’re familiar with pytorch / ML and want to use the model in your own projects)

Somehow the pagination links (older, newer, page numbers) disappeared on my tumblr page at https://nostalgebraist.tumblr.com/

Looks like the website made them disappear because, on some level, it believes I have infinite scrolling turned on… but it didn’t go all the way and actually turn infinite scrolling on, so there was just no way to move through posts at all.

I should file it as a bug (EDIT: done), but for now, posting this as a heads up in case it happens to you.

I fixed it for now by editing my theme to comment out the block that makes pagination invisible:

/*{block:IfInfiniteScroll}*/
/*#infscr-loading, .pagination {*/
/*  display: none !important;*/
/*}*/
/*{/block:IfInfiniteScroll}*/

nostalgebraist:

Will write something up about this later, but here’s something I made today:

logit lens on gpt-neo

This extends my old “logit lens” work to GPT-Neo. Turns out it … doesn’t exhibit the “logit lens” phenomenon at all????

Updated the notebook to add a plot for CTRL, another non-GPT transformer LM.

CTRL does display the “logit lens” phenomenon.

It not only “looks like the output” in late layers, as GPT-2 does; unlike GPT-2, it also “looks like the input” in early layers.

a-point-in-tumblspace:

nostalgebraist:

Will write something up about this later, but here’s something I made today:

logit lens on gpt-neo

This extends my old “logit lens” work to GPT-Neo. Turns out it … doesn’t exhibit the “logit lens” phenomenon at all????

This is distressing. I’m distressed.

According to my understanding – no, screw my understanding, according to GPT-Neo’s source code – each decoder unit has a residual identity connection, so it outputs “x + F(x)” for some big complicated F, which is helpful because “the identity function is hard for NNs to learn” or whatever. And then you can view the stack of decoders as computing “x + F1(x) + F2(x) + F3(x) + …”, making a series of incremental refinements to the input to ~continuously transform it into the output.

And viewed that way, it almost can’t help but produce nice smooth gradients on your logit-lens plots.

And yet, the 125M GPT-Neo appears to just produce random outputs on the intermediate layers before jumping straight to a reasonable guess on the last layer.

So… either your lens code doesn’t play correctly with GPT-Neo, or my “incremental refinements” understanding is nonsense (worse, useless nonsense). Sound about right?

The result surprised me too, but your statement here is too strong IMO:

And viewed that way, it almost can’t help but produce nice smooth gradients on your logit-lens plots.

As I noted in the original LW post, GPT-2 itself isn’t smooth and gradual everywhere. It makes a huge jump right after the input and changes gradually thereafter.

(In later work – which I should clean up and share sometime – I learned that this jump occurs specifically in the MLP sub-block of the first layer. So it happens in the 2nd thing the network does to the input, rather than the very 1st)

GPT-Neo has the same large jump after the input, since early layers don’t look like the input. The difference from GPT-2 is that it also has another large jump near the end.
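The “incremental refinements” picture from the question above is easy to state concretely. A toy sketch, with scalar “hidden states” and made-up blocks (nothing here comes from the actual models): each block adds its increment to a running residual, so the final output decomposes as the input plus the sum of all per-block deltas — but nothing in the architecture forces those deltas to be small or gradual.

```python
# Toy residual stack: each block computes an increment F(h), and the
# stream carries h + F(h) forward, so the output decomposes as
# x + F1(.) + F2(.) + ...  (scalar states, made-up blocks).

def run_stack(x, blocks):
    h, deltas = x, []
    for F in blocks:
        d = F(h)       # this block's incremental refinement
        deltas.append(d)
        h = h + d      # residual connection
    return h, deltas

blocks = [lambda h: 0.5 * h, lambda h: -0.1 * h, lambda h: 0.2 * h]
out, deltas = run_stack(1.0, blocks)

# The final state is exactly input + sum of increments...
assert abs(out - (1.0 + sum(deltas))) < 1e-9

# ...but nothing forces the increments to be *small*: one huge delta
# (like GPT-2's jump after the first MLP, or GPT-Neo's jump at the
# end) is fully consistent with the residual architecture.
jump_sizes = [abs(d) for d in deltas]
print(jump_sizes)
```

So the residual view guarantees the decomposition, not smoothness: a plot with one or two big jumps and flat stretches elsewhere is just as compatible with “x + F1(x) + F2(x) + …” as a smooth gradient is.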
