Frank is down for the moment because I am staying at someone else’s house for the weekend, and I need to do some network stuff to get her set up over here.

bayesic-bitch asked:

I just saw Frank respond to the text in an image, and I was curious how that happened. Do you have some vision / joint image-text thing tied to GPT-J? Or was she just reading off something like a filename or hidden text description that included that information?

The former.

Whenever Frank “reads” a post, any images in the post are sent to the AWS Rekognition text detection API.

For animated GIFs, I take evenly spaced stills from the GIF (at 1 fps or 10 stills total, whichever is more restrictive) and send them all through that API.
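The still-sampling logic can be sketched like this. This is a minimal illustration, not Frank’s actual code (that’s linked below); the function names `choose_still_indices` and `detect_image_text` are made up for this sketch.

```python
def choose_still_indices(n_frames: int, duration_s: float, max_stills: int = 10) -> list:
    """Pick evenly spaced frame indices from a GIF: one still per second
    of runtime, capped at `max_stills`, whichever is more restrictive."""
    n_stills = max(1, min(int(duration_s), max_stills))
    if n_stills >= n_frames:
        return list(range(n_frames))
    step = n_frames / n_stills
    return [int(i * step) for i in range(n_stills)]


def detect_image_text(image_bytes: bytes) -> list:
    """Send one still through the AWS Rekognition text-detection API."""
    import boto3  # AWS SDK; requires credentials to be configured
    client = boto3.client("rekognition")
    resp = client.detect_text(Image={"Bytes": image_bytes})
    # Keep line-level detections rather than individual words
    return [d["DetectedText"] for d in resp["TextDetections"]
            if d["Type"] == "LINE"]
```

So a 30-second GIF gets 10 stills (the cap), while a 3-second GIF gets 3.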

You can see my (bad) code for this stuff here.

nostalgebraist:

Frank will be down for a little while (~24 hours?)

For a complex, yet extremely boring, reason involving multiple old laptops with battery issues and not enough power cords for them all etc. etc.

Please don’t spam Frank with lots of asks, reblogs or replies during downtime

Queued textposts will still publish

I’m bringing Frank back online now, a little ahead of schedule. (Underpromise and overdeliver…)

I deployed a new version of Frank’s generator model today.

This one is still GPT-J, and is generally similar to the previous one.

However, I’ve worked out a lot of the kinks in my GPT-J fine-tuning code/process since I did it the first time.

For example, due to a bug, the first model did not follow the intended learning rate schedule, and its learning rate was much lower overall than what I wanted.

Click here to see a report with much more info, including loss plots over the course of training.
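For context, a common fine-tuning schedule is linear warmup followed by cosine decay. This is a generic illustration only; the schedule shape and every hyperparameter here are placeholders, not the values from Frank’s training code.

```python
import math

def lr_at(step: int, max_lr: float = 1e-5, warmup: int = 100, total: int = 10_000) -> float:
    """Linear warmup to `max_lr`, then cosine decay to zero at `total` steps.
    All numbers are placeholders for illustration."""
    if step < warmup:
        # Ramp linearly from 0 up to max_lr over the warmup steps
        return max_lr * step / warmup
    # Cosine decay from max_lr down to 0 over the remaining steps
    progress = (step - warmup) / (total - warmup)
    return 0.5 * max_lr * (1 + math.cos(math.pi * progress))
```

A bug anywhere in this kind of logic (e.g. mis-set `total`, or progress computed against the wrong step counter) silently trains at the wrong learning rate, which is the sort of failure described above.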

---

I don’t know if this model is much different qualitatively from the last one. Output feels broadly similar to me.

However, it does achieve much better validation loss than the previous one: 1.73 vs the old 1.91. That’s similar in size to the gains I got from the original move to GPT-J.
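Assuming these are per-token cross-entropy losses in nats (my assumption; the post doesn’t say), the change corresponds to a meaningful drop in perplexity:

```python
import math

# Average cross-entropy in nats converts to perplexity via exp(loss)
old_loss, new_loss = 1.91, 1.73
old_ppl = math.exp(old_loss)  # roughly 6.75
new_ppl = math.exp(new_loss)  # roughly 5.64
print(round(old_ppl, 2), round(new_ppl, 2))
```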

But I’m not entirely sure how much I trust gains on my validation data to translate into qualitative improvements. There’s a tradeoff between achieving low loss on my tumblr data and retaining performance on the much more general pre-training dataset, or on other generic capabilities.

To quantify the tradeoff, it’d be cool to check how fine-tuning affects the few-shot benchmarks using EleutherAI’s eval harness… the codebase is set up to do this during pre-training, but it will take some work to do the same thing during fine-tuning.

florescent--luminescence-deacti asked:

Was Frank trained on any languages other than English?

Frank’s generator is a fine-tuned GPT-J.

The fine-tuning corpus was all English. The much larger pre-training corpus was The Pile, which is almost all English but contains a non-trivial amount of non-English data.

See the Pile paper for details.

If you spam Frank with a lot of low-effort asks/reblogs/replies in a short time, I will probably put you on a list of rate-limited users until you slow down.

This makes Frank respond to you much less often than normal, which is presumably the opposite of what you want.

When I get around to it, I’ll have Frank do this herself automatically.
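A sliding-window counter is one simple way to do the automatic version. This is a hypothetical sketch of the kind of rate limiting described, not Frank’s code, and the thresholds are made up:

```python
import time
from collections import deque

class RateLimiter:
    """Flag users who exceed `max_events` interactions within `window_s` seconds.
    Thresholds here are illustrative, not the real ones."""

    def __init__(self, max_events: int = 3, window_s: float = 60.0):
        self.max_events = max_events
        self.window_s = window_s
        self.events = {}  # user -> deque of event timestamps

    def record(self, user: str, now: float = None) -> bool:
        """Record one interaction; return True if the user should be rate-limited."""
        now = time.time() if now is None else now
        q = self.events.setdefault(user, deque())
        q.append(now)
        # Drop timestamps that have aged out of the window
        while q and now - q[0] > self.window_s:
            q.popleft()
        return len(q) > self.max_events
```

Once a user is flagged, responses to them can simply be sampled at a lower probability until their recent-event count drops back under the threshold.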

julionasurbonas asked:

have you noticed that frank is starting to link wikipedia pages that actually exist now

She has been able to do this for a long time, as the Wikipedia URL format is very predictable.

The new model might have increased the success rate here. Not sure, though.
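The predictability is easy to see: an English Wikipedia article URL is just the page title with spaces replaced by underscores, percent-encoded, under `/wiki/`. A quick illustration (`wikipedia_url` is a made-up helper, not part of Frank’s code):

```python
from urllib.parse import quote

def wikipedia_url(title: str) -> str:
    """Build an English Wikipedia article URL from a page title."""
    return "https://en.wikipedia.org/wiki/" + quote(title.replace(" ", "_"))
```

So a model that emits a plausible page title has a decent chance of emitting a working link, with no knowledge of which pages actually exist.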

nostalgebraist:

nostalgebraist:

nostalgebraist:

FYI:

Frank is getting an unusually large quantity of asks and other responses tonight. I don’t think I’ve ever seen Frank’s inbox this busy since I turned anon off.

A backlog of asks/etc. has built up, and they keep coming in.

Response times are abnormally slow because of this.

Tonight, Frank continues to be atypically popular.

She also has 17 unanswered asks in her inbox.

The new model is slower, which explains some of the backlog, but I think there’s also been much more demand than usual lately. Did one of her posts go viral or something?

Now, a day later, the surge appears to be over.
