Is there any underlying reason behind the longtime tumblr blog name trend of pluralizing a noun that’s usually or always singular, like “belgiums” or something? Is it just a purely arbitrary meme, or am I missing something?
preamble
Sometimes I wonder what the MIRI-type crowd thinks about some issue related to their interests. So I go to alignmentforum.org, and quickly get in over my head, lost in a labyrinth of issues I only half understand.
I can never tell whether they’ve never thought about the things I’m thinking about, or whether they sped past them years ago. They do seem very smart, that’s for sure.
But if they have terms for what I’m thinking of, I lack the ability to find those terms among the twists of their mirrored hallways. So I go to tumblr.com, and just start typing.
parable (1/3)
You’re an “agent” trying to take good actions over time in a physical environment under resource constraints. You know, the usual.
You currently spend a lot of resources doing a particular computation involved in your decision procedure. Your best known algorithm for it is O(N^n) for some n.
You’ve worked on the design of decision algorithms before, and you think this could perhaps be improved. But to find the improvement, you’d have to shift some resources away from running the algorithm for a time, putting them into decision algorithm design instead.
You do this. Almost immediately, you discover an O(N^(n-1)) algorithm. Given the large N you face, this will dramatically improve all your future decisions.
Clearly (…“clearly”?), the choice to invest more in algorithm design was a good one.
Could you have anticipated this beforehand? Could you have acted on that knowledge?
parable (2/3)
Oh, you’re so very clever! By now you’ve realized you need, above and beyond your regular decision procedure to guide your actions in the outside world, a “meta-decision-procedure” to guide your own decision-procedure-improvement efforts.
Your meta-decision-procedure does require its own resource overhead, but in exchange it tells you when and where to spend resources on R&D. All your algorithms are faster now. Your decisions are better, their guiding approximations less lossy.
All this, from a meta-decision-procedure that’s only a first draft. You frown over the resource overhead it charges, and wonder whether it could be improved.
You try shifting some resources away from “regular decision procedure design” into “meta-decision-procedure-design.” Almost immediately, you come up with a faster and better procedure.
Could you have anticipated this beforehand? Could you have acted on that knowledge?
parable (3/3)
Oh, you’re so very clever! By now you’ve realized you need, above and beyond your meta-meta-meta-decision-procedure, a “meta-meta-meta-meta-decision-procedure” to guide your meta-meta-meta-decision-procedure-improvement efforts.
Way down on the object level, you have not moved for a very long time, except to occasionally update your meta-meta-meta-meta-rationality blog.
Way down on the object level, a dumb and fast predator eats you.
Could you have anticipated this beforehand? Could you have acted on that knowledge?
Oh hey, you were interested in reading about a bunch of people in the real-world community who seem like real good candidates for role-model material? Let me introduce you to Wes
#the smell of corned beef and spules #life is indistinguishable from satire
This was actually meant to be a reblog to main, apologies for the false alarm re: auto-responder’s reblog capabilities
…speaking of which, though, it totally should have reblog capabilities, brb
Verbal brain noise: (excitedly) “think about the mom.com implications here!”
Fire Walk With Me was Twin Peaks’ missing head, and perhaps the cinemagoers of 1992 weren’t quite prepared to find it in the fridge, beside the fruit juice.
To see both points: suppose I’m choosing between an avocado sandwich and a hummus sandwich, and my prior was that I prefer avocado, but I’ve since tasted them both and gotten evidence that I prefer hummus. The choice that does best in terms of expected utility with respect to my prior for the decision problem under consideration is the avocado sandwich (and FDT, as I understood it in the paper, would agree). But, uncontroversially, I should choose the hummus sandwich, because I prefer hummus to avocado.
Thanks!
(Until looking over my tag just now, I had this impression I had said almost this exact thing before, but it looks like I’d only done so much more implicitly and reservedly than I remembered, and anyway not recently, so … )
I really get a lot of value out of it when other people read Almost Nowhere and say things about it. I would be really happy if more people did this.
—————
I’m pretty nervous about feeling like I’m begging for attention and validation here, cf. the way I started off this post with the parenthetical above and even now am derailing it anew with this sentence.
In particular, I have this intuition that “when I do things that people actually like, they become self-advertising” – I didn’t have to write posts like this about TNC, or for that matter about my own nonfiction effortposts here, and if I want a comparable level of interest in AN then (the line of thinking goes) I should just keep writing and making it as good as possible, and “if I build it they will come.”
However, AN is really in a somewhat different situation than those other things. It is a relatively long story – I can imagine it being 2x the wordcount of Floornight by the end – that I am creating very slowly over a number of years, with more care and deliberateness than I’ve applied in the past.
I feel confident that I will complete the whole thing within (to set a goofy upper bound) the next ten years, but I expect it to take at least another 1-2 years, possibly more. I know it’s hard to get people interested in a WIP, or in very piecemeal occasional updates that don’t build an exciting sense of momentum. I know people want to read complete things, and “read it when it’s done” might still be the best option even though it means you’ll be reading it in (could well be) 5 years when you and I and the world are five years older and god only knows what’s happened in the interim. Just, that’s what that option looks like.
—————
And I realize that was an uninviting downer of an advertisement inasmuch as it was an advertisement at all, so here’s another one.
The reason I make posts like this is that I’m extremely proud of Almost Nowhere. Like, distinctly prouder of it than any other creative or quasi-creative thing I’ve ever made, as far as I can tell.
I can’t say it’s strictly better than my previous novels, since they’re all doing different things and can’t be usefully compared like substitutes for one another. But when I re-read the earlier novels, there are parts I like and parts I don’t, there are things I cringe at, places where I think “ugh, I took the easy road” or “oh, I feel bad about this chapter, should I skip it?”
Yet when I re-read AN, as I do every so often, I just feel this sense of pure glee over the whole thing, even parts I wrote 2 or 3 years ago: I like each chapter individually, I like every character and plot thread and theme and verbal motif, I like virtually every sentence. It feels like what I imagine an actor or animator might feel watching their own demo reel, curated to string together only the peaks of their output without anything else. I’m inordinately pleased with what I’ve done here. (Admittedly some of this comes easier with something incomplete, as endings are uniquely hard to pull off for writers in general and me in particular, but still.)
So, if you tend to like things I like, much less things I make, you might really like this one. FWIW.
typicalacademic replied to your post: You know any decent books for modern neural net…
curious what you don’t like about AllenNLP—I’ve found it pretty workable? could totally believe that would change if I were working on slightly different tasks though
Oh, this is a fun question – at work I’m the only person who has used AllenNLP, so I occasionally talk about it but can’t really have a conversation about it
Cut for highly specific shop talk
Hmm… not books specifically, no. For all I know there are some good ones out there, but IMO modern neural net stuff is especially hard to distill usefully into books, and you just have to go to survey papers and code examples.
To some extent this is just because the field moves quickly. It’s also because so many neural nets used in practice today rely on “pretraining,” where instead of doing gradient descent “from scratch” with a completely random starting point, you initialize a subset of your model parameters with those from a model someone else already trained on a large and fairly generic dataset in the same domain. (Here’s a cool paper comparing two different flavors of this for NLP tasks, including a few links to papers about the same kind of thing for images.)
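In miniature, “pretraining” just means copying someone else’s trained numbers into part of your model before you start. A toy sketch of the difference (every array here is a local stand-in I made up, not any real released model):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, embed_dim, n_classes = 1000, 50, 2

# Stand-in for a weight array someone else trained and published --
# in practice you'd download this (e.g. pretrained word vectors).
published_vectors = rng.normal(size=(vocab_size, embed_dim))

# "From scratch": every parameter starts random.
scratch_model = {
    "embeddings": rng.normal(scale=0.01, size=(vocab_size, embed_dim)),
    "classifier": rng.normal(scale=0.01, size=(embed_dim, n_classes)),
}

# "Pretrained": a subset of the parameters is initialized from the
# published numbers; only the task-specific layer on top starts random.
pretrained_model = {
    "embeddings": published_vectors.copy(),
    "classifier": rng.normal(scale=0.01, size=(embed_dim, n_classes)),
}
```

The point of the sketch is just the last dict: part of your parameter set is someone else’s download, part is your own random init.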
Because you need to do this to get anywhere near state-of-the-art performance, the question “how do I make a good neural net model in 2019?” has an answer that is fairly entangled with the specifics of other people’s code and technology choices.
Like, you won’t merely be using a kind of model written by people at (say) Google – which is just some equations that you could implement yourself in principle. You’ll be using an actual array of numbers produced by those people, which you’ll download off the internet. Yeah, in principle you could write your own loader for that array and hook up the loaded numbers to the right ones in your own code, but in practice that might involve reverse-engineering half of TensorFlow or something horrible like that. Realistically, of course, you’re going to use the actual code and packages used by the Google-or-wherever people, which means committing to many of their specific software choices – or you’ll use the code of some third party that can port the same numbers into some other specific software environment.
In short, the core model design stuff is not the hard part (and is continually changing anyway). The hard part is getting familiar with one of the specific software packages you have to use to actually do what all the current papers do, choosing the right one, discovering everything that sucks about it and how to work around that, and similar things.
Anyway, for text classification, you can get up and running with just pretrained word vectors plus some Conv or LSTM layers; if you want to do better you then replace the word vectors with ELMo, and if you want to do even better you replace the whole thing with BERT.
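That “pretrained word vectors plus Conv layers” baseline is small enough to sketch end-to-end in plain numpy. Sizes and weights below are purely illustrative (a real version would use a framework and actually-trained parameters), but the shapes and operations are the real ones:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: tiny vocab, 8-dim vectors, 3-token conv windows.
vocab, d, width, filters, classes, T = 50, 8, 3, 4, 2, 10

E = rng.normal(size=(vocab, d))            # "pretrained" word vectors (stand-in)
W = rng.normal(size=(width, d, filters))   # 1-D conv filters
V = rng.normal(size=(filters, classes))    # linear classifier on top

def classify(token_ids):
    x = E[token_ids]                       # (T, d): look up each token's vector
    # Slide each conv filter over every window of `width` consecutive tokens.
    conv = np.stack([
        np.einsum("wd,wdf->f", x[t:t + width], W)
        for t in range(len(token_ids) - width + 1)
    ])                                     # (T - width + 1, filters)
    conv = np.maximum(conv, 0.0)           # ReLU
    pooled = conv.max(axis=0)              # global max pool over positions
    return (pooled @ V).argmax()           # predicted class

pred = classify(rng.integers(0, vocab, size=T))
```

The ELMo and BERT upgrades mentioned above keep this overall shape but replace the lookup table `E` (or the whole pipeline) with a much bigger pretrained network.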
To get started learning about the types of models people use in NLP, at least before the “BERT for everything” transition, a good start is the papers associated with the trained models in AllenNLP, linked from this page and with live demos here. I don’t necessarily recommend using AllenNLP the package (speaking from experience here), but a nice thing about the project is that they try to reproduce lots of state-of-the-art models inside of it, so they’re a good resource for what those look like.
I don’t know as much about images – I think people start with pretrained models that were trained on ImageNet and there’s a few that are maybe the standard ones?
W/r/t the software choices, a good place to start is to set up Keras with the TensorFlow backend and then run through some tutorials. When I talked earlier about existing stuff you’ll be forced to work with, that code will probably either be TensorFlow-based or PyTorch-based, and for the TensorFlow ones, Keras is a decent usability layer that can sort of shield you from the immense pain and frustration of writing raw TensorFlow.
(Unfortunately, even here things are kind of ugly, because there’s “Keras,” which interfaces with TensorFlow and a few older frameworks, and then “tf.keras,” the copy of Keras inside of TensorFlow that’s technically a different project, and then TensorFlow has its own other usability layer called “Models” or something. Also, I think TensorFlow 2.0 is an attempt to make TensorFlow itself vaguely usable by human beings, but it’s quite new and I’m sure it has its own horrors. Or you could just choose PyTorch. But “Keras (not tf.keras) with TensorFlow backend” is the actual setup I can vouch for personally as being reasonably friendly.)
Edit to add: I also recommend this famous blog post as a reference for all the different fancy gradient descent algorithms people use in the area. Important because all current work uses one of them (and so yours will), usually Adam.
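For reference, the Adam update itself is only a few lines. One step in plain numpy, following the standard formulation with the usual default hyperparameters (the toy quadratic at the bottom is just my own illustration):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update. m and v are running estimates of the gradient's
    first and second moments; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad            # update 1st moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2       # update 2nd moment estimate
    m_hat = m / (1 - beta1 ** t)                  # bias-correct early steps
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy use: minimize f(x) = x^2 starting from x = 1.
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)
```

The division by `sqrt(v_hat)` is the whole trick: each parameter gets its own effective step size, scaled down where gradients have recently been large.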