
@slatestarscratchpad

This is something a lot of people have complained about.

There are things that I get the impression happen all the time - for example, sophisticated people criticize New Atheism for not engaging with religion on the right level. Or people freak out because they got a bad score on an IQ test. Or people get called “autistic” as an insult. Or whatever. I know this happens from lived experience / seeing it again and again.

And I want to talk about that as an example of something, but I know from bitter experience that if I claim something happens, then people who find my point inconvenient will say it practically never happens and I’m making it up.

So then I link to ten examples of it happening, in relatively famous publications that seem like a good cross-section of the culture, and people tell me this is annoying, or weak-manning, or something.

What exactly am I supposed to do here? How do other people handle this?

I think it’s just a matter of connotation.  As a hypothetical, if you were to literally include a footnote with this exact block of text every time you wrote a long chain of links, that would be … strange, and not a good idea, but it would completely clear up the problem.  The problem is that, on the page without such explanation, it looks more like you’re saying that the profusion of links should constitute strong, or even conclusive, evidence to the reader – rather than weak evidence that is better than nothing.

If anything, using fewer links might help, or just prefacing with something like “here are a few more or less random examples,” where “a few” clarifies that the sheer quantity is not meant to have much convincing force, and “random” suggests that they are selected – like the proverbial colored balls from an urn – out of a much larger pool of your experiences, some of which happened IRL or are otherwise impossible to hyperlink.

(I’m not saying you never use that kind of phrasing, just that more would be welcome.)

(via slatestarscratchpad)

I don’t think about it much anymore, but it’s astonishing to me that the norm of “heterosexual men ask women out, not vice versa” is still so strong everywhere.  Including in places that are gender-nonconformist in other ways (e.g. young liberal urbanites on OKCupid, your average liberal arts college).

Like, I had some agonized times in college wondering why I could never get a date, and yeah there were various reasons, but one of them was just this hilariously trivial misunderstanding: I figured, since I was in this super-progressive environment, stuff like that had already been evened out to 50-50 and so a typically attractive guy could expect to get propositioned by a woman at least once in a while, and so I must have been less than typically attractive.  Again, in some ways I probably was, but I think that was much less than half of the problem.

It was pretty weird when, much later, I was trying to figure out how to use a dating website and people would tell me, yeah, you’ve gotta send out lots of messages out of the blue, after all no woman is ever gonna message you, and that’s not about you it’s just how it is – and I would think, “even in these crowds?”  Yes, in these crowds, in every crowd on this vast variegated earth.

You’d think this would be such a heartwarming, across-the-aisle issue, too!  And maybe there are deep obstacles here involving, I don’t know, male/female size/strength disparities, or the ~biotruths~ of desire, or something?  But if we’re already going around smashing things that look like they might be social constructs to see what happens, why not give this one a try, right?

kontextmaschine:

bambamramfan:

the-grey-tribe:

I feel called out by Hotel Concierge, but at the same time, I’m starting to see through HC’s and Scott’s respective writing styles.

HC is easy. HC is channeling TLP.

Scott writes in fits and starts, and in the latest flurry he started to cut corners that laid bare the ever-same skeleton of his essays, and then he wrote posts that were just a single idea, without any additional scaffolding.

Once, in a history test, I wrote a nested outline at the beginning of my essay on the causes of World War One. I ran out of time to fill in the structure, so I crossed out the outline and skipped two sub-headings – I should have erased it completely. My teacher could see what I had left out, and deducted points accordingly. To this day, I’m certain that if the outline had been completely unreadable, not just crossed out, I would have gotten a B instead of a C. I’m not bitter about this after all these years, but there was a lesson to be learned, and I learned it: do not make the negative space in your concept map too obvious if you want to impress people. The systematising mindset is not your friend here.


HC has ten good ideas in five posts, but only six if you don’t count duplicates, and none of these are new, not even the duplicates. I think I’m being so hard on these people because the quality of their ideas fluctuates wildly, while their writing stays the same. My writing fluctuates too. I am not a wooden persona. I am a real boy.

I think there’s a lot of similarity in the writing styles of hotelconcierge, kontextmaschine, raggedjackscarlet, TLP, and to some degree balioc (though quality varies widely). Balioc said it could be called “working class intellectual.”

It’s writing that’s completely secure, in that the author has no need to convince you: you’re here to listen or not, and that’s not their problem. It’s cynical about almost every political movement, and looks to internal sources of happiness rather than external ones. And it’s built on a foundation of a lot of non-topical knowledge, which makes for entertaining stories and feels less caught up in the current culture wars.

Man, thinking of myself as a “working class intellectual” feels all kinds of wrong, not least ‘cause in cultivating this highhanded writing voice I always thought of it as aristocratic

Though university graduate/Wikipedia-dwelling essayist Karl Marx was pretty much this exact type, and if that guy doesn’t count…

The concept of the “organic intellectual” might come closer, distinguishing self-selected types analyzing things as they are experienced (particularly from their structural position in society) from the credentialed intellectuals committed to The Discourse as an interpretive lens and peer group

Honestly though, “accumulating a broad store of non-topical knowledge from here, there, abroad and ago, so that you can bring it to bear on society today” really was the point of the liberal arts tradition and the public intellectual, and if it seems so alien today maybe there was something to all the fears of their decline

I agree with the last paragraph here.

This is relatively boring as levels of analysis go, but I think the common thread here is a certain response to the nature of the blog format: short-form, but with a self-selecting audience.  You don’t really have enough time to present a detailed analysis of a suite of examples (the way many academic books do) – if you did, you wouldn’t put that piece of writing on a blog.  So your examples have to be quick and to-the-point, like in an op-ed.

But unlike an op-ed writer, you aren’t constrained by the boss’ demand that you be comfortably readable to a wide audience.  So how do you give your examples more idea-motivating force when you can’t go into depth?  You make them obscure (so the reader says “wow, I didn’t know that and it startles me, so maybe there’s something to this idea”), or lay them on very thick so the reader has the feeling that numerous disparate cultural stars are being revealed as part of one constellation.

Hotel Concierge and TLP are frustrating to me because they lay on the references very, very thick even when the underlying concept is often simple and/or banal.  The implicit argument is something like “dude, if you were aware enough to juggle all these worldly things in your head at once, you would agree with my interpretation.”  So you try to juggle them, and it’s a little dizzying, but there is the nagging feeling that the interpretation came first and the references are textual adornments, intended to have just this dazzling effect.

There is a legitimate, nontrivial way to use seemingly disparate examples to briefly illustrate a concept – namely, “this concept makes sense of all of these things, which suggests that it’s widely applicable.”  SSC at its best does this.  But SSC also sometimes tries to generate argumentative force by mere piling on of examples – like when a large number of hyperlinks in a short space is used to argue “this happens all the time.”  (They may really just reflect one of those mini-manias that are always afflicting writers on deadlines – if there is one thinkpiece saying a thing, you can bet there are ten, or will soon be.)

I’m sure there are historical examples of similar forms?  (“Pamphlets” in 18th/19th C?)  Marx again seems relevant – there is something familiarly “internet” in the way he would figure out his beliefs by yelling at people he disagreed with (cf.)

(via kontextmaschine)

Machine Intelligence Research Institute — General Support (2017) →

After their more careful review process in 2016, with 7 external reviewers and 4 papers under review, OpenPhil has just decided to give MIRI 2.5 times as much money, guaranteed for 3 years, because one ~super-special~ (unnamed) expert liked the Logical Induction paper.

On the one hand, aaaargghhhhh

On the other hand, I’m curious about this unnamed reviewer, who is apparently  “generally regarded as outstanding by the ML community” – I can’t imagine they didn’t notice the same flaws in the LI paper that I did, so maybe they came up with some next-level glowing-brain take on why it’s actually good?

Everyone in Dallas is scared out of their minds over this whole ‘God of Death’ thing.

there is more than one kind of adversarial example

dedicating-ruckus:

nostalgebraist:

@dedicating-ruckus

Wanted to reply earlier, but God help me if I can navigate this site…

Looking more closely, you’re right that the reference turtles and the adversarial turtles are perceptibly different. But the features where they’re different still don’t have any obvious relation to the adversarial categories; there’s nothing even vaguely suggestive of “rifle” or “jigsaw puzzle” in the adversarial turtles. That argument is most colorable for the cat/computer image, and even there, if not primed with “computer” it certainly wouldn’t come to mind just when looking at the image. “Weird distortions on a cat”, yes.

(Another good example is the street-signs paper, https://arxiv.org/pdf/1707.08945.pdf. These are also real-world robust, and achieve adversariality with perturbations that are obviously perceptible but not at all related to the adversarial category.)

It’s likely true that robust adversariality requirements place a hard limit on how subtle the perturbations can be. But given the off-the-wall nature of the actual perturbations, I maintain that adversarial examples constitute evidence that the image-classifier NNs internally function in a manner entirely alien to human visual processing.

Yeah, the tumblr interface is a constant source of frustration.

I agree that the distortions often don’t have any obvious relation to the target category, although sometimes they do (the espresso-baseball had some espresso-like bubbly foam … although, then again, the reference baseball texture looked kinda like that too).

I don’t agree with your conclusion, but at this point this is partly (largely?) a matter of intuition and judgment calls.  One thing I count as evidence in the opposite direction is Inceptionism-type image synthesis where you ask the network to generate an image that maximizes some class probability.  I linked earlier to Audun M. Øygard’s post about this, and he has put up a convenient album with synthetic images for every one of the 1000 ImageNet classes.

If you look at the album, there are recognizable shapes in every one of them, and most of the time they’re recognizable as (parts of) the thing they’re supposed to be.  I am assuming he didn’t sit around tweaking each of the 1000 images, so this is “typical” output – if you ask GoogleNet to produce something of a given class, it will produce something we recognize.
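To make the synthesis procedure concrete, here is a minimal sketch of class-maximization by gradient ascent, using a toy linear “network” of my own invention rather than GoogleNet (the real thing also needs regularizers and smoothness tricks, which Øygard’s post discusses, but the core loop is just this):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained classifier: class scores are a linear map of the
# (flattened) image.  A real GoogleNet is far more complex, but the synthesis
# loop is the same idea: gradient ascent on the input image.
n_pixels, n_classes = 64, 10
W = rng.normal(size=(n_classes, n_pixels))

def logits(x):
    return W @ x

target = 3                                 # class to maximize
x = rng.normal(scale=0.01, size=n_pixels)  # start from a near-blank "image"

before = logits(x)[target]
for _ in range(100):
    # For this linear model, d(logit_target)/dx is just the target's weight row.
    x = x + 0.1 * W[target]
after = logits(x)[target]
```

With a deep net you’d compute the input gradient by backprop instead of reading off a weight row, but the “ask the network what it wants to see” loop is the same.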

I think this puts a certain kind of upper bound on how alien the NN features could be.  I don’t think that is inconsistent with the adversarial weirdness – rather, I think these nets really are identifying structured, relevant features, they just don’t have enough of them to reliably make the many (~1 million for 1000 classes) pairwise distinctions we want them to make.

So, you ask them “give me the most panda-y panda you can imagine,” they happily draw a panda.  They’ve got an idea what something in the center of that cluster looks like.  But they do have trouble policing the boundaries between the panda cluster and the 999 other ImageNet clusters.  In other words, they have the cluster centered in about the right place in image-space, but they don’t have its boundaries shaped correctly – two different issues.

This makes sense if you remember that, after processing an image to extract fancy high-level structured features, these nets are still just doing logistic regression (AKA “linear layer + softmax”) on those features to get categories.  They’ve got these beautiful high-dimensional vectors that can encode complex shapes and textures, and then they’re just drawing linear decision boundaries in that space to try to separate each thing from all the others.  These boundaries are going to be pretty crude/bad in some cases, but that has nothing to do with the (un)naturalness of the feature space.
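A minimal sketch of that last step, with a made-up feature vector standing in for the penultimate layer (all shapes and numbers here are arbitrary, not from any real net):

```python
import numpy as np

rng = np.random.default_rng(1)

# Pretend `features` is the penultimate-layer activation for one image:
# everything after this point is plain multinomial logistic regression.
n_features, n_classes = 128, 5
features = rng.normal(size=n_features)
W = rng.normal(size=(n_classes, n_features))
b = rng.normal(size=n_classes)

def softmax(z):
    z = z - z.max()              # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

probs = softmax(W @ features + b)

# The boundary between classes i and j is the hyperplane where their logits
# are equal: (W[i] - W[j]) @ f + (b[i] - b[j]) == 0.  Linear, no matter how
# fancy the features are.
```

However elaborate the feature extractor, the decision boundaries in feature space are those flat hyperplanes.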

(Obligatory link to Christopher Olah’s amazing post where, among other things, he suggests doing KNN on the last layer)
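(For concreteness, a nearest-neighbor classifier over last-layer features is only a few lines – here with randomly generated stand-in features and labels, since the point is just the mechanics:)

```python
import numpy as np

rng = np.random.default_rng(2)

# Made-up "last layer" feature vectors for a small reference set, with labels.
# In the real version, these would come from running labeled images through
# the net and keeping the penultimate-layer activations.
n_ref, n_features = 50, 16
ref_feats = rng.normal(size=(n_ref, n_features))
ref_labels = rng.integers(0, 3, size=n_ref)

def knn_predict(query, k=5):
    """Classify by majority vote among the k nearest reference features."""
    dists = np.linalg.norm(ref_feats - query, axis=1)
    nearest = ref_labels[np.argsort(dists)[:k]]
    return np.bincount(nearest).argmax()

# A query very close (in feature space) to a known example gets its label.
query = ref_feats[0] + rng.normal(scale=0.01, size=n_features)
pred = knn_predict(query, k=1)
```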

Looking at those generated images, I think I agree that human-visible features are in the neural net’s criteria somewhere. I’m still pretty sure that the method they’re using to detect them is very different from the way humans work.

The entities involved in human visual processing map reasonably closely to entities that exist in reality. It works with an understanding of 3D kinematics, detects edges and uses them to impute 3D objects, and so on. Meanwhile, it’s pretty clear that the primitives the NNs are acting on relate entirely to images of things and have no connection to the actual things. Those generated images, for instance, don’t resolve to any kind of sensible 3D object, but tile images of characteristic features wherever they kind of fit.

Some people have compared adversarial examples to optical illusions in human visual processing; but the two classes of image are very different, and in ways that reveal the differences between the systems. Adversarial examples fool the statistical correlations that an image classifier builds up, while optical illusions fool heuristics in the brain that assume an image is a view of a real 3D object.

I’d also dispute that characterization of the net’s partition of image-space. It’s obviously not partitioning the space anything like a human would. A stop sign with four or five black and white rectangles on it is (to a human) a somewhat noncentral stop sign, but still a stop sign, and still definitely not a speed limit sign. Meanwhile, the “most central” examples of the categories according to the NNs look nothing like actual images of the category; they’re pieces of static with some characteristic features randomly plastered on top.

(I’m now wondering if you could make up something that doesn’t look like an object at all, to a human, but still pings the network as central in a category. A piece of roadside debris that scans as a stop sign?)

I said I would take a break from deep learning posts, but I did want to reblog to say I agree with all of this.

(via dedicating-ruckus-blog)

When I start typing one of these deep learning posts it always ends up taking longer to write than I thought, and this is becoming a time sink.  So: no deep learning posts for the next week.

(No offense meant to anyone who engaged with my posts, I just get easily fixated on things.)


femmenietzsche asked: I know this doesn't get at the interesting underlying issues, but as a practical matter couldn't you just look at each image with 2 or 3 somewhat different neural nets to get around adversarial examples? Surely it would be extremely difficult for a given image to trick multiple programs in the same way.

This is an important question, and surprisingly, it turns out not to be difficult at all – adversarial examples often transfer across models!

That first paper I linked, “Explaining and Harnessing Adversarial Examples,” talks about it:

An intriguing aspect of adversarial examples is that an example generated for one model is often misclassified by other models, even when they have different architectures or were trained on disjoint training sets. Moreover, when these different models misclassify an adversarial example, they often agree with each other on its class. Explanations based on extreme non-linearity and overfitting cannot readily account for this behavior—why should multiple extremely non-linear models with excess capacity consistently label out-of-distribution points in the same way? This behavior is especially surprising from the view of the hypothesis that adversarial examples finely tile space like the rational numbers among the reals, because in this view adversarial examples are common but occur only at very precise locations.

Under the linear view, adversarial examples occur in broad subspaces. The direction η need only have positive dot product with the gradient of the cost function, and ε need only be large enough. Fig. 4 demonstrates this phenomenon. By tracing out different values of ε we see that adversarial examples occur in contiguous regions of the 1-D subspace defined by the fast gradient sign method, not in fine pockets. This explains why adversarial examples are abundant and why an example misclassified by one classifier has a fairly high prior probability of being misclassified by another classifier.

To explain why multiple classifiers assign the same class to adversarial examples, we hypothesize that neural networks trained with current methodologies all resemble the linear classifier learned on the same training set. This reference classifier is able to learn approximately the same classification weights when trained on different subsets of the training set, simply because machine learning algorithms are able to generalize. The stability of the underlying classification weights in turn results in the stability of adversarial examples.
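The “different models learn similar weights, so adversarial directions transfer” story can be demonstrated in miniature. Here is a toy sketch of my own (not from the paper): two linear models fit by closed-form ridge regression on disjoint samples of the same task, with a fast-gradient-sign perturbation crafted against the first model also pushing the second model’s score the same way.

```python
import numpy as np

rng = np.random.default_rng(3)

# Binary task: label = sign(w_true @ x).  Train two linear models on
# *disjoint* samples, then craft a fast-gradient-sign perturbation against
# the first model and check that it moves the second model's score too.
d, n = 200, 1000
w_true = rng.normal(size=d)

def fit_ridge(X, y):
    # Closed-form ridge regression as a stand-in for "training a classifier."
    return np.linalg.solve(X.T @ X + 1e-3 * np.eye(d), X.T @ y)

X = rng.normal(size=(2 * n, d))
y = np.sign(X @ w_true)
w1 = fit_ridge(X[:n], y[:n])    # model 1: first half of the data
w2 = fit_ridge(X[n:], y[n:])    # model 2: second half, disjoint from the first

x = rng.normal(size=d)          # a fresh input
eps = 0.05
x_adv = x - eps * np.sign(w1)   # attack crafted against model 1 only

drop1 = w1 @ x - w1 @ x_adv     # exactly eps * ||w1||_1: big in high dimension
drop2 = w2 @ x - w2 @ x_adv     # transfers, because w2 points roughly like w1
```

Both models recover roughly the same direction from their separate data, so the attack direction chosen against one is adversarial for the other too.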

there is more than one kind of adversarial example

I said this in an earlier reply, but the issue keeps coming up, so I should make a dedicated post about it.

First: an “adversarial example” is an example designed to make a given, already-trained classifier look maximally foolish.  Adversarial examples are constructed (roughly) by taking an existing example that the classifier classifies correctly, and making the smallest possible change that will lead the classifier to misclassify it as something else (with high probability).

Second: not all adversarial examples are created with the same goals in mind.  So there are different kinds of adversarial example, and they may not all reflect the same “problem” with the classifiers they target.

The first adversarial examples I learned about were the kind exhibited in this image:

[image: the panda/gibbon figure from “Explaining and Harnessing Adversarial Examples” – a panda, plus an imperceptible noise pattern, classified as a gibbon]

Here, we can take an image of a panda, add a tiny perturbation that is imperceptible to the human eye, and get the network to misclassify it as a gibbon.  This freaked everyone out, including me.

Now, does this type of adversarial example show that neural nets are deeply alien, that they’re doing something totally unlike human vision?  Well, no.  This picture comes from the paper “Explaining and Harnessing Adversarial Examples,” which provides a neat explanation of this phenomenon via a scaling argument for dot products in high dimensional spaces.  This argument applies far beyond neural nets – in particular, to simple linear classifiers as well.  (See Fig. 2 in that paper, where they do the exact same thing to a logistic regression classifier.)

Informally, the argument is basically the following: “there are a huge number of input channels (i.e. pixels in the image).  If we write down a linearization of the classifier around a given input, this will tell us how much the output (i.e. class probabilities) varies with small changes to each input channel (pixel).  The impact of changing any one pixel will be small, but since there are so many pixels, we can get a large impact on the output by making all of the pixels push the output in the same direction.”

As the authors put it, the problem is that the classifier is too linear.  (Note how we assumed that the linearization was a good approximation.)  The problem isn’t that the classifier learns some bizarre nonlinear mapping which sends the tiny perturbation in image space to a giant jump in feature/embedding space.  The problem is sort of the opposite: it’s not nonlinear enough to be able to respond to a perturbation in the exact direction of the gradient with “wait, no, back off.”
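Here’s the scaling argument in miniature, as a toy numerical sketch (the dimensions and values are my own choices, not the paper’s): each pixel moves by only ε, but with ~150,000 pixels all pushed in the direction of the gradient, the logit moves enormously.

```python
import numpy as np

rng = np.random.default_rng(4)

# The scaling argument in miniature: one "logit" that is a linear function of
# n pixels.  Each pixel moves by at most eps, but all n moves can be chosen
# to push the logit the same way, so the total effect is eps * ||w||_1 ~ n*eps.
n_pixels = 150_000                     # roughly a 224x224x3 image
w = rng.normal(size=n_pixels)          # linearized sensitivity to each pixel
x = rng.normal(size=n_pixels)          # the original image

eps = 0.007                            # imperceptible per-pixel change
x_adv = x + eps * np.sign(w)           # fast gradient sign perturbation

max_pixel_change = np.abs(x_adv - x).max()   # == eps, tiny
logit_change = w @ x_adv - w @ x             # == eps * ||w||_1, enormous
```

No individual pixel changed perceptibly, yet the logit jumped by hundreds of units – and nothing about this required a neural net.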


Image classifiers see 2D images, but usually they’re photos of 3D scenes.  So naturally, people wondered whether we could perturb the actual object rather than the photo.  Since an object can be photographed from various directions and distances, and under different lighting conditions, the goal here is to make an object that is maximally adversarial across these different conditions.

“Synthesizing Robust Adversarial Examples,” the paper with the turtle, does this by taking an expected value over a probability distribution of transformations.  They use this to synthesize 2D examples, with a transformation distribution supported over “rescaling, rotation, lightening or darkening by an additive factor, adding Gaussian noise, and any in-bounds translation of the image.”  And then they do it for 3D examples, with a transformation distribution supported over all these types of transformations:

[image: the paper’s list of 3D transformations used in the distribution]
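The expectation-over-transformations objective can be sketched in miniature. Assume a toy linear “classifier” and circular shifts as the only transformations (both simplifications of mine, not the paper’s setup): we perturb the object so as to raise the target score averaged over all transformed views, rather than for any single fixed view.

```python
import numpy as np

rng = np.random.default_rng(5)

# Expectation-over-transformations in miniature.  The "classifier" is linear,
# the "transformations" are circular shifts of a 1-D signal, and the attack
# optimizes the target score *averaged over* the whole shift distribution.
n = 256
w_target = rng.normal(size=n)           # target class's weight vector
x = rng.normal(size=n)                  # the clean "object"
shifts = range(-5, 6)                   # the transformation distribution

def target_score(img):
    return w_target @ img

# Gradient of the expected score w.r.t. the object: average the gradient of
# each transformed view, mapped back through the (linear) transformation.
# (Identity used: w @ roll(x, s) == roll(w, -s) @ x.)
grad = np.mean([np.roll(w_target, -s) for s in shifts], axis=0)

eps = 0.1
x_adv = x + eps * np.sign(grad)

clean_avg = np.mean([target_score(np.roll(x, s)) for s in shifts])
adv_avg = np.mean([target_score(np.roll(x_adv, s)) for s in shifts])
```

The perturbation isn’t tailored to any one view; it’s the best compromise across all of them, which is exactly why it has to be cruder than a single-image attack.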

Note that the earlier, linear argument does not apply, at least in direct unmodified form, to this sort of thing.  We are taking an expectation over a range of images (generated by these transformations), many of which will be quite far apart in image space.  Here is that turtle picture again:

[image: the 3D-printed adversarial turtle photographed from three different angles]

All three of these images (and many others) were classified as “rifle” by the network, as intended by the authors.  As images, these three are not close together at all: the pixel in any particular (X, Y) location will typically have quite different values for all three.  And we don’t get to control the individual pixels of any one image; we only get to control the texture of the turtle, which maps to image pixels in a way that is nonlinear and dependent on the transformation.

If one were to linearize (as before) around any of these images, the gradient (in image space) would presumably be quite different.  Perhaps the adversarial turtle is still exploiting linearity, but if so, it is because the texture perturbation somehow manages to push the pixels in the gradient direction in each image, even though that direction is different for each image.  It is not obvious from the earlier argument that this is possible, so even if the turtle is doing this, we have learned something new here that we did not know from the earlier argument.


Moreover, I suspect that the turtle is not just exploiting linearity.  This is because the texture is not a tiny perturbation of some original, “correctly classified as turtle” texture.  It is perceptibly different.

I think I didn’t make this clear enough in the original post.  When I said the turtle looked “weird,” I really meant that the shell pattern was perceptibly different from the patterns on the correctly classified reference turtles.  Probably the clearest example of this in the paper is the side-by-side comparison in Fig. 3, between a correctly classified turtle and an adversarial turtle classified as “jigsaw puzzle” (here the images are rendered in software, rather than 3D printed).

To make sure you can see the difference on the tumblr dash, I’ve cropped one column of the figure and rotated it sideways:

[image: one column of Fig. 3, cropped and rotated – a correctly classified turtle beside an adversarial “jigsaw puzzle” turtle]

The left is correctly classified, the right is misclassified as “jigsaw puzzle.”  Note how the difference is perceptible and structured.  Admittedly, the original shell shapes are still visible (which militates against my “shell shape” interpretation from earlier).  But it looks to a human like a different texture.  Likewise, look at the texture on the turtle in the 3D printed image above, and how different the shell shape is from the reference turtle.  We are a long way from the panda/gibbon here.


There seems to be a spectrum here: the more transformations you include, the more perceptible your perturbation has to be.  Between the panda/gibbon (imperceptible) and the turtle/jigsaw (very perceptible), there are some intermediate cases.  See the examples here, which only involve 2D transformations.

Here’s an adversarial example of a cat that is not robust to zooming – if you zoom in by as little as 2%, the network sees it as “tabby cat” and not as the adversarial “desktop computer”:

[image: adversarial cat classified as “desktop computer,” not robust to zooming]

Here’s a version that’s robust to zooming.  I sure can’t tell the difference:

[image: zoom-robust version of the adversarial cat]

But what if we want it to be more robust – to “rotations, translations, scales, noise, and mean shifts”?  We get this:

[image: adversarial cat robust to rotations, translations, scales, noise, and mean shifts]

Which is starting to have some perceptible, structured, and indeed “desktop computer”-y differences (check out that right angle).

And that’s just 2D transformations.  If you wanted to make a 3D model of a cat which would be misclassified as “desktop computer” from all angles and distances and under all lighting conditions?  I bet it’d be a pretty weird cat.