evolution-is-just-a-theorem:

nostalgebraist:

That said, I think we need a healthy skepticism towards image classification results, too.  State-of-the-art architectures like Inception seem to work by learning features that reliably discriminate each object from all others, but fall short of defining that object.

They are good at, say, recognizing a dog no matter what posture the dog is in – but they do this not through an understanding of body kinesiology (“a dog’s body can deform in this way and still be a dog, but not that one”), they do it by identifying the least deformable part of a dog and then looking just for that part.  This is why Deep Dream images were so full of dog heads, specifically.  (And of eyes in general.)

The characterizations they learn are like the “featherless biped” definition of a human – actually very good at discriminating humans from other things in almost every circumstance, yet clearly not capturing the fundamental concept, which means they produce false positives on things that are not even close to human (like Diogenes’ plucked chicken).

Auto-generating “plucked chicken” examples has become its own pursuit in neural net research, under the name of “adversarial examples.”  A recent paper (many thanks to whichever tumblr user put this on my dash) shows that you can get your robo-Diogenes to produce 3D objects that are basically always misclassified by some network, even when seen from different angles and under different lighting.

For example, they made this weird turtle, which the network reliably thinks is a rifle (seriously), in all three of these pics and many others:

image

It’s easy to come up with a plausible-sounding story for why this worked.  This turtle has a pattern on its back, but it’s not the very distinctive pattern we’re used to seeing on turtles.  And so of course the network learned to recognize turtles by looking for that pattern, because it’s a great way to reliably tell them apart from other roundish things with four feet and a head.  But this is a “featherless biped” type of definition, and can be defeated by a turtle that doesn’t have the pattern.
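If you want to play robo-Diogenes yourself, the core move is easy to sketch. Below is a minimal numpy toy of the one-step “fast gradient sign” attack on a made-up random linear classifier – this is not the 3D-object method from the paper above (which additionally optimizes a texture over random poses and lighting), just the basic trick of nudging an input along the loss gradient. All the model details here are stand-ins for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def fgsm(W, x, true_label, eps):
    """One-step fast-gradient-sign attack on a linear softmax model."""
    p = softmax(W @ x)
    onehot = np.zeros(len(p))
    onehot[true_label] = 1.0
    # gradient of cross-entropy loss w.r.t. x, for logits = W @ x
    grad_x = W.T @ (p - onehot)
    # move every input coordinate by eps in the loss-increasing direction
    return x + eps * np.sign(grad_x)

W = rng.normal(size=(10, 64))      # toy "network": 10 classes, 64 features
x = rng.normal(size=64)            # a random "clean image"
label = int(np.argmax(W @ x))      # take the model's own answer as ground truth

x_adv = fgsm(W, x, label, eps=0.5)
# for this convex toy model the attack provably lowers the model's
# confidence in the original label
```

Because the toy loss is convex in x, the one-step attack is guaranteed to reduce the model’s confidence in the clean-input label; whether it fully flips the predicted class depends on eps.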

I think your attempt to figure out why this particular turtle is classified as a rifle is a mistake. Interpretability is hard, and it is very rare for the features a net learns to map cleanly to the features a human uses.

Is this true?  When you do gradient ascent to produce an image that is given high probability for some class, you get forms that are quite recognizable to a human.  There were some pictures of this in the first part of Google’s “Inceptionism” post, but my favorite examples are from Audun M. Øygard’s implementation of the same idea:

image

“Screws”

image

“Pug”

There are many others behind the link; I wanted to reproduce more here, but because this is a “text post,” tumblr refuses to put multiple images in a single row, and I don’t want the post to be too long.

In these examples, I note two things: (1) I can actually see the object, or at least what looks like part of it, and (2) it tends to spam copies of the most distinctive part of the object rather than the whole thing, which fits with my contention in the OP.

(Although one of the examples is “Loggerhead turtle” and it has a bunch of flippers and not very distinct shell patterns!  But there may be a difference between “Loggerhead turtle” and just “turtle” as classes, or it may reflect a difference in versions of the Inception architecture?  Would be possible in principle for me to test this myself, I guess)
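For anyone who wants to try the visualization trick above: the recipe is just gradient ascent on the input. Here is a self-contained numpy sketch with a random linear model standing in for a trained network – a real run backprops through a convnet and adds image priors (e.g. blurring, as in Øygard’s post), so everything concrete below is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(10, 64))       # stand-in "network": 10 class logits

def visualize_class(c, steps=100, lr=0.1):
    """Ascend the logit for class c, starting from near-zero noise."""
    x = rng.normal(scale=0.01, size=64)
    for _ in range(steps):
        grad = W[c]                  # d(logit_c)/dx; a real net needs backprop
        x = np.clip(x + lr * grad, -3.0, 3.0)   # keep values in a sane range
    return x

x_vis = visualize_class(3)           # an "image" the model reads as class 3
```

The result is an input the model scores very highly for the chosen class, which in a real convnet is exactly where the “spammed copies of the most distinctive part” effect shows up.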

(via just-evo-now)


lilacsinthedooryard:

Ida Rentoul Outhwaite (Australia, 1888-1960)

The Storm

(via luminous-void)

picdescbot:

a large passenger jet flying through the air on a cloudy day

@picdescbot | about this bot | picture source

all text in this post is 100% computer-generated, including tags

This is a bot that does those descriptions I was talking about, BTW

collapsedsquid:

nostalgebraist:

So … why aren’t deep neural nets better at language?  They’ve done well at object recognition, a sort-of-analogous process in the visual system, where you synthesize lower-level features into higher- and higher-level ones until you end up recognizing very complex and nontrivial patterns.  And they’ve done well at all kinds of tough things recently (like Go).

It’s tempting here to speculate about ways that language comprehension/production is deeply different and intrinsically human – say, that it requires our entire model of the world.  Certainly this is required for a true semantic understanding, and it is possible to find examples where it is required for syntax too (Winograd schemas).  But for the typical case of purely syntactic reasoning, how important is it, really?

Yet neural machine translation systems still produce ungrammatical sentences much of the time.  (A lot of Google Translate language pairs are done neurally now, and they are still this way.)

And I would have, a priori, made the same objection to some of the higher-level visual concepts that deep nets can apparently learn.  Stuff like “this is a picture of a herd of elephants walking across a dry grass field” – surely that would require some knowledge of body physiology, ability to do shape-from-shading to recognize elephants and fields under arbitrary lighting conditions, etc.?  But visual neural nets can do pretty well at generating these complex descriptions (itself a minor linguistic feat) – see paper here, demo here (their live demo doesn’t seem to be working).

Again, it’s tempting to say that this is a deep problem, machines have always had trouble with language, etc.  But let’s put on neural-net-enthusiast hats for a minute.  What if we really are just missing some basic trick of network architecture here?  Are there facts about the architecture of the human auditory cortex that might be suggestive?

I think your post with images suggested that the problem with language is that the possibility space is much larger, because sentences are a combinatorial problem.  That means that while you can “cheat” on images, you can’t do it for language.

Yeah, I’m thinking so.

The second post about images arose from me trying to think (in writing the first post) of complex visual features that nets could reliably distinguish (to contrast with language failures), and then thinking “wait, are they really so good at these things after all?” when I dug into some of the examples I had in mind.
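The combinatorics point is easy to make concrete with back-of-envelope arithmetic – the vocabulary size and sentence length below are illustrative assumptions, not measurements:

```python
# Number of distinct word sequences of length n over a vocabulary of
# size V is V ** n, so the sentence space explodes far faster than the
# set of "typical" images a classifier actually has to separate.
vocab_size = 50_000        # rough order of magnitude for English
sentence_len = 10

sequences = vocab_size ** sentence_len
print(f"{sequences:.3e} possible {sentence_len}-word strings")
# prints "9.766e+46 possible 10-word strings"
```

Even with these conservative numbers, the space of short sentences dwarfs anything a net could hope to memorize distinctive “parts” of, the way it memorizes dog heads.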

@redantsunderneath

You need to see Skyfall – your conflict with the idea of the character is pretty close to the character’s inner conflict in that film.  It is pretty much what the thing is about.  The point of Skyfall is that there clearly is a place for Bond, but he needs to find it.  The conclusion is that he is not irrelevant at all, but he would be if he didn’t adapt, which takes some soul-searching because he has resisted change for so long.  He is unique and valuable, and part of what makes him that is that he is a throwback, but he needs to decide how those things function in the new world to see where his value lies.  The self-destructiveness (which kind of sucks and is the worst part of the earlier Craig movies) is overcome, as it is a byproduct of trying to be relevant when you are so rooted in the past.  In order to keep his identity intact (this goes for the Bond character, the idea of a Bond movie, the franchise, and the story consumer) he can’t stand still while the world changes.  Letting him become an anachronism doesn’t make the character stronger.  I think the movie is about getting past this.

I saw my first Bond movie in the mid-’70s (Diamonds Are Forever on TV) and loved it.  I’ve seen all of them, the majority multiple times.  But I realized at some point that I could never remember the plot a day later, as in I couldn’t tell someone what happened in the movie other than “they have a car chase in Italy, then they fight on a rooftop.”  The plot I could remember always had to do with really broad points based on locations or setpieces.  Goldfinger had something to do with Goldfinger wanting to break into Fort Knox; I don’t know why Bond went to Miami, just that he slept with a girl who he found painted gold.  I guess I chalked this up to “feature, not bug,” but I do remember overvaluing The Living Daylights because the plot stuck with me for a few days.

I think I’ve always secretly wanted a Bond movie that had structure, a complete story, an arc, internal coherence… however you want to say it.  It’s just like maybe some people would want McDonalds to have better bread or higher quality chicken nuggets without feeling that that would wreck everything about the experience.  And Skyfall brought that.  

I think the earlier two Craig movies were about reclaiming lost aspects of Bond and trying to make them fit on the fly, working the character “hot,” i.e. adding parts and swapping stuff while the system was running, and it was sometimes awkward.  Skyfall was a movie that dealt with getting all of that stuff to function optimally, which required “rebooting” (in my metaphor’s terms).

The less said about Spectre the better.

Reblogging to say I watched Skyfall on your advice, and I loved it.  Such a fun movie, especially the way that so many things relate back to the main theme, to the point that many individual interactions and even visual design elements (the retro car) can be read as sparring between the primordial cosmic forces of Bond-is-worthless-now and Bond-is-still-the-man.

I would say I now know what it’s like to enjoy a James Bond movie, except so much of it was about calling the character into question in ways that are actively antithetical to the character/franchise – if we cheer on the Bond-is-still-the-man force we are cheering, in part, for this questioning to stop.

Also, god, it was fucking weird how much the villain in that movie resembled my last therapist (who I liked).  Similar face, similar facial expressions and body language, superficially similar accent (though from a totally different origin), even a similar way of talking about things (modulo the supervillainry), a similar “tough love from a position of calm confidence” attitude.  (Of course it helped that part of the villain’s gimmick was acting as a sort of therapist.)

(via redantsunderneath)

The other thing they made in that paper: a baseball that the network thinks is an “espresso”

The Craft & The Community - A Post-Mortem & Resurrection →

bendini1:

oligopsoneia:

nostalgebraist:

This post is gigantic (I admit I only skimmed it), contains wild and massive claims made with little to no firsthand experience, could stand to have the personal elements more clearly delineated from the ostensibly objective-descriptive elements, and I don’t know if even half of it would withstand any kind of close inspection –

– but, once again, I’m very pleasantly surprised that the rationalist community is talking openly about why the “go out and change the world” thing did not happen, rather than sweeping it under the rug

(Now going beyond mere talk, there’s the part I’m not yet optimistic about)

There’s an assumption buried right near the beginning that Taking Ideas Seriously means “the Rationality Community” should be a center of agency, rather than a social scene that people hang out in sometimes because it’s fun (while fulfilling positive moral obligations and the like elsewhere.) This is just flabbergasting to me. 

Now kids, which one of these two tumblr users actually read the sequences?

I suspect @oligopsoneia has read more of the sequences than I have, tbh