
nostalgebraist:

A while ago, I recommended multi-backend Keras to someone asking which neural net framework to use.

I want to rescind that – my attitude at the time was “Keras kind of sucks but it’s not the worst and I have the most experience with it,” and now my attitude has moved to “Keras really sucks, Keras is BAD, use pytorch or if you have to use tensorflow just use raw ops”

I may elaborate later… this is just to “clear my conscience” :P

I reblogged this earlier with a bunch of words elaborating the claim, but then I removed it after a few hours … I guess I’m just feeling weird about becoming this guy who has a blog where he does ~Epic Software Rants~, and even as those go it was kind of unfocused and weird.

The short version:

Keras objects usually do pretty trivial things, like simple for-loops around tensorflow code.  Often, even this is buggy or feels incomplete, and it becomes obvious that writing your own version will be easier than trying to work around theirs.

The objects are hard to serialize, or have been historically anyway.  Compare the vast and complex Keras serialization doc to the tiny pytorch one.  The python parts of Keras don’t like to be pickled, and define their own serialization protocol with worse UX (I never want to see the phrase “custom objects” again).
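To make the pickling point concrete, here is a toy sketch (pure Python, not Keras itself) of the underlying issue: a plain state dict of parameters pickles trivially, while an object carrying arbitrary Python callables does not, which is roughly why a "custom objects" registry ends up existing:

```python
import pickle

# A plain "state dict" of parameters -- the pytorch-style unit of
# serialization -- pickles without ceremony.
state = {"w": [0.1, 0.2], "b": [0.0]}
assert pickle.loads(pickle.dumps(state)) == state

# An object carrying an arbitrary callable (here, a lambda standing in
# for a custom activation or layer) cannot be pickled at all.
class LayerLike:
    def __init__(self):
        self.activation = lambda x: max(x, 0.0)

try:
    pickle.dumps(LayerLike())
    picklable = True
except (pickle.PicklingError, AttributeError, TypeError):
    picklable = False

assert not picklable  # hence custom serialization protocols and registries
```

The pytorch convention sidesteps this by serializing only the parameter dict and asking you to re-create the Python object from code yourself.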

The Keras project was originally trying to define an abstraction layer not tied to tensorflow, and now it’s tied to tensorflow but wants to be independent of python (because tensorflow aspires to be).  You lose the clarity and language-independence of tensorflow graphs, and no longer gain the portability across ML backends that Keras used to offer.

A tensorflow graph is a clearly scoped and defined concept, so if you know something is a tensorflow graph, that gives you various assurances.  Keras objects are usually glorified tensorflow (sub)graphs, yet they have arbitrarily shaped python utilities attached to them like malware, making it hard to reason about their exact behavior and contents.

Ultimately, writing down a neural net is just not that hard.  GPT-2’s architecture was specified as raw tensorflow ops and it is wonderfully straightforward, crisp, and readable.  Neural net code presents other challenges, mostly related to compute graphs, and Keras makes this worse by trying to hide what the graph is and how it got made.


nostalgebraist/nostalgebraist-autoresponder →

OK, I put a slightly cleaned-up version of nostalgebraist-autoresponder on github… have fun trying to understand the convoluted horrors I’ve created :P

There are more detailed disclaimers in the README, but just to be clear, this doesn’t have the model/data files you’d need to reproduce my bot or create your own, and I can’t even guarantee it works the way the “live” version does with those files in place.

However, there are some terse hints and scripts and stuff which in principle would let someone reproduce the models or create analogous ones for a different bot, if that person feels comfortable getting into the awkward weeds of GPT-2 fine-tuning and its lack of mature tooling.

@the-real-numbers since you asked to be notified

@eightyonekilograms replied to your post “I am going to demonstrate a tumblr bug. This is the original post. I…”

Ah I think I ran into this inadvertently once. What’s the repro?

It works when

  • the post you are reblogging is “susceptible”
  • the first paragraph in your reblog is block-quoted
  • you do not use the Markdown editor when writing the reblog

I don’t know exactly what posts count as “susceptible,” but my guess is that “susceptible” = “was created as an NPF post.”

NPF is tumblr’s “v2” implementation of how they store posts on their backend, see here and here.

When it was released, you could only create NPF posts using mobile clients.  I’m not sure this is still true, but the trick only worked when I created the OP on mobile and not on web, so if we assume “mobile = NPF” (plausible) and “NPF means susceptible” (plausible), it all adds up.

Why do I think this might be an NPF issue, besides the observed connection to creating on mobile?

In the “legacy” storage format before NPF, posts were stored as HTML, and the structure of reblogs was represented in nested HTML blockquotes.  Additionally, tumblr has not moved everything to NPF – as I implied above, some posts are “created as” legacy and some are “created as” NPF.  And they do some translation between the two formats in various places (exactly which places, I’m still hazy on) as posts are requested, modified, and reblogged across different clients.

The bottom line is, somehow this “trick” is hitting some part of their code that (rightly or wrongly) thinks it’s dealing with the old format where blockquotes encoded structure, and it’s misunderstanding a content blockquote as a structure blockquote.
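The confusion can be caricatured in a toy sketch (this is illustrative, not tumblr's actual code): in the legacy format the same blockquote markup can mean either "one reblog layer" or "quoted content," and nothing in the HTML itself distinguishes the two:

```python
# Toy sketch: in legacy HTML, reblog nesting and content quotation use
# the same tag, so structure-aware code can misread content as structure.

# One reblog layer wrapping the original post:
reblog_trail = "<blockquote><p>original post</p></blockquote><p>my reply</p>"

# A single post whose first paragraph just happens to be a block quote:
quoted_content = "<blockquote><p>some quote</p></blockquote><p>my commentary</p>"

# The two strings are structurally indistinguishable...
assert reblog_trail.count("<blockquote>") == quoted_content.count("<blockquote>")

def naive_trail_depth(html: str) -> int:
    """Pretend-parser: treats every blockquote as one reblog layer."""
    return html.count("<blockquote>")

# ...so the naive parser infers a phantom reblog layer in both cases.
assert naive_trail_depth(quoted_content) == 1  # wrong: content, not structure
```

NPF avoids this by representing reblog structure explicitly (as a trail of content blocks), which is presumably why the bug only bites when something translates back into the legacy representation.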

—-

The only mitigation I know of is to go into the Markdown editor and save the post “as” Markdown.

(I have no idea what this does behind the scenes.  If I try to retrieve the post in NPF through the API it looks the same as before (the NPF looks “correct” in both cases), if I try to retrieve it in legacy/HTML format it’s fixed, and its display in the browser is fixed.)

Unfortunately, you don’t seem to be able to do the Markdown thing through the API, so my bot can’t do this.

eightyonekilograms:

shacklesburst:

sigmaleph:

nostalgebraist:

I imagine some people have been curious to hear more details about how @nostalgebraist-autoresponder works, so here’s a relatively complete post on that.  Very long.


this is quite interesting! and, separately, it’s quite validating that other people find tumblr’s API/pytumblr as frustrating as I do

Yeah, it’s what stopped me from starting the multiple bots I was thinking about implementing one time or another.

I find it highly suspicious that the Chinese characters you randomly chose almost perfectly encapsulate what you’re using them for (“friend” for username delimiting, “region” for post content delimiting, “meet” for ask stuff, “letter” for original post, … okay, simplified “duty” for tag delimiting is a bit of a stretch but it can also mean “post”, as in position, so still).

It sure is something when someone says “I want to build a machine learning model to imitate realistic human speech and then hook it up to Tumblr’s API” and the second part of that sentence is the harder technical challenge.

Hahaha… I mean, it is and it isn’t?  Like, there’s a similar reversal of intuitive difficulty when I do this kind of thing at work, even though we get to design the APIs there.

Doing impressive “machine learning” often amounts to script kiddie stuff – not much more than import StateOfTheArtModel; my_model = StateOfTheArtModel(); my_model.fit(x, y); – but creating a lasting, usable shared interface for anything is fundamentally hard and people spend their whole careers arguing about it.
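The pattern in question, spelled out as a sketch (StateOfTheArtModel is hypothetical, a stand-in for whatever real library you'd actually import; the stub here just memorizes the label mean):

```python
# The whole "impressive ML" workflow: import, construct, fit, predict.
# StateOfTheArtModel is a made-up stand-in, not a real package; this stub
# just predicts the mean of the training labels.

class StateOfTheArtModel:
    def fit(self, x, y):
        self.mean = sum(y) / len(y)
        return self

    def predict(self, x):
        return [self.mean for _ in x]

my_model = StateOfTheArtModel().fit([1, 2, 3], [10.0, 20.0, 30.0])
assert my_model.predict([4]) == [20.0]
```

Three lines of user code, regardless of how sophisticated the model behind the interface is; the interface design, not the math, is what the user actually touches.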

I was going to say “this feels like that freshman year / senior year meme with Luke Skywalker,” but then I realized that “describing what it looks like in my head” is not the only thing one can do with a hypothetical meme image, so here’s this dumb thing I just made:

[image: freshman year / senior year meme]

I imagine some people have been curious to hear more details about how @nostalgebraist-autoresponder works, so here’s a relatively complete post on that.  Very long.

—-

EDIT 5/28/20: I’ve added some things since this post was written, most notably the “mood” feature.  As of this writing, though, it’s still mostly complete.

—-

EDIT 8/23/21: I’ve added and changed a lot of things in the year-and-three-months since the last edit.  This post is still a decent overview of the broad strokes, but I should write a more accurate version sometime.

I try to keep the about page up to date, so you can use that as a reference if you’re trying to figure out whether something in this post is still true.

—-

For even more info, see the #nostalgebraist-autoresponder-meta tag, or send me an ask (although, unlike my bot, I can take a very long time to reply sometimes).


eightyonekilograms:

tototavros:

eightyonekilograms:

Software is not only not a meritocracy, in some areas it’s almost a perfect anti-meritocracy. Programming languages are the classic example: historically there’s almost a perfect anticorrelation between how good a language is and how popular it is. There are good reasons for this, but it makes a mockery of the claim by certain folks that software is a meritocracy and so therefore we should stop trying to X, Y or Z.

historically there’s almost a perfect anticorrelation between how good a language is and how popular it is

Citation very much needed.  Sure, C++ and Java are bad and too many people use them, but can you imagine people using K or Forth or TCL?  Hell, even Prolog, Scheme, OCaml, and Haskell all have pretty serious problems that you should be willing to address before using them.  And while I’d maybe take the latter 3 over C++ and Java for a new codebase – though we have Python, we have Typescript – they’re not ideal for me, but they’re not bad, and they actually have functional ecosystems.

Alright, so, the first thing I want to say is that this tendency has gotten much better in the past decade or so. “The worst languages rise to the top” is mostly a 1970-2005ish phenomenon. 

Really, what I meant by that rant is that “programming languages were more likely to be popular if they were free-as-in-beer, but to languish in obscurity if they were controlled by one company and you had to pay a lot of money for them”. And this is a good thing, in general - open stuff is better. But it did mean that a lot of the languages that became popular were thrown together by relative amateurs making something to suit their own needs, while the ones carefully designed by professionals tended to come out of big corporations who employed those professionals, and those corporations wanted money for them.

(That’s why it’s less of an issue today: now everyone understands that being FOSS is table stakes for a new language, and so even the languages backed by big companies are easy and free to jump in and develop on. Charging money for your compiler is unthinkable today but was an obvious choice back then)

Regarding citations, I think the examples of bad languages rising to the top speak for themselves. For many years, PHP was practically the only language for web programming even though it was an eldritch horror. Perl was huge for a while even though Perl code is more or less unreadable. And of course, there’s JavaScript.

On the other side of the coin, I would cite Ada as the classic example of a good language that was bit by “if it’s not free, nobody cares”. IMO there’s a good chance we would all be writing Ada instead of C++ if it had been free to get a hold of in 1980, and my god the world would be so much better off if that had happened. Also, Ada got done dirty in a smear campaign from both CS academics and West Coast hackers. I won’t get too into this because it’s all ancient history now, but most of the criticisms of Ada were the very same things that everyone praises Rust for today, and I think many of them were really coming from a place of cultural distaste from both of the above crowds for the military (Ada was a DoD project).

And I think it still supports my point about meritocracy: the inferior languages succeeded because of network effects and the up-front frictionlessness of interaction, even if you paid for that many times over later. Is it so hard to believe the equivalent thing happens to humans?

This is an interesting take, which I’m reblogging partly for that reason and partly to express my absolute bogglement at the PHP article linked within.

I’ve never used PHP or read anything about it, and … what the fuck??? this sounds like a parody of bad programming languages, like some INTERCAL-style satirical art project … like someone looked at Javascript, said “this just doesn’t feel enough like the PL equivalent of something a child scribbled while learning to use MS Paint,” and put their mind to creating something even closer to the Platonic ideal of Everything Programmers Hate, At Once.

People actually use this thing? Like, in 2020? (I’m so sorry??)

While I’m on the topic, here are a few things I’d want to see in a hypothetical piece of software that’s trying to be “neural net frameworks done right”:

(cut for more shop talk)


[Attention conservation notice: machine learning framework shop talk / whining that will read like gibberish if you are lucky enough to have never used a thing called “tensorflow”]

I’ve probably spent 24 solid hours this week trying (for “fun,” not work) to get some simple tensorflow 1.x code to run on a cloud TPU in the Google-approved manner

By which I mean, it runs okay albeit slowly and inefficiently if I just throw it in a tf.Session() like I’m used to, but I wanted to actually utilize the TPU, so I’ve been trying to use all the correct™ stuff like, uh…

…“Datasets” and “TFRecords” containing “tf.Examples” (who knew serializing dicts of ints could be so painful?) and “Estimators” / “Strategies” (which do overlapping things but are mutually exclusive!) and “tf.functions” with “GradientTapes” because the “Strategies” apparently require lazily-defined eagerly-executed computations instead of eagerly-defined lazily-executed computations, and “object-based checkpoints” which are the new official™ thing to do instead of the old Saver checkpoints except the equally official™ “Estimators” do the old checkpoints by default, and oh by the way if you have code that just defines tensorflow ops directly instead of getting them via tf.keras objects (which do all sorts of higher-level management and thus can’t serve as safe drop-in equivalents for “legacy” code using raw ops, and by “legacy” I mean “early 2019”) then fuck you because every code example of a correct™ feature gets its ops from tf.keras, and aaaaaaaaaaaaaargh!!

This solidifies the impression I got last time I tried trusting Google and using fancy official™ tensorflow features.  That was with “tensorflow-probability,” a fancy new part of tensorflow which had been officially released and included cool stuff like Bayesian keras layers… which were impossible to save to disk and then load again… and this was a known issue, and the closest thing to an official reaction was from a dev who’d moved off the project and was now re-implementing the same thing in some newly-or-differently official™ tensorflow tentacle called “tensor2tensor,” and was like “uh yeah the version here doesn’t work, you can try tensor2tensor if you want”

(I still don’t know what “tensor2tensor” is.  I refuse to learn what “tensor2tensor” is.  They’re not going to get me again, dammit)

I don’t know whether the relevant category is “popular neural net frameworks,” or “large open-sourced projects from the big 5 tech companies,” or what, but there’s a certain category of currently popular software that is frustrating in this distinctive way.  (Cloud computing stuff that doesn’t involve ML is often kind of like this too.)  There’s a bundle of frustrating qualities like:

  • They keep releasing new abstractions that are hard to port old code into, and their documentation advocates constantly porting everything to keep up

  • The new abstractions always have (misleading) generic English names like “Example” or “Estimator” or “Dataset” or “Model,” giving them a spurious aura of legitimacy and standardization while also fostering namespace collisions in the user’s brain

  • The thing is massive and complicated but never feels done or even stable – a hallmark of such software is that there is no such thing as “an expert user” but merely “an expert user ca. 2017” and the very different “an expert user ca. 2019,” etc

  • Everything is half-broken because it’s very new, and if it’s old enough to have a chance at not being half-broken, it’s no longer official™ (and possibly even deprecated)

  • Documentation is a chilly API reference plus a disorganized, decontextualized collection of demos/tutorials for specific features written in an excited “it’s so easy!” tone, lacking the conventional “User’s Manual” level that strings the features together into mature workflows

  • Built to do really fancy cutting-edge stuff and also to make common workflows look very easy, but without a middle ground, so either you are doing something very ordinary and your code is 2 lines that magically work, or you’re lost in cryptic error messages coming from mysterious middleware objects that, you learn 5 hours later, exist so the code can run on a steam-powered deep-sea quantum computer cluster or something

Actually, you know what it reminds me of, in some ways?  With the profusion of backwards-incompatible wheel-reinventing features, and the hard-won platform-specific knowledge you just know will be out of date in two years?  Microsoft Office.  I just want to make a neural net with something that doesn’t remind me of Microsoft Office.  Is that too much to ask?

I can’t imagine I’m the first person to have this idea, but: I’m starting to think that, at least with currently existing technology, it’s always a bad idea to think of your software as an “agent” instead of a “tool.”  And, on the flipside, that many (if not all) useless “agents” could be repurposed into useful “tools.”

The distinction I’m making is between two types of software that try to save you time:

A “tool” saves you time by giving you a (literal or figurative) “button” you can press (could be a command line string, whatever) which will trigger a constrained, broadly transparent, broadly predictable (if perhaps quite complicated!) string of automated actions, which you then won’t have to do yourself.

An “agent” instead tries to do some task for you entirely on its own, up to and including deciding when the “button” should be pushed, and then pushing it.  A tool is never used by default, only when you push its button (perhaps by implication, through pushing the button of a higher-level tool).  Agents push their own buttons, whether you want them to or not, and usually the only way to push back against an agent that is behaving annoyingly or uselessly is to turn it off entirely, depriving yourself of all its functionality even when you do want it.

Agents and tools are often quite similar in their actual capabilities, and in what they do after the button is pushed.  But agents are more opaque to the user, often on purpose (to make them seem “smart” and/or effortless to operate).  And the model of time-saving is subtly, but importantly different in the two cases.

A tool wants to multiply your capabilities.  It wants to let you do more in any given ten minutes by giving you a button that will instantly do something that used to take you (say) five minutes.  Now, every time you want to do that thing, it’s as if you’re getting five extra minutes for free.

An agent wants to replace you.  It wants to let you do more in any given ten minutes by making a conceptual breakdown of your work, completely automating a subset of it, and posing as a new coworker who handles that entire subset so you don’t have to.

Why are agents worse than tools?  First, because we’re really good at making computers do complicated-yet-constrained tasks, but we’re really bad at making them anticipate our needs.  “Deciding when to push the button” is usually very hard to automate – and strangely pointless to automate, too, when it takes all of half a second to push one.

And second, because the space of conceivable, useful tools is much larger.  As long as you leave room for some human volition, you can have all sorts of great ideas that can multiply human productivity by orders of magnitude.  If you insist at the outset that the human is going to be cut off from the system, then you won’t even think about any of the ideas that necessarily involve a human participant, even if they’re great ones.
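The contrast can be caricatured in code (a deliberately toy sketch; both functions and the sleep-records format are made up for illustration). The tool does one constrained thing, exactly when asked; the agent decides on its own schedule what to act on and what to surface:

```python
# A "tool": constrained, transparent, runs only when its button is pressed.
def summarize_sleep(records, weekday):
    """Average hours slept on a given weekday -- answers the question you asked."""
    hours = [h for day, h in records if day == weekday]
    return sum(hours) / len(hours) if hours else 0.0

# An "agent": decides by itself when to act and what to tell you.
def coach_agent(records):
    """Volunteers an insight on an opaque trigger the user never chose."""
    if len(records) % 7 == 0:
        return "You tend to sleep more on weekends."
    return None  # ...or says nothing at all today

records = [("Sat", 9.0), ("Sun", 8.5), ("Mon", 6.0), ("Sat", 8.0)]
assert summarize_sleep(records, "Sat") == 8.5  # the tool answers on demand
assert coach_agent(records) is None            # the agent had nothing to say
```

The two share the same underlying capability (aggregating the records); the difference is entirely in who decides when the button gets pushed and whether the user can ask follow-up questions.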

Think about compilers and interpreters.  Once upon a time, you (more or less) had to write byte code if you wanted to program a computer.  These days, the journey from “concept in your head” to “code you can run” takes orders of magnitude less time, because we now have tools that will automatically translate much higher-level descriptions into byte code.  After their buttons are pressed, these tools do all sorts of very complicated, very fancy things entirely on their own, in a way that is quite “smart” if you want to frame it that way – it seems to me that GCC and LLVM are as worthy of the title “AI” as anything on the market these days.  But these things don’t pose as coworkers or assistants, they only run when you tell them to, and they limit their behavior to an easily comprehensible scope.

Imagine if people in the days of byte code had thought about “automated programming” in the agent model instead of the tool model, with the goal of entirely replacing parts of the programmer’s workflow.  Would they have invented programming languages at all?  “Why translate into byte code from a language humans find easier, when the goal is to write code without the human needing to lift a finger?”

Compilers and interpreters are complex tools, and they are wonderful.  Are there any agents that are similarly wonderful?  When a software feature is marketed as being “smart” (which seems to be a term of art for “agent”), doesn’t that usually mean “useless”?

(The phone app that came with my sleep tracker has a feature called “Smart Coach.”  Each day, in semi-random fashion, it gives me a new piece of advice based somehow on my recent data.  The software capabilities behind the advice look like they might well be very useful to me, but they have been rendered useless by wrapping them in an agent.

“Smart Coach noticed you tend to sleep more on weekends.”  Okay – but how much, and over what time period, and is there any reason you told me that just now?  An ability to see my data averaged by day-of-week (which the app is already computing, apparently) would be so much more useful than Smart Coach.  “Smart Coach noticed you got less deep sleep last night than usual for your age cohort.”  Okay, so how much deep sleep does my age cohort get, and just how much less am I getting?  The developers put lots of interesting information at my fingertips, and then systematically hid it from me, because they wanted an agent.)