


hey what the fuck does this mean
im shaking what does this mean
Material Evidence’s website described it as a traveling exhibition that would reveal “the full truth” about the civil war in Syria, as well as about 2014’s Euromaidan revolution in Ukraine, through a combination of “unique footage, artefacts, video.” I clicked on the Material Evidence talk and saw that a number of other trolls had been invited, including my old friend I Am Ass.
I still don’t really understand why Cambridge Analytica is supposed to be such a big deal. Sure, I understand why people object to them obtaining personal data under false pretenses. But I don’t understand the leap from “they have the data” to “they are puppetmasters controlling people’s voting behaviors.”
Targeted advertising based on internet user data is, of course, a hot area that is attracting a lot of investment these days. But that does not mean it works very well yet, or that it constitutes some vast leap in effectiveness over traditional marketing. I feel like every other day I hear someone half-facetiously lamenting how badly targeted their ads are. (Sometimes it isn’t facetious at all – it would be nice, of course, to learn about products I actually want to buy. And yet it almost never happens via online ads!) Despite the best efforts of the many people working on this problem, and the legions of automated trackers stalking us online, most of us still become aware of ad targeting only when we notice something hilariously irrelevant popping up on every site we visit.
I am equally unimpressed by everything I know about what is going on under the hood. Around a year ago, I made a post poking fun at the data Facebook (ostensibly) shows to advertisers about me – in which “Toxicity” was listed as one of my hobbies, and “Travel, places and events” included several seemingly random places which I’ve never been or wanted to go (“Slovakia”) as well as, mystifyingly, “Time” (illustrated with a picture of an hourglass).
Checking the same page now, the results seem a bit better (though perhaps only because there is no “Travel, places and events” section anymore), but still hit and miss. (Many of the successfully identified interests, like “Tumblr,” are listed because they’re “apps I’ve installed” [on my phone? how does Facebook know this?], but there are a lot of false positives even there – it also says I’ve installed Instagram, Zillow and Feedly [I haven’t, and haven’t even heard of the last one]. Under “Shopping and fashion,” amusingly, there is only one interest – “Hat” – although I’ve never been in the market for buying a hat online. “You have this preference because you liked a Page related to Hat,” Facebook explains.)
What about those spooky analytics services that can use our Facebook likes (or whatever) to predict our personality, intelligence, sexual orientation, etc.? I admit it’s noteworthy that this works as well as it does, but it still doesn’t work that well. A while ago I installed the extension Data Selfie, which sends your Facebook data to a few of these analytics services and shows you the results. As of this writing, it has ingested 272 hours of my Facebook use, including my likes, every word I type, and how long I spend looking at each item in my feed. Here is what it concludes about my Big 5 personality traits:

The green marks come from one service (I think it’s Apply Magic Sauce?), using the posts showing up in my feed; the yellow dots come from another service (I think Watson Personality Insights), using the text I’ve typed. The two give very different answers – for instance, the yellow one thinks I have extremely high openness and the green one thinks I have lower than 50th percentile openness.
On “political orientation,” well, it seems to have figured out that I might be “liberal”:

… on the basis of my news feed, which is chock-full of liberal and left-wing political posts. (For “religious orientation” it gives me 57% “None,” which relative to base rates is actually pretty good, I guess.)
Out of curiosity, I also sent my data directly to Apply Magic Sauce (based on the PNAS paper about likes) – Data Selfie says it’s using AMS, but I wasn’t sure which of its results came from where. This gave some amusing results, like “Your probability of being Female is 82%” (the API Data Selfie uses for gender, on the other hand, gives 72% probability of Male). Under “Education” (glossed as “Probability of having a personal or professional interest in a given field”), a breakdown of fields gives me a whopping 32% for “Art,” 20% above the population average and far higher than any of the others. (I score a measly 5% each on “IT” and “Engineering,” both below population average; I have a PhD in applied math and currently work in tech.)
AMS helpfully indicates which of my likes are especially influential on its decisions – the fact that I “like” Radiohead seems to be giving it, uh, mixed messages:


None of this should be surprising. Data science is really hard! Here’s a nice lecture I recently watched from someone at Booking.com, which employs over 100 (!) data scientists, working on tasks as seemingly innocuous as “figuring out whether someone cares about getting served breakfast at a hotel, and then using that to choose hotel recommendations for them”:
It turns out – and is not surprising, in retrospect – that even something as simple as this presents tough statistical difficulties, and requires a lot of hard thinking about how to get around sampling biases and correlation/causation disconnects.
Real data science looks less like mecha-Big-Brother and more like this talk: lots of hard work to figure out extremely simple things that any human could read instantly off the data. There is a magic leap in the public conversation about things like Cambridge Analytica, where we go from the knowledge that some organization has detailed information on lots of people (in itself, kinda scary) to the idea that they must, of course, be able to use it.
It is easy to assume, if you don’t think too hard about it, that if having detailed information on one person is scary, then having it on ten million people must be far scarier. There is a vague sense that anything creepy one could do with data on a single person can be done simultaneously for all ten million, or that even creepier things could be done from the aggregate, through vaguely imagined “data mining” techniques.
But, of course, if you have data on one person, you could hire one person to interpret it, while you cannot hire ten million people to do the same job in parallel. So we try to get machines to do it. And the machines are very bad at it. The actual meaning of “data mining,” in this situation, is “trying to get a machine to even kinda sorta do a tiny piece of what a human might be able to do.” In practice, the best one can do is usually to get the machine to notice very broad demographic information – stuff like advertising baby-related items to all women in a large age range, which is an advance over advertising them to everyone, but is worlds away from the “micro-targeted” manipulation of Jonathan Albright’s fever dreams.
When we try to get smarter, the results tend to get worse, especially on the level of individuals rather than broad aggregates. On average, you can predict a surprising amount about people from their Facebook likes. But on an individual level, this looks like concluding that I, nostalgebraist, count “Toxicity” among my hobbies, am interested in taking vacations to Slovakia and to somewhere called “Time,” am either 98th percentile or 44th percentile on Openness to Experience, and am either 28% or 82% likely to be female.
And remember, marketing has always existed, and it has always been used, among other things, for political campaigns. Perhaps marketing is a little better than it used to be, and perhaps that marginal improvement is making a marginal impact on the effectiveness of political campaigns. Even that would fail to scare me – after all, we are talking about showing people the equivalent of leaflets that appeal to them, and if that is enough to sway an electorate, we were already screwed. But even that is not certain. Is marketing better than it used to be? I honestly don’t know.
Starting sometime around April 2016, I had a “mystery illness” that continued on and off for about a year.
There was a cluster of symptoms which would come and go – sort of like a cold or other upper respiratory infection, but distinguished by especially strong head and body aches (which were still strong after taking ibuprofen) and an intense physical lethargy that would make me spend much of the day in bed if I could get away with it. These symptoms would appear, last for maybe 2-4 days, then disappear again, and come back the next week, or two or three weeks later. At times when I had a regular Monday-to-Friday schedule, it would tend to appear on Thursday or Friday, which made me think it had something to do with stress or mild sleep deprivation building up over the week.
At first I just assumed it was a series of colds, then I thought it might be mono (got tested, it wasn’t). I later saw a few specialists for it, including a neurologist (had no idea) and an otolaryngologist (who did a CT scan of my sinuses and said they looked normal, although the aches were especially strong in the sinus area). Sometime in early-to-mid 2017 it just disappeared, which was nice, but I still don’t know what it was or whether it will come back someday.
Anyway, I’m posting about this because I’ve been having somewhat similar symptoms for the past two weeks, and it’s got me worried. I don’t think it’s the same thing – I’ve been feeling especially tired, but I don’t have the aches, and it seems like everyone’s sick right now (people were talking about it in the office last week, and then this morning my dentist asked me if I’d had “that thing that’s going around”). But it does make me want to ask if anyone has any special insights into this kind of illness. (I imagine I’ve already asked about this, back in 2016 or ’17 – if I didn’t, I should have.)
I looked up Jonathan Albright, the Columbia researcher who worked with BuzzFeed on that “Russian trolls” article, and his Medium blog is … well, it’s sure something, all right
(Between him and Eric Garland, is this like a … genre, now?)
These “Russian trolls” I’m reading about sound basically just like people having opinions online, and pretty good opinions, even
What if it turns out that, in fact, everyone on tumblr.com who has ever made a good post is in fact a Russian government agent, and that the only real Americans on here are a collection of, like, approximately eight neo-Nazis and three North Korea stans
When the system was completed it was possible to follow all the movements of the ‘rat’ within the maze and it was only through a design fault that it was found more than one ‘rat’ could be introduced which would then interact together. After various casual attempts the rats started ‘thinking’ on a logical basis helped along by reinforcement of correct choices made and the more advanced rats would then be followed by the ones left behind.
should “how *stupid* one would have to be to accept the two theses” be “how stupid one would have to be *not* to accept the two theses”?
Fixed, thanks!