(via injygo)

This year let’s remember the true meaning of Θāigraciš

squareallworthy:

[image: the Old Persian calendar]

Honestly the Old Persian Calendar is the most badass calendar. I don’t know why anyone uses anything else. Check out those month names.

“Hey, Cyrus, what are we doing for Vrkazana this year?”
“Killing wolves, what do you fucking think?”

Metal. 

And later on, they have an entire month for worshiping the nameless god. Fucking metal.

“Hey, you know, I’ve always hated that month that’s like middle-late winter. It’s still cold as hell and there aren’t even any good holidays.”
“I know, right? It sucks. What are we going to call it?”
“Let’s go with…Suckuary.”

Metal.

(via stumpyjoepete)

birdblogwhichisforbirds asked: the first is in unicellular but not in amoeba, the second is in rebuttal but not in refutation. what is it? (hint: you have one.)

Hint #2: I’m not the only one!  <3

typicalacademic:

nostalgebraist:

Automatic parsers for natural language are pretty good these days.  I use the spaCy one all the time, and although it occasionally makes mistakes, it’s reliable enough that almost all of my parsing-related bugs come from code I put on top of it (or from ungrammatical input).

This makes me very curious why people don’t use them as components in deep learning architectures for text.  For neural machine translation, chatbots, etc., the popular models all use “attention” modules that emphasize certain parts of the (representation of the) input when producing each part of the output, or “self-attention,” which does a similar thing inside of the encoder and decoder (not between them).  This allows them to sort of learn how syntax works.  But everything is still tied to this idea of a sentence as a “sequence,” where you say “okay, I’m producing word #7, what information do I need to do that?”
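(For concreteness, here’s a toy pure-Python version of the attention operation being described — single query, invented numbers, no learned parameters:)

```python
import math

def attention(query, keys, values):
    """Scaled dot-product attention for a single query: score each key
    against the query, softmax the scores, and return the weighted
    average of the values."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]   # softmax: weights sum to 1
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

# Three input "positions", each a 2-dim vector; the query most resembles
# the first key, so the first value gets the largest weight.
out = attention([1.0, 0.0],
                keys=[[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]],
                values=[[10.0, 0.0], [0.0, 10.0], [0.0, 10.0]])
print(out)
```

This is the whole trick: the output for each decoding step is a data-dependent weighted mix of the input representations.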

This is a weird question, because “word #7 in a sentence” is not a natural category, and the relevant information depends on what word #7 is doing syntactically, among other things.  (N. B. there are fancier positional encodings than just word #, but they’re all positional.)
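(The standard “fancier” case is the sinusoidal positional encoding from the Transformer paper — and it makes the point nicely, because the vector depends only on the word’s index, never on what the word is doing:)

```python
import math

def positional_encoding(position, d_model):
    """Sinusoidal positional encoding: sin on even dims, cos on odd dims,
    at frequencies that fall off geometrically with dimension."""
    return [
        math.sin(position / 10000 ** (i / d_model)) if i % 2 == 0
        else math.cos(position / 10000 ** ((i - 1) / d_model))
        for i in range(d_model)
    ]

# Word #7 always gets this same vector, whatever word #7 happens to be.
print(positional_encoding(7, 4))
```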

If you’ve written the six words “I, who enjoy tasty food, will” then the next word is going to be a verb with word #6 as its auxiliary and word #1 as its subject, and words #2-5 are only relevant for semantic context.  OTOH if you’ve written “When choosing a restaurant, I usually” then the next word will be a verb with word #5 as its subject, word #6 as an adverb, and words #1-4 are only relevant for semantic context.  Etc.

It seems much more natural to have a decoder that makes a syntactic tree piece-by-piece, rather than a sequence, with the words in the tree ending up wherever they have to be.  Likewise, we could have the encoder take a syntactic tree as input, and perhaps use tree-like structures for the latent representation.  This means we don’t have to learn grammar on top of the rest of the problem domain, it ensures grammatical output, and it gives us representations of long-range dependencies that don’t degrade as we insert arbitrary numbers of words (relative clauses, etc.) in between.  Since we have good automatic parsers, we can automatically make trees to feed to the encoder, and we can automatically make training data for the decoder even if we don’t have a hand-parsed corpus for the problem domain.
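(Just the data-structure side of “words end up wherever they have to be,” not a model: a dependency tree carries surface positions on its nodes, so a generator could emit the tree in any order and the sentence still falls out by sorting. The `Node` class here is a made-up toy, not anybody’s published representation.)

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    word: str
    index: int                      # surface position in the sentence
    children: list = field(default_factory=list)

def linearize(root):
    """Flatten a dependency tree back into surface word order."""
    nodes, stack = [], [root]
    while stack:                    # collect every node in the tree
        n = stack.pop()
        nodes.append(n)
        stack.extend(n.children)
    return " ".join(n.word for n in sorted(nodes, key=lambda n: n.index))

# "eat" heads its subject and object; inserting "tasty" under "food"
# doesn't disturb the I->eat dependency at all.
tree = Node("eat", 1, [Node("I", 0),
                       Node("food", 3, [Node("tasty", 2)])])
print(linearize(tree))  # I eat tasty food
```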

If I weren’t so busy I’d be trying this out myself (and probably running into all sorts of unexpected pitfalls, but that’s research for you).

#admittedly this is all only as good as the parser and the parser may well be the kind of model i’m arguing against

yep, spaCy is built on top of those sequence models :P it’s actually a really cool architecture that slightly gets away from the “everything is a sequence” thing: a sequence model/RNN produces a “summary” of the sentence, but those summary vectors then get used to make decisions about how to add each word to the tree structure you’re building. (And people put attention in here too of course.) But most of the processing still happens in the sequence model, with very generic rules to help ensure the tree ends up semi-grammatical.
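(A stripped-down illustration of that transition-based scheme — in the real parser a neural model scores SHIFT/LEFT/RIGHT actions from the summary vectors at each step, but here the action sequence is just handed in, to show how the tree gets built incrementally. Real transition systems have more action types and labeled arcs.)

```python
def parse(words, actions):
    """Arc-standard style parsing: SHIFT moves the next word onto the
    stack; LEFT/RIGHT attach the top two stack items, keeping the head.
    Returns a {dependent: head} dict."""
    stack, buffer, arcs = [], list(words), {}
    for action in actions:
        if action == "SHIFT":
            stack.append(buffer.pop(0))
        elif action == "LEFT":          # second-from-top depends on top
            dep = stack.pop(-2)
            arcs[dep] = stack[-1]
        elif action == "RIGHT":         # top depends on second-from-top
            dep = stack.pop()
            arcs[dep] = stack[-1]
    return arcs

arcs = parse(["I", "eat", "food"],
             ["SHIFT", "SHIFT", "LEFT", "SHIFT", "RIGHT"])
print(arcs)  # {'I': 'eat', 'food': 'eat'}
```

The “very generic rules” mentioned above amount to constraints on which of these actions are legal in a given state, which is what keeps the output tree-shaped even when the scorer is confused.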

That said, using syntax features is really useful and a lot of neural models do actually still do it for more complicated tasks. Giant neural stacks just sound cooler and get more press. (also they work better right now but I feel like that’s at least partly a byproduct of the hype which shifts research focus, not the cause)

@disconcision said: any engagement with the object-level is apparently considered cheating

I mean, there’s reasons for that! Grammar is complicated and has lots of exceptions, and it’s different for every language. Good parsers for English are the results of insane amounts of effort, both exhaustive-search-via-grad-student for the best techniques and vast amounts of linguistic annotation. If that effort hasn’t been put into another language—say Hindi—then your parser sucks and will put an upper bound on the accuracy of anything you try to do with it.

Yeah, that all makes sense.  I guess what really frustrates me is the current state of affairs for people (like me) who want to use these technologies to do things.

The vast majority of ~fancy neural~ stuff out there, both in available pre-trained models and even papers I read, is entirely end-to-end.  There are exceptions, like using Inception features as input to some other thing, but most of the time (certainly in the neural NLP stuff I know about) it seems like we treat every task as completely distinct and train it end-to-end.

This is fine if you want to do exactly what some group of researchers have already done with a neural model (although if they haven’t made pre-trained weights available, training data may be a problem), but usually you aren’t, and having so little freedom to compose anything is weird and frustrating.  I wish there was more interest in neural components that consume or produce things other than “end” input and output.  Kinda feels like a world with no APIs or libraries where we have to rewrite all functionality from scratch to make one product, and then again from scratch to make the next.

(ETA: I guess pretrained word embeddings are one exception, so that’s nice)
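(And they’re reusable precisely because the interface is trivial — a table from word to vector. Toy example with invented two-dimensional vectors; real ones, e.g. from a GloVe download, are just a much bigger table of the same shape:)

```python
# A word-embedding "API" is just a lookup table any downstream model can
# consume. These vectors are made up for illustration.
embeddings = {
    "tasty": [0.9, 0.1],
    "food":  [0.8, 0.3],
    "logic": [0.1, 0.9],
}

def embed(sentence):
    """Average the vectors of known words: a crude but composable
    sentence representation built from a pretrained component."""
    vecs = [embeddings[w] for w in sentence if w in embeddings]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

print(embed(["tasty", "food"]))  # approx. [0.85, 0.2]
```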

erratticusfinch:

“How can you say to your neighbor, ‘Go outside,’ while you are still extremely logged on? You hypocrite, first log off, and then you will see clearly to not be mad online.” - Matthew 7:4-5

(via prospitianescapee)

nemfrog:

“Potomburi, Osaka.” Konen. The new woodcut. 1930.

(via dharma-initiative-official)

“There were teams that had to work on shark stuff,” a former Fusion staffer said. “It was, ‘[Nico] is a genius and therefore we must make shark content because he wants it.’ ”