Over the years, I have realised that the papers in linguistics and the humanities that I find most satisfying are usually quantitative in nature. I have also realised that most of my colleagues in these fields do not usually read or cite quantitative papers: those are often published on non-mainstream (at least from a certain perspective) platforms and use some college-level mathematics, which people in the humanities and even in linguistics often find too complicated and/or intellectually unrewarding.

I understand this. Interesting quantitative stuff almost by definition involves more or less non-trivial calculations (or at least some amount of probability theory, which is not included in most non-technical curricula), and it takes some actual love for mathematics to dig deeper and form a general intuition about the content and validity of the argument. One also usually has to know how to program in order to reproduce the results or try the new methods on one’s own data. Moreover, these papers often include stupid assumptions and simplifications (bags of words, linear dependencies, nearly everything is a probability distribution, etc.), which one essentially has to get used to.

I get a weird kick out of explaining things and trying to make complex stuff look simpler. I also seem to know mathematics well enough not to get lost in equation 1, but not nearly well enough to take most of the stuff written in these papers for granted, which is how they are probably supposed to be read. I guess this puts me in an advantageous position as someone who wants to proselytise the usefulness of the quantitative approach to linguistics and even the humanities without looking too aloof and distant from the plight of non-technical people.

In this blog, I will write about quantitative linguistics and humanities papers I find exciting in order to, first of all, check whether I understand them correctly, but also to try and show why they could be of interest to folks in the fields these papers touch upon. I intend to first recover the argument of each paper by rephrasing it in more or less plain English (which is bound to be much more verbose than the original formulation) and then try to give some kind of half-baked evaluation.

To correct my stupid mistakes or give any other kind of feedback, feel free to write to mail /AT/ dnikolaev /DOT/ com or reach me on Twitter.