January 30, 2019

What is the difference between a language and a dialect?

The linguist Max Weinreich joked that “a language is a dialect with an army and a navy” – in other words, it’s only the trappings and endorsements of power that confer the exalted status of ‘language’ on one dialect and not another.

There’s some truth to this. Why is it that we consider Swedish, Danish and Norwegian (which are, to various degrees, mutually intelligible) to be separate languages, but not the ‘dialects’ spoken in different parts of Italy?

These days, it’s very fashionable to question the idea of dividing things into either/or categories. Such ‘binaries’, for instance the ‘gender binary’, are seen as artificial and discriminatory, and as denying the existence of a spectrum between the predetermined extremes. Especially problematic are those binaries which suggest a hierarchy of status – for instance the language/dialect binary, in which examples of the latter are considered to be non-standard, ‘provincial’ or ill-educated variants of the former.

Therefore, the counter-position – which is to regard all dialects as languages in their own right, or to see all languages as no more than privileged dialects – is very much in keeping with the intellectual spirit of the age.

But is it in keeping with reality? In a fascinating article for Aeon, Søren Wichmann writes about the research that supports a legitimate distinction between languages and dialects.

The starting point is a database of languages:

“In 2008, a number of linguists came together to form the Automated Similarity Judgment Program (ASJP), of which I am the daily curator and a founder. The ASJP painstakingly assembled a systematic, comparative dataset of languages that now contains 7,655 wordlists from what would be two-thirds of the world’s languages…”

The difference in the way that two languages express the same concepts can be quantified using a measure called the Levenshtein Distance:

“…named after Vladimir Levenshtein, a Soviet computer scientist who in 1965 devised an algorithm to compare two strings of symbols. He defined ‘distance’ as the number of substitutions, insertions and deletions needed to turn one string into the other. The Levenshtein distance can usefully be divided by the length of the longest of the two strings, because this puts all the distances on a scale from 0 to 1. This has become known as the normalised Levenshtein distance, or LDN.”
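For readers who like to see the arithmetic, here is a minimal Python sketch of the measure as the quote describes it: count the substitutions, insertions and deletions, then divide by the length of the longer string. The function names `levenshtein` and `ldn` are my own, and the example words are arbitrary English ones, not ASJP data. It is an illustration of the definition, not the ASJP's actual code.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions and deletions turning a into b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the first i-1 symbols of a and the first j of b.
    previous = list(range(len(b) + 1))
    for i, symbol_a in enumerate(a, start=1):
        current = [i]  # turning i symbols into an empty string takes i deletions
        for j, symbol_b in enumerate(b, start=1):
            substitution_cost = 0 if symbol_a == symbol_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def ldn(a: str, b: str) -> float:
    """Levenshtein distance divided by the length of the longer string (0 to 1)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0  # assumption: two empty strings count as identical
    return levenshtein(a, b) / longest


# 'kitten' -> 'sitting' needs 3 edits; the longer word has 7 letters.
print(levenshtein("kitten", "sitting"))  # 3
print(ldn("kitten", "sitting"))          # 3/7
```

Identical strings score 0 and strings with nothing in common score 1, which is what makes distances between word pairs of different lengths comparable.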

The researchers calculated the LDN values between different members of each ‘family’ of languages within their dataset. Unsurprisingly, they found that some relationships were closer than others. Crucially, however, the LDN values of the different pairs were most commonly found at either the high or low end of the distribution:

“…the distances tend to hover around either a relatively small value or a relatively large one, with a valley in between. As it turns out, the valley tends to lie in a narrow range around a mean of 0.48 LDN. Without losing significant precision, we can say that speech varieties tend to not be halfway similar in their basic vocabulary. Either they will tend to be more similar, in which case they can be defined as different dialects, or less similar, in which case they can be defined as different languages. Herein lies the distinction between language and dialect.”
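As a rough sketch of how such a cut-off might be applied, the Python below averages a set of per-word LDN values for two speech varieties and compares the result with the 0.48 figure from the quote. Averaging per-word scores is my assumption about how a single value per pair is obtained, the name `classify_pair` is hypothetical, and the numbers in the usage lines are invented purely for illustration.

```python
from statistics import mean

# The valley reported in the quoted article.
VALLEY_LDN = 0.48


def classify_pair(per_word_ldn: list[float]) -> tuple[float, str]:
    """Average per-word LDN scores and label the pair relative to the valley.

    Assumption: one LDN score per shared concept, combined by a simple mean.
    """
    average = mean(per_word_ldn)
    label = "dialects" if average < VALLEY_LDN else "languages"
    return average, label


# Invented scores, not real measurements.
print(classify_pair([0.2, 0.3, 0.4]))  # falls below the valley
print(classify_pair([0.7, 0.8, 0.9]))  # falls above the valley
```

A hard cut like this will mislabel pairs that sit close to the valley itself, so any value near 0.48 is better read as "unclear" than as a verdict.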

There were some exceptions to the rule – for instance the LDN value between Swedish and Danish was found to be in the linguistic no man’s land between being dialects of one another and being clearly separate languages. (To complicate matters, intelligibility between the two tongues is asymmetric – Danes understand Swedes more than the other way round; indeed, their fellow Scandinavians sometimes joke that the Danes can’t even understand one another.)

On the whole, however, the language/dialect binary, while not absolute, does appear to have a basis in reality.

I wonder if this might be the way we should think about binaries more generally. We should be free to question their place in culture, but also free, where the evidence justifies such a conclusion, to accept their validity.

To make sense of the world it is necessary to make distinctions – to recognise that one thing is not like another. That should never be an excuse for prejudice, of course – nor for enforcing compliance with social norms when difference does no harm to others.

But, equally, when distinctions are tested by time and experience, the fact that they’re seldom absolute and without complication does not mean that they’re completely bogus. Indeed, to pretend otherwise is itself a rather binary way of thinking.


