Essays on the history of writing II: The isolates of the Fertile Crescent

One can often find in the literature the idea that continents like South America or Africa differ fundamentally from Eurasia in terms of historical language expansions, and that one does not find in those continents a similar phenomenon to Indo-European, which covers nearly all of Europe and a considerable part of Asia. As I was thinking about that, and as I was writing the post about Mediterranean isolate languages, I realised something. If Indo-European has been expanding for the last 5,500 years or so, then it was only relatively recently in its history that it came to dominate all of Europe… or almost all, since Basque is still spoken today.

Etruscan was alive and well in (early) Roman times, as were the Rhaetic languages of the Alps and the unclassified languages of the Iberian Peninsula, including Aquitanian (probably related to Basque). Minoan probably continued to be spoken in Crete after the adoption of Greek, developing into the Eteocretan language. Not to mention the Lemnian and Eteocypriot languages (by the way, I completely forgot about the latter in that post). And these are just the ones that left some written record. Who knows how many languages were spoken in Northern Europe at the same time? The point is: if we could map the distribution of language families in Europe 2,500 years ago, I am sure the map would be a lot more colourful.

Historical and modern isolated languages and small families in Mesopotamia, Anatolia and the Caucasus.

Of course, as I wrote before, there might have been one or two large families that expanded with the Neolithic in Europe. That is the nature of spread zones. But what about the area where agriculture originated? In line with what we saw in the Americas, areas of ancient cultural developments in the Old World should also be mosaics with high linguistic diversity. In the map above, I am showing some of the unclassified languages, isolated languages or small families of the Fertile Crescent (plus the languages of the Caucasus). Since Crete and Cyprus are still in the range of the map, maybe I should have included Minoan and Eteocypriot, but let’s focus on the Mesopotamia/Anatolia/Caucasus corridor for now.

There is no doubt that the cereals, pulses and animals that were the foundation of the Neolithic in Western Eurasia and Northern Africa were domesticated in this region, most likely in the highland Levant/Anatolia border. Was the spread of farming and pastoralism in the Fertile Crescent accompanied by the diffusion of a single language family, perhaps coinciding with the PPNA/PPNB cultures (or interaction spheres) that dominated much of the region between 11,500 and 8,000 years ago? I do not think so. In later times, most of the area shown in the map above would be covered by only two families: Indo-European (with Hittite and Luwian in Anatolia, Armenian in the Caucasus, and the Indo-Iranian languages to the East, including Persian) and Afro-Asiatic (namely Akkadian in Mesopotamia and all the other Semitic languages spoken from the Levant to Arabia). As in the European case, languages like Hattic, Urartian and Sumerian, which survived for some time while Indo-European and Afro-Asiatic were “conquering” the Near East, could be remnants of previous expansions, possibly since Neolithic times (although there are good Chalcolithic/Bronze Age candidates, like the Maykop culture in the Caucasus and the Ubaid culture in Mesopotamia…). But the fact is that they are too diverse to fit into a hypothetical single family.

Sumerian… and Euphratean?

The Royal Game of Ur. Like writing, board games were another Sumerian invention.

I have written enough about Sumerian in a few previous posts, so I will not elaborate on it anymore except for two crucial questions: the autochthonous origins of this mysterious language and the supposed Euphratean substratum. I have come across an extremely interesting paper that advanced the idea of Sumerian as a creole language born in the multicultural environment of Southern Mesopotamia. Among other things, the argument is that archaic cuneiform signs were constructed through a logic similar to modern creoles. For example, consider the signs sag, ka and gu7, meaning “head”, “mouth” and “ration” respectively. The last two are formed by slight modifications of the first. Specifically, the last one is a combination of “head” and “bread”, and it can be argued that basic Sumerian nouns are very few, most of the others being derived in such a way. However, the actual readings of those signs are sag, ka and gu7 respectively, so the creole-like derivation of nouns is mostly a phenomenon of the writing system, not of the language itself.

More interesting is the idea that an even earlier language preceded Sumerian(s) in Southern Mesopotamia. Many toponyms do not have a Sumerian etymology, and the phonetic reading of many cuneiform signs is of obscure origins. Most of the signs gained their phonetic values from the rebus principle, but some do not work that way. For example, the sign for “bird” is read ḫu when used phonetically (instead of mušen), and the sign for “fish” is read ḫa (instead of ku). Some words, consisting of CV1C(C)V2C structure, do not fit the typical Sumerian structure. This is particularly the case of words for professions, like adgub “reed weaver”, sipad “shepherd” or engar “ploughman”. Many refer to farming or herding activities. Thus it is possible that Sumerians borrowed some specialised vocabulary from a society that was already well established in Southern Mesopotamia, but the theory remains speculative, and we know nothing else about this hypothetical language except for the few words in the alleged Sumerian substratum.


Linear Elamite

Spoken to the east of Mesopotamia, Elamite was the language of Susa, capital of a mighty state contemporary with the Sumerians, and later incorporated into the Persian empire. The Elamites got the idea of writing from the Sumerians, but developed their own “archaic” cuneiform signs that we cannot read phonetically (although we know something of the structure and possible content of the texts). Presumably, they recorded the same language as would be later written with the Assyrian syllabary.

“And thus says Darius, the king: Within these lands, whosoever was a friend have I protected…”

We have a large corpus for Elamite, and the language is understood relatively well. That is thanks to a number of bilingual inscriptions, the longest of which is the famous Behistun Inscription, where Darius I, king of Persia, announces his lineage and deeds in three languages: Persian, Elamite and Babylonian. Despite some efforts to connect Elamite with other language families, especially with the Dravidian languages of India, it remains an isolate. There are also interesting parallels with Afro-Asiatic, e.g. elti “eye” and kassu “horn” (although the latter seems more like a Wanderwort). George Starostin has a paper with a review of such hypotheses, as well as a comparative 100-word list between Elamite and major language families.

Gutian and Kassite

Much less is known about the languages of two other neighbours of Mesopotamia, the Gutians and the Kassites, both of which inhabited the vicinity of the Zagros Mountains and invaded Mesopotamia to install their own dynasties of rulers. The Gutians reigned for a few generations ca. 2100-2000 BC after the collapse of the Akkadian empire. The names of their rulers are virtually all that is known of their language. Despite some comparisons with Tocharian (a very divergent Indo-European language once spoken near the border with China), the names of the Gutian kings mentioned in the Sumerian king list do not reflect any known family in the region. As for the Kassites, they conquered Mesopotamia after 1500 BC, and their language is marginally better attested, since a few Kassite-Akkadian glossaries were compiled by ancient scribes. Needless to say, the known Kassite words do not resemble Sumerian, Elamite, Akkadian, Hurrian or any other language of the region.

Above, some of the Gutian names from the Sumerian king list in the Weld-Blundell prism. Below, part of the Kassite-Akkadian glossary of the Hormuzd Rassam tablet.


“Argishti, son of Menua, built this temple and this fortress, called Irbuni…”

In the 15th century BC, Hurrian was the language of the powerful kingdom of Mitanni to the north of Mesopotamia. Together with Urartian, the language of the neighbouring kingdom of Urartu, near modern Armenia, it forms the Hurro-Urartian family. Inscriptions in Hurrian and Urartian are written in the Assyrian cuneiform syllabary. They are found, for example, in the Fortress of Erebuni, modern Yerevan, announcing its foundation by the king Argishti I. The inscription names the fortress Ir-bu-u-ni, which is the origin of modern Armenian Երեւան Yerevan, a name that has endured for 2,800 years!

The most convincing proposed genetic affiliation of Hurro-Urartian is that it would be related to the Caucasian languages. The geographic location of Hurro-Urartian, its phonology, and a few cognates speak in favour of that connection. For example, Hurrian words consist predominantly of a (C)V(C) structure (pa- “to build”, a- “name”, un- “to come”, ar- “to give” etc.) adhering to the typical pattern of the Northern Caucasus. The shortness of the words is compensated by a seemingly complex consonant inventory. Caucasian languages are famous for having very few vowels, but 50 or 60 different consonants (remember this theory linking phonology and climate?). In line with that, it appears that the Assyrian cuneiform syllabary was ill-suited for rendering the Hurrian language. From what we can reconstruct, scribes had to resort to signs like pi and ip to represent –w-, -v-, -f and combinations thereof. Moreover, the personal pronouns bear a remarkable resemblance to the Northern Caucasian sets. Compare, for instance, Hurrian 1sg. iša-/šo– and 2sg. fe– with Kabardian 1sg. sa, 2sg. wa and 2pl. fa.
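As a toy illustration of the (C)V(C) template mentioned above, here is a minimal Python sketch that checks the four Hurrian roots quoted in the text against that shape. The five-vowel set, the one-letter-per-consonant simplification and the name fits_cvc are my own assumptions for the example, not a claim about Hurrian phonology.

```python
import re

# Assumption: a plain five-vowel inventory and one letter per consonant.
# Real Hurrian transliteration is richer; this only covers the roots cited.
SHAPE = re.compile(r"[^aeiou]?[aeiou][^aeiou]?")

# Roots quoted in the text, with the trailing hyphen dropped.
ROOTS = ["pa", "a", "un", "ar"]

def fits_cvc(root):
    """Return True if the whole root matches an optional C, a V, an optional C."""
    return SHAPE.fullmatch(root) is not None

for root in ROOTS:
    print(root, fits_cvc(root))
```

A longer word such as Sumerian sipad, with two vowels, does not fit the template, which is the contrast the sketch is meant to make visible.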

Hattic and Kaskian

The supporters of the Anatolian hypothesis of Indo-European origins forget that the Hittites were newcomers to the region. Their predecessors in Anatolia spoke an unrelated language, conventionally called Hattic. Very little is known of the language, as bits of it were only recorded by the Hittites. Unsurprisingly, the hypothesis of a Caucasian connection has been proposed by the Russian school. Comparisons with the hypothetical Sino-Caucasian family (including North Caucasian, Sino-Tibetan and Yeniseian) show some interesting cognates, but the validity of Sino-Caucasian is what is disputed in the first place. The Kaskian language later spoken on the northern coast of Anatolia was presumably related to Hattic, and could be the language of the descendants of the first settlers dislodged by the Hittites.

Deep connections? Not quite

As I was first planning this post, I believed there should be some deep relationship between all the “colours” in the map at the beginning of the post. That was the logical conclusion: these were islands of (a) previous expansion(s) later blurred by Indo-European and Afro-Asiatic. Reality is not so simple: let’s have a look at the basic vocabulary of some of those languages:

* The reconstructions in bold are Starostin’s Proto-North-Caucasian. The ones not in bold are reconstructed for the Proto-Northeast-Caucasian level only.

A few isolated resemblances can be found here and there (e.g. “eye” in Elamite and Proto-North-Caucasian, “tongue” in Sumerian and Proto-North-Caucasian), but these might as well be due to chance. The words for “horn” in Elamite, Hattic and Caucasian might be related. To that we must add Proto-Indo-European *k’era(w)- and Proto-Afro-Asiatic *ḳar-. This means the resemblance in the table above is not unique to those languages, and we might in fact be dealing with a Wanderwort.

Perhaps we should not give so much weight to vocabulary. As I explained previously, there are some intriguing morphological similarities between various Eurasian isolates located thousands of kilometres apart. Curiously enough, the same does not apply to this group of relatively close languages in terms of geographical distance. Let’s review the crucial features that distinguish Eurasian isolates in opposition to the large families that surround them: 1. a predominance of prefixes; 2. ergative alignment when marking the pronouns in the verb; 3. possessive pronouns identical to one of the sets used with the verbs; and 4. complex “chains” preceding the verb root. In the table below, I show a quick comparison between the Mesopotamian-Anatolian isolates and Kabardian, a good representative of the Northwest Caucasian family.


Personal pronoun prefixes are marked in red, and suffixes in blue, as usual. Kabardian is the only language that actually conforms perfectly to the aforementioned pattern. Sumerian comes close, especially in relation to the verbal chain, but even it employs a good number of suffixes. Elamite is not even an ergative language, and Hurrian is as prolific in the use of suffixes as a Turkic or Uralic language. Finally, there is almost no resemblance in the actual pronoun particles, except perhaps between Hurrian and the Caucasian languages (more evident with the independent pronouns, as I said above).

What conclusions can we draw from all this? I would like to end with a very simple idea: before the Bronze Age, when large scale warfare – propelled by better weapons but also by the horse – became a major factor in population expansions, the Fertile Crescent was a linguistic mosaic with higher population densities than its surroundings and a long history of ancient cultural innovations, including agriculture. Languages expanded from it, not into it. It is pointless to look for the origins of Indo-European in the first farmers of Anatolia, who spoke Hattic before the Hittites arrived. Neither could the Neolithic peoples of the Levant have spoken Afro-Asiatic, whose only Eurasian branch (Semitic) reached the area in relatively recent times. We will never know the language of PPNA, and there might have been many. Perhaps the dwellers of Çatalhöyük and worshippers at Göbekli Tepe spoke an ancestor of Hattic, or perhaps it was yet another language that contributed to the huge diversity of this ancient cultural mosaic.


Eurasia-America connections

In the previous post, when I commented about the Amerind hypothesis, I called attention to the fact that most languages in the Americas have similar structures. For example, except for some families clustered on the Pacific side of the continent, they are characterised by: 1) a tendency to use prefixes instead of suffixes; 2) a split ergative alignment for marking the persons in the verbs; and 3) two sets of verbal pronoun prefixes, one of which also functions as possessives. Although language structure seems to be more conservative than vocabulary, providing a good estimate of genetic relationships, these features admittedly could have been diffused over several millennia, or arrived at independently. Or could they? I believe the widespread shared morphology of the American languages is no coincidence, because in Eurasia this pattern is the exception rather than the rule. It can be found in ‘islands’ across the continent, coinciding with many language isolates and small families.

The Islands of Eurasia

Languages in red share similar structures. Unlike most of the widely dispersed Eurasian families (Indo-European, Uralic, Turkic etc.), the isolates and small families pinpointed in the map share many characteristics found in the American languages. Could this be an ancient pattern in the Old World that was wiped out by later language spreads?

Among the languages that preserve similar structures to the Amerind languages are Basque (but only in vestigial form), the several Caucasian families, Ket, Burushaski, Kusunda, Ainu, and the Chukchi-Kamchatka languages. In the past, Sumerian exhibited similar features. Let us look at some examples in these languages:


As in the Amerind cases, of course, most of these languages also make use of suffixes to varying degrees. It is very suggestive that Chukchi, which is geographically closer to the Americas, shows the same pattern that we saw in the previous post in some Amerind languages: in the transitive verbs, prefixes mark the agent and suffixes mark the patient; suffixes are also used to mark the subject of intransitive verbs, or adjectives in this case (e.g. “you were quick” – remember how adjectives can function as verbs in the Amerind languages?). Curiously, Basque and Burushaski have the inverse situation: prefixes for objects, suffixes for subjects. In all cases, except Chukchi and Basque, the prefix set is also used for marking possessive pronouns.

The ‘flexion’ of the word asasarame in Linear A.

I have not included Sumerian examples above because this language will be treated separately below – as is appropriate for the oldest language recorded in writing by mankind. Before that, I would like to point out that not all isolates or small families (living or extinct) in Eurasia follow the ‘Amerind’ pattern. There is no evidence, for example, that Etruscan was predominantly prefixing. I did, however, include Minoan (the language spelled in Linear A) – due to the alternation of a/ja in the a-sa-sa-ra-me paradigm. Such paradigms are similar to those noticed in Linear B by Kober and that were so fundamental for the decipherment by Ventris. A-sa-sa-ra-me occurs frequently in libation formulae written in Linear A and, if this is indeed a word that can be ‘inflected’, would receive the prefix j- and the suffix -ana. This is weak evidence, of course, especially given that the word is interpreted as the name of a goddess (not a verb, for example). Moreover, it cannot be ruled out that Minoan was an Afro-Asiatic language (where affixes like j- and -ana would perfectly pass), as the reading of ku-ro as ‘all’ would support, e.g. Akkadian kalû, Arabic kull (but another word of caution here: it does not seem that the word can be reconstructed for Proto-Afro-Asiatic). Although Afro-Asiatic is a geographically vast language family, most of its branches are restricted to North Africa, and quite understandably it does not exhibit the typical Eurasian structures. All in all, I leave Minoan here as a curiosity.

The Nature of Sumerian

Let us briefly examine the grammar of Sumerian with respect to the two main points reviewed above: personal pronoun affixes and verb conjugation. First, unlike all of the previous examples, Sumerian possessive pronouns are suffixed, rather than prefixed to their nouns. The first and second persons sg. are respectively -ĝu and -zu, whereas the first person pl. is marked as -me. These have interesting parallels in Eurasia (m : z could be related to the m : t set, and ĝu, if pronounced /ŋu/, would even have a Sino-Tibetan parallel), but probably are superficial resemblances. The third person is marked -ani if animate and -bi if inanimate.

In the examples, I am using both cuneiform signs and earlier, more linear monumental-style signs, as they were compiled from different sources (this will be covered in the future in a series of posts about the development of writing).

It is in the field of verb conjugation that Sumerian becomes really interesting – and where it shows some resemblance to languages like Ket or the Na-Dené family. The feature that distinguishes these languages is usually called the ‘verbal chain’, which is nothing more than a sequence of affixes both preceding and following the verb root. In the case of Sumerian, the affixes of the verbal chain convey information not restricted to the agent and patient of the action, but also cross-referencing other components of the sentence in different cases.

Let us take, for example, what is probably the first verbal construction encountered by the student of Sumerian: mu-na-DU3 ‘he has built’ (Hayes’ manual has a good share of “munadus” in the first chapters!). This appears in a number of dedicatory stelae stating how a temple was built by a king for some deity (E2-a-ni mu-na-DU3 ‘his house he has built’, see E2-a-ni in the previous figure). Such a simple word actually conveys a lot of information. First, the prefix mu- is of uncertain meaning, but is one of the mandatory conjugation prefixes, used before case cross-referencing. I.e., there are a number of affixes that reference previous words in different cases: in this case, -na- means that one of the arguments of the sentence is in the dative case (the full sentence would be ‘to him he has built’). Finally, the -n- is marking the 3rd person animate subject (though it is frequently omitted in the cuneiform). In this example and the others included in the figure above, I followed my usual scheme of highlighting the prefixes in red and suffixes in blue.

If the same verb were in the first person, it would be mu-DU3-en ‘I have built’ – that is because the 1st person is marked as a suffix in the verbal chain. This is illustrated by another example above, ma-ra-DU3-e(n) ‘I shall build’. The elements are the same, except that -mu- changes the vowel due to harmony with the following syllable, -ra-. This, again, cross-references the dative (after all, I shall build your house for you). A last example with the same verb illustrates the nominalisation with -a: i-n-DU3-a ‘he who has built’. The conjugation prefix in this situation is ĩ-, not mu-, since there is no case cross-referencing. This prefix is also found in the next example, for which I chose an intransitive verb: im-ma-ĝen ‘he went’ (underlying ĩ-ba- affected by nasalisation). The prefix -ba- is in complementary distribution with mu-, referring to inanimate subjects.

The last example in the figure above shows a number of affixes in a relatively complex sentence, nu-mu-e-SUM-mu-un-ze-en /nu-mu-e-sum-enzen/ ‘you have not given it to me’. The first prefix, nu-, is the negation, followed by the now familiar -mu-. Before and after the root SUM ‘to give’, we find -e- and -enzen marking the 2nd person plural, whereas the 1st person is not really referenced.
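To make the prefix/root/suffix layout of these forms easier to see without the colour-coded figure, here is a minimal Python sketch that splits the transliterations quoted above into the part before the root, the root itself, and the part after it. It relies only on the spelling convention used in this post (the root is the one element written in capitals); the function name split_chain is hypothetical, and the code does no linguistic analysis of what each affix means.

```python
# Forms exactly as transliterated in the text; the verb root is in capitals.
FORMS = {
    "mu-na-DU3": "he has built",
    "mu-DU3-en": "I have built",
    "i-n-DU3-a": "he who has built",
    "nu-mu-e-SUM-mu-un-ze-en": "you have not given it to me",
}

def split_chain(form):
    """Split a hyphenated transliteration into (prefixes, root, suffixes)."""
    parts = form.split("-")
    # Assumption: the root is the only element containing capital letters.
    roots = [i for i, part in enumerate(parts) if any(c.isupper() for c in part)]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one capitalised root in {form!r}")
    i = roots[0]
    return parts[:i], parts[i], parts[i + 1:]

for form, gloss in FORMS.items():
    prefixes, root, suffixes = split_chain(form)
    print(f"{form}: prefixes={prefixes} root={root} suffixes={suffixes} ({gloss})")
```

Note that the split follows the written signs, so the last form yields the suffix signs mu-un-ze-en rather than the underlying -enzen.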

Conclusion: on long-range comparison

The Sumerian verb chain is not typical of the ‘spread zone’ Eurasian language families – Indo-European, Uralic, Turkic etc. It does, however, have parallels among the ‘residual’ families or isolates: in the figure at the beginning of this post you can see some similar forms in Ket and Adyghe. Does that mean that those languages are related? Not at all (well, in a way, I believe that all languages are genetically related; the question is whether the relationship is recent enough to be demonstrable). Basque and Sumerian have already been the victims of too many unlikely comparisons. On the other hand, some of the isolates in the map above have indeed been suggested by serious linguists to be related to languages far, far away.

Kusunda, for example, has been hypothesised by Merritt Ruhlen and others to be related to languages of Papua New Guinea, which is not entirely absurd given the genetic ties of southeast Asia with Melanesia. The linguistic evidence is based on pronominal sets and a few vocabulary items. Although the first are indeed suggestive (especially as pronouns tend to be retained longer than vocabulary), the lexical evidence seems to have been assembled in the typical ‘look alike’ fashion that we find, for instance, in the Amerind etymologies. If Kusunda is related to the ‘Indo-Pacific’ languages, this should be an extremely ancient relationship, over 50,000 years old… one wonders how such a relationship would still be noticeable today when the languages of Papua New Guinea themselves defy a classification into fewer than 30 families or so!

The Yeniseian family, of which Ket is the last remnant, has in turn been proposed to be related to the Na-Dené family of North America. Although the idea is not new (check this 1998 PNAS paper by Ruhlen on the subject), it is in the form proposed by Edward Vajda that it has recently received some acceptance (it was reviewed by Jared Diamond in Nature). The linguistic evidence is mainly based on the resemblance of the ‘verbal chain’ and other shared paradigms in Yeniseian and Na-Dené, but also on a small but significant part of the vocabulary, which shows regular sound correspondences even in items of the basic lexicon. Non-linguistic evidence from genetics is weak and, I must say, archaeologically the hypothesis lacks a clear correlate: the ones that have been presented, such as the interaction with the Arctic Small Tool tradition, fail to convince me (though I am no specialist in the archaeology of North America, not to say Siberia).

Whether or not we accept those extra-continental relationships, what must be clear is that the common patterns found in the ‘islands’ of Eurasia do not prove a genetic relationship between them, but possibly show a typology that was widespread – maybe through continuous interaction or ‘punctuated equilibrium’ – before the expansions of the major families of the continent. The language(s) of the first (and later?) migrants to the New World shared those features, and that is why they are so common in the Americas, whereas in Eurasia they were wiped out by later language spreads.

An illusion? The dangers of convergence

As usual, in this final note, let me play the Devil’s advocate: it is quite possible that the shared morphology of the languages analysed here simply developed independently over time. This is not just a hypothesis, but a plain fact in the history of some languages: Egyptian, for example, was an agglutinating language with suffixes for the verbs (in most tenses) during its ‘classic’ period, Middle Egyptian. Thus, the verb ‘to hear’ in the perfect was conjugated sdm.n=f (probably pronounced /sadímnaf/) ‘he heard’, where -n- marks the perfect and -f is the 3rd person masculine. However, in the later stages of the language, this form was replaced by the use of an auxiliary verb (as happened in a number of western European languages – I have heard, ich habe gehört, yo he escuchado…) from the verb ‘to do’. Thus, we have Late Egyptian jr=f sdm ‘he heard’ (literally ‘he did a hearing’). Finally, in Coptic, the latest stage of Egyptian, the auxiliary became bound to the root of the verb, creating a sort of ‘verbal chain’ – a-f-sôtm ‘he heard’ (PST-3sg-hear). If this whole cycle happened over some 4,000 years in the history of the Egyptian language, why would the morphology of Sumerian, Ket, Adyghe etc. have remained intact over tens of thousands of years?

Language Isolates Part III (Eurasia)

This post is going to be a jump across the Atlantic (or Pacific?) from the American case studies, but I intend to go back to North America shortly. For now, it is time to have a look at Eurasia. This is the continent where most models were created. Historical linguistics as a discipline was itself born from the similarities noticed between Sanskrit and its European classical relatives. The farming-language dispersal hypothesis was based on the Indo-European spread with the Neolithic from Anatolia into Europe (alternative models, such as elite dominance, were equally developed to account for that same expansion, this time from the Pontic steppe during the Bronze Age). Finally, Johanna Nichols’ distinction between spread and residual zones is derived from the Eurasian language distribution. Perhaps a good exercise would be to try to think of Eurasia as a (giant) exception rather than the rule in terms of language distribution.

Major language families and isolates of Eurasia

First of all, a very small number of language families (fewer than ten) accounts for almost all of the nearly 55 million km² of the continent. Indo-European alone can be found at all the extremes (north, south, east and west) of Eurasia. The number of language isolates and small families is reduced when compared to the Americas. Some of the isolates shown on the map above are no longer spoken, but we are fortunate enough to have written records of them (in fact, for Sumerian, we have the earliest written records). Isolates in Eurasia appear in all parts of the continent, but prevail in mountainous areas (Basque, Burushaski, Kusunda) and the extremities of the continent (Nivkh, Ainu).

I tried to be inclusive in the map above, showing historical isolates together with living ones. Vedda, in Sri Lanka, is notably absent, since the language of those hunter-gatherers is based on Sinhalese with an unknown substratum (i.e. there is really no evidence about their original language). Korean is technically an isolate, but I didn’t find it fair to include it side-by-side with, let’s say, Burushaski, given the large area where the former is spoken (my criteria are a bit arbitrary, I admit).

Now, let’s briefly review the situation of Eurasia. From west to east: Basque is probably the remnant of a Neolithic language family once spoken in western Europe (I have briefly commented on that in my first post about language isolates). The other major isolate of mainland Europe is Etruscan, the language of the cultural antecedents of the Romans, from which Latin borrowed so many words (even frequent ones that made it to English, like person). Etruscan could have been part of a hypothetical family called Tyrsenian, together with languages once spoken in the Alpine region.

I have included Minoan, the language of the Bronze Age palaces of Crete before Greek was introduced. Although there have been attempts to connect it to the Semitic languages, it is still undeciphered, and fits very well with the typical isolate setting (it was spoken on an island, after all!). It could have survived until classical antiquity, as there are on the island a few inscriptions in the Greek alphabet rendering a mysterious language called Eteocretan. Not far away, the region of Anatolia and Mesopotamia, cradle of civilisation and home to astonishing developments since the Neolithic, was itself home to quite a few isolates. I shall comment a bit more about Sumerian below.

Knossos, Crete. The island was home to a language isolate, Minoan, until Greek was introduced and borrowed the native syllabic script (Linear A).

In the Himalayas and surroundings we have Burushaski, nowadays geographically restricted to two river valleys of northern Pakistan (where the armies of Alexander the Great once marched), and Kusunda, a language still spoken in Nepal but until recently believed to be extinct. Ket, spoken in the Yenisei valley of Siberia, was once part of the Yeniseian family, and is now famous due to the supposed demonstration by Edward Vajda of the Dene-Yeniseian hypothesis, which connects it to languages spoken in North America. Finally, at the easternmost shores of the continent, Ainu is most likely another Neolithic survivor, descending from the languages of the Jomon culture later pushed to the north when Japanese was introduced from the mainland.

All these languages are very different from the more widespread families that surround them, so let us do the same exercise as in the South American case from a previous post and have a look at some basic vocabulary.

Seven words in Eurasian languages

Comparison of seven basic words in the major language families of Eurasia. The words are the same as in the Basque example from the first post and have a high retention rate in the Indo-European languages. Instead of IPA, I decided to use the standard spelling or transliteration for each language.

The similarities are more evident in the left half of the table (compare name in Indo-European and Uralic, or tongue in Uralic and Mongolic). Nevertheless, all families with the exception of Sino-Tibetan are part of the Nostratic proposal of Illich-Svitych and, later, Dolgopolsky and Bomhard. Another famous pair that could be quoted is English water (from PIE *wodr-) and Finnish vesi (from PU *wete). Both this supposed cognate and the one for ‘name’, however, might well be ancient loanwords. The basis for the Nostratic hypothesis, of course, is not just shared vocabulary, but some significant similarities in the pronoun system and inflectional morphology.

My favourite Nostratic etymology (Bomhard’s reconstruction is given in parentheses after Dolgopolsky’s). The word for ‘body of water’ is found in a ‘belt’ across Eurasia.

For example, the first and second persons in the Indo-European and Uralic languages are quite similar: compare Spanish me / te (acc.) and Finnish minä / sinä (sg.) or even me / te (pl.). The resemblance can also be noted in the verb conjugation: compare modern Greek ξέρ-ουμε (we know), ξέρ-ετε (you [pl.] know) and Finnish puhu-mme (we speak), puhu-tte (you [pl.] speak). To that, we might add Turkish geli-yor-um (I come), geli-yor-sun (you come). Thus, the pair m : t (1st, 2nd person) appears both in independent pronouns and as verb suffixes. Curiously, a parallel (n : m) was also proposed as being almost ubiquitous among the Amerind languages (since I did not write about that in the South American posts, I might have to wait for a future post on the New World). Finally, it is worth mentioning that the most widespread Eurasian families are heavily suffixing (e.g. Turkish inan-ma-d[ı]-ın.ız ‘you did not believe’), in contrast with the typically prefixing American pattern.

Although the validity of Nostratic (or Greenberg’s ‘Eurasian’, an even more inclusive macro-family encompassing, for example, Eskimo-Aleut) as a genetic grouping has been widely questioned, I like to see all the ‘clues’ above as indicating at least a shared history of contact between some language families (e.g. those of the Nostratic proposal) to the exclusion of others (e.g. Sino-Tibetan). I also like to think of the Amerind evidence as pointing to pretty much the same scenario (another, more ‘extreme’ example may be Australia – if Dixon’s theory of punctuated equilibrium is right, millennia of convergence of previously unrelated languages could create the illusion of genetic relatedness).

Unlike some of the South American cases, the isolates in Eurasia are distinguished not only by vocabulary, but also by generally exhibiting an unusual morphology, clearly standing out in relation to the languages that surround them. I have briefly mentioned a few examples of Basque, including a compared basic vocabulary, in the introductory post, but I would like to be more specific now. Let’s see how the structure of Basque compares to Spanish with a few sentences rendered in both languages:


Unlike Spanish, Basque marks the definite with a suffix (-a). Furthermore, unlike most European languages, Basque is an ergative language, marking the agent of a transitive verb (e.g. ‘to see’) with the suffix -k, and leaving the object or the subject of an intransitive verb (‘to come’) unmarked. Finally, the verb morphology is quite different from Spanish and most European languages, with affixes referencing both agent and patient (n- ‘me’). As can be seen in the sentences above, English and Spanish are, in contrast with Basque, almost identical in structure.

Jomon and the Ainu

Let us move to another example, Ainu, which is in a similar situation to Basque. Ainu was probably part of a wider family of languages that once extended throughout Japan. The archaeological correlate of that is Jomon, a Mesolithic culture that developed some of the oldest ceramics in the world (ca. 12,000 years ago) even before the advent of farming (several other ancient ceramics appear associated with fishing cultures in Asia). Later, around 2,300 years before present, the Yayoi culture brought bronze artefacts with Chinese-inspired motifs to Japan, together with rice agriculture – and, presumably, the Japanese language – pushing the Jomon to the north. Now, the Ainu, descendants of the Jomon, live in Hokkaido and on Sakhalin island, and are genetically quite distinct from the Japanese, although a lot of admixture has been noticed.


There is not much resemblance except maybe for the numerals in Ainu and Korean (these are, like in Japanese, the ‘native’ numerals and not the Chinese loans that were adopted by both languages). Both Korean and Japanese follow the typical Eurasian pattern, making heavy use of suffixes. This is evident in the verbal system, e.g. in the degrees of politeness and honorifics (Japanese tabe-ta / tabe-mashi-ta ‘[I] ate’; Korean meok-da / meok-seumni-da ‘[I] eat’). As can be seen in the Japanese examples above, suffixes are added for a variety of meanings, and the persons are not really marked in the verb. In contrast, Ainu has prefixes both for agents and patients (k- 1st person, echi- 2nd person).

EME-GIR (Sumerian)

A couple of examples of Sumerian, illustrating the verb chain and ergativity. The first example is from the story ‘The Dream of Dumuzi’.

The final example I want to discuss is Sumerian, which, like Basque, also displays ergativity. The verb morphology is complex. A chain of affixes precedes and follows the verbal root, with conjugation prefixes and cross-referencing of subject and object. See the examples on the right: mu- and ga- are conjugation prefixes of debated meaning. Both agent (-e- 2sg) and patient (-b- 3rd person inanimate) can be marked on the verb. There are even prefixes such as -na- to indicate that some previous word is in the dative case. The same can be said of Ket and Burushaski. I will comment on these interesting parallelisms in the final (?) post about language isolates, after dealing with North America and writing a bit more about the ‘Amerind’ hypothesis. For now, I will conclude with some remarks on the geographical situation of isolates in Eurasia and their cultural and archaeological significance. Before that, just because Sumerian is an extremely interesting language, here is an extended word list. Can you find any similarities with other language families?

Twenty words in Sumerian


One of the peculiarities of Sumerian is that it was largely a monosyllabic language. We don’t know whether a tone system was in place, as in Mandarin or other monosyllabic languages, but homophonous words are now distinguished in the transliteration by subscript numbers after the word. To the list above, we should add that the same sign and word for ‘blood’ 2 also means ‘to die’. The sign for ‘nose’ is the same as that for ‘mouth’ (ka) and ‘tooth’ (zu2), though pronounced differently. The cuneiform for ‘tongue’ eme consists of the same sign with a little extra stroke, in the same way as ‘moon’ itud is a modified sign for ‘sun’. I will comment on how these came to be in a future post about writing and its origins around the world.


Language diversity in the cradle of civilisation

I have commented in the past about how South American isolates tend to cluster in areas of high archaeological diversity and very ancient developments. I also compared these areas to ‘mosaic zones’, where the origins of some widespread language families can be found. Interestingly, something similar happens in Eurasia.

First, needless to say, there are a number of historical isolates in the Fertile Crescent and its neighbourhood. Minoan, Hattic, Sumerian and Elamite were all spoken in relative proximity, and are all unrelated to each other and to any other family (tentative classifications notwithstanding). There is no doubt that this is one of the oldest centres of origin of farming in the world, as well as having been home to the earliest urbanism and writing in Eurasia, both with the Sumerians around 5,000 years ago.

Everything in Çatal Höyük is fascinating, a unique window into our deep past when sedentary village life had just begun. This mural depicts the world’s oldest map: a volcano (the double-peaked mountain in the background) overshadows the settlement (squarish compartmented little houses).

Second, long before that, Anatolia (where Hattic, the isolated language that preceded the Indo-European Hittite, was spoken) witnessed the emergence of some of the largest Neolithic settlements in prehistory, such as the famous Çatal Höyük, around 9,000 years before present. Third, it is worth noting that, even before the Neolithic, this region saw the construction of what is possibly the world’s oldest temple – Göbekli Tepe, also in Turkey – complete with a circle of stones engraved with various animals.

Finally, the nature of this mosaic zone is confirmed by linguistic evidence alone: the origins of many language families can be postulated to lie somewhere in the Fertile Crescent and its surroundings. This is the case of Indo-European, if the Anatolian hypothesis is correct (by the way, I have not written about the Indo-European problem here, as I am planning a future series of posts on mosaic zones, spread zones, and language expansion), as well as Afro-Asiatic and possibly Dravidian and Altaic. This, at least, is the situation envisioned by Colin Renfrew in his introduction to Dolgopolsky, and it fits particularly well with the Nostratic proposal, which implies that all those families were in geographical proximity in the past.