Language Isolates Part II (South America)

South America was once the least known continent, as hinted by the title of Jerry Moore’s book – which, as the author says, is a tribute to a previous work by ethnologist Patricia Lyon. If there is something I like about your study area being the “least known” in the field, it is that you can say pretty much anything and it will take some years until you can be proved wrong. And, if it appears that nothing makes sense, you can always say that research is only beginning and hope that one day the picture will become clearer.

Unfortunately, for South America, the archaeological and linguistic puzzle is still a difficult one. Take, for example, the relatively homogeneous Clovis horizon in North America and compare it with the absolute confusion in South America during the Late Pleistocene / Early Holocene. New dates for new (and old) sites keep springing up, pushing the dates back (definitely before Clovis) and showing that these early cultures were very different from each other (from Chile, through Argentina, to the Brazilian northeast).

The linguistic diversity in the continent goes hand in hand with that. By a quick count, there are more than 90 language families (including isolates) in South America. Europe (over half the area of South America) has only four. In the map below, I highlighted only those that were continentally widespread, covered large areas or had an undeniable historical relevance – and they are still more than ten. It is difficult to make sense of such diversity if we try to approach it with a “Eurasian” mindset – so much so that South America didn’t even figure in case studies of the farming/language dispersal hypothesis (the nice map on the cover notwithstanding!). A single South American example (Arawak) occupies no more than a paragraph in Jared Diamond’s Science paper. Brazilian archaeologist Eduardo Neves has been one of the few to scrutinise the South American mosaic of languages and cultures and try to explain why there were no continent-wide expansions as we see in Eurasia.

Major language families and language isolates of South America

The choice of the major families and the isolates to be included in the map above was not easy. Those that cover the most extensive territories and have the largest number of members are Arawak, Tupi and Macro-Je. They are also nice case studies because they are clearly associated with widespread archaeological traditions (at least in the case of the first two, and as “clear” as these things can be in Archaeology…) and preferred environments (forested river floodplains in the case of the first two, high altitude savannas for the latter). All three expanded between 3,000 and 2,500 years ago given the glottochronological estimates (if we could rewind the figure 3,000 years back in time we would see a very different distribution of colours). Another large family is Karib, though it is pretty much restricted to the Guyana plateau area. Both Arawak and Karib expanded out of South America to the Antilles (these were the first people encountered by Columbus). Other language families, not especially widespread, were included because they are of particular anthropological or historical interest. For example, Pano-Tacana and Tucano groups play an important role in the regional systems of Western Amazonia. Chibcha was the family that connected South and Central America, besides being spoken by the Muisca chiefdoms of Colombia. The Guaykuru were fierce horsemen (after horses were introduced by the Spanish) and were one of the few non-state societies of the continent to keep slaves. Needless to say, leaving Quechua and Aymara out would be a sacrilege to millennia of Andean civilisation (I will comment a bit more about those two later on).

Some of the large blank areas that you can see in the map are gaps due to lack of information. This can be seen in the Amazonas floodplain and in Eastern Brazil. The first case is a pity, because we would really like to know the languages spoken by the groups sharing the polychrome pottery style (see below)!

There are many language isolates in South America, and very small language families are also numerous. In the end, the definition of an isolate is debatable: for example, Japanese is not an isolate due to the existence of related languages in the Ryukyu islands, whereas Korean is usually considered an isolate despite being spoken over a pretty large area. Furthermore, as I mentioned before, language isolates are usually the last survivors of ancient language families. In any case, the problem with South America is the lack of documentation for many languages still spoken and the scarcity of historical attestation for many of the extinct ones. For that reason, it is sometimes difficult to decide whether we are talking about isolates or just “unclassified” languages – i.e. those for which we don’t have enough materials to link them with one family or another. I decided not to include many languages that would fit that category, so the distribution in this map might look a bit different from what is found in the literature. A reader familiar with the continent will no doubt miss the languages of the Brazilian northeast, such as Pankararu. As far as I know, the original languages of those groups have been lost: most of them now speak Portuguese, and apparently they spoke a Tupi-Guarani language (which might have been adopted as a lingua franca) in the recent past.

The distribution of isolates (and small families) in South America does not present a very clear pattern at first, at least in comparison with Eurasia where they are all “pushed” towards mountainous terrain, islands, peninsulas and the outskirts of the continent. However, on a closer look, it seems that the headwaters in Northwestern and Southwestern Amazonia concentrate most of them (the position of Yagan at the tip of Patagonia, on the other hand, is more typical of an isolate that has been “pushed” or “cut” from later expansions). Of course Greg Urban already noticed this pattern of clustering in headwaters (in fact, we can think of them as friction zones), but I will argue below that some of these areas are also cradles of early cultural developments and high ethnic diversity in the past. First, let’s have a look at some of the isolates and their surrounding language families.

In northern Amazonia, a number of isolates and small families occur in the basin of the Orinoco – the largest river in those parts. It seems to me that, in the Amazon rainforest, some isolates are spoken by hunter-gatherer groups rather than farmers. That is the case with the Joti (3) of the Venezuelan savannas and the Maku (4), neighbours of the well-known Yanomami. The Waorani (5), who live in the forests of Ecuador, are another example. Since it was once thought that hunter-gatherers would be poorly adapted to the jungle, those groups have offered interesting insights on nomadic lifeways in the rainforest (e.g. Gustavo Politis’ book about the Nukak, a Maku people).

Another important isolate in the Amazon is the language of the Tikuna (7), a people known for their colourful art and beautiful polyphonic chants (you can hear them in this recording available through Smithsonian Folways). Tikuna is a tonal language, one of the few in the Amazon, which makes it all the more interesting – by the way, it has been suggested that tone is more likely to develop in tropical environments due to the effects of humidity on vocal cords. Whether that’s true or not, we have another tonal language as an isolate further south: the Guato (15), “Argonauts” of the Pantanal – the single largest wetland in the world. There have been attempts to link Guato with the Macro-Je family, but it seems that the few cognates they share could also be used to argue for a relationship with other families (there’s a table further below in this post where you can see the similarity in basic vocabulary between many families).

North of the Pantanal, another large expanse of seasonally flooded savannas, known as the Llanos de Moxos, concentrates a myriad of isolated languages – Cayuvava, Itonama, Movima (9) – between the Andes and Amazonia. The region was also a hotspot of cultural diversity in the historical period and home to some very interesting archaeological finds. This region is actually part of a chain of isolates: as we go west, we find Leco (13) on the eastern flank of the Andes, and then we are just one step from the plateau around Lake Titicaca where Uru, Chipaya and Puquina (13-14) were spoken.

Where South America ends: Tierra del Fuego was home to the southernmost isolated language of the continent, Yagan

Finally, we cannot forget Tierra del Fuego, where (quite predictably) we find an isolate, Yagan (16). I will now show how isolates can be typologically different from the languages that surround them (a theme that I touched upon in the introductory post) using the example of Puquina, and how they can be found in zones of high cultural diversity taking the Llanos de Moxos as an example.

Seven words in South American languages

Comparison of seven basic words in the major families of South America. The words are part of Swadesh’s list, but I have chosen them among the ones with the highest retention rates. The set I chose for the Basque example is different, as I selected seven words that were all perfect cognates between English, Spanish and Irish. Unlike that example, where I used the standard orthography of each language, I’m using here a very ‘relaxed’ IPA. I used y instead of j as is common in the notation of American languages. I tried to get rid of incorporated prefixes and suffixes that are so common in those languages (and which many times led Greenberg astray). For example, nearly all Piro body parts had a -tʃi suffix, and many Kadiweu words incorporated a final -gi or -di.

I can understand why Joseph Greenberg felt that all American languages (except Na-Dene and Eskimo) belonged in the same stock – they are very similar in the basic vocabulary (see the table above). I assembled the table so as to show a progression from the Amazon, through the Andes, to Patagonia. One does have the feeling that some potential cognates are shared within each area – check, for example, the word for “tongue” in all the right half of the table vs. those to the left. I suppose a lot of the similarities are due to areal features, as the remoteness of the relationship between the families would have obliterated obvious connections, but I still think the similarities are very interesting. They also exist in morphology and syntax. The Amazonian languages tend to be prefixing and show split ergativity; they can have the same series of prefixes for subject (or agent) and the possessive pronouns. Given that objects tend to be prefixed to the verbs, one interesting feature (called portmanteau) is that there is a special marker when the agent is the first person and the patient is the second.


In the example above, Tupinamba is quite typical in having a different 1st person prefix for the agent of a transitive verb (a-) and another for the subject of an intransitive verb (ʃe-). The last also marks the possessive pronoun and the patient of a transitive verb. I did not include any examples above, but this is also the marker of the subject of an adjective clause – i.e. ‘I am cold’ would be something like ‘[it] colds me’ (adjectives behave as verbs in Tupi and many other American languages; some Eurasian languages exhibit the same pattern, as I will comment in a future post). Finally, there is a special prefix (oro-) when the 1st person is the agent and the 2nd is the patient.
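The marking rules described in this paragraph can be written down as a small lookup. This is only a rough sketch: the prefixes (a-, ʃe-, oro-) are the ones quoted above, but the function name, the role labels and the `patient_person` parameter are hypothetical, invented here for illustration, and the sketch ignores everything else about Tupinamba verb morphology.

```python
# Illustrative sketch only. The prefixes come from the text; the role labels
# and the function itself are assumptions made for this example.

SHARED_ROLES = {
    "intransitive_subject",  # subject of an intransitive verb
    "possessor",             # possessive pronoun
    "transitive_patient",    # patient of a transitive verb
    "adjective_subject",     # 'I am cold' ~ '[it] colds me'
}


def first_person_prefix(role, patient_person=None):
    """Return the Tupinamba 1st person marker for a grammatical role."""
    if role == "transitive_agent":
        # Portmanteau: 1st person agent acting on a 2nd person patient.
        if patient_person == 2:
            return "oro-"
        return "a-"
    if role in SHARED_ROLES:
        # All remaining roles named in the text share one marker.
        return "ʃe-"
    raise ValueError(f"unknown role: {role}")
```

Calling `first_person_prefix("transitive_agent", patient_person=2)` gives the portmanteau form, while any of the roles in `SHARED_ROLES` gives the same ʃe- marker, which is the split the paragraph describes.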

I have the impression that there are roughly two typological “blocks”, one Andean-Patagonian and the other Amazonian. Tupi, Macro-Je, Chibcha and Arawak (but not Pano and Tucano) are heavily prefixing in verbal morphology and for marking possessive pronouns, whereas Quechua, Aymara and Mapudungun tend to be on the opposite side, almost completely using suffixes. Other small families in the Andean sphere also tend to be suffixing, such as Mochica and Jivaro, although that probably doesn’t mean anything other than possible areal features.

This should make it easy to recognise outliers, but many isolates seem to borrow features from their neighbours. Let’s have a look at two isolates from Central Brazil, in the southern periphery of the Amazon and the Pantanal – Trumai and Guato – and see how their basic vocabulary compares with the largest families in their surroundings:


The reason why Guato has been linked to the Macro-Je family in the past is obvious. Trumai also shares a few possible cognates, but the interesting thing is that resemblances can also be found with nearly every other large South American family (see the table above). In terms of grammar, these two languages seem to be less agglutinative and more isolating than their Tupi and Macro-Je neighbours:


As you can see, Mebengokre (Macro-Je) uses different prefixes according to the function, which is typical of other lowland South American languages (check the Tupi and Karib examples above). In this case, i- marks the 1st person possessive, subject of a nominal predicate, and object of a transitive verb, but ba- is used for the 1st person subject of transitive and intransitive verbs (except in negative sentences). Trumai, on the other hand, uses ha for all those functions. Among the similarities, both languages display ergativity: in Mebengokre, this appears in the negative sentences, where -ye marks the agent. Trumai uses the suffix -k to mark the ergative case, and also has a 3rd person suffix, -e, to mark the subject of an intransitive verb (absolutive).

Puquina and the secret language of the Incas

The next isolate we will look at, Puquina, is actually not too different from the Andean languages around it, Quechua and Aymara. It has a predominance of suffixes in the verbal morphology, and some of these, as well as part of the vocabulary (e.g. the negative ama) appear to be borrowed from the surrounding languages:

Puquina in comparison with Ashaninka (Arawak) and Quechua. The Quechua variety used in the examples is from Santiago del Estero. Puquina is extinct but has been documented in colonial Catechisms, hence the religious nature of many examples, taken from this analysis.

I might have cheated a little bit by using examples from Quechua, which is more familiar to me, instead of Aymara (the language that actually surrounds Puquina), but both are typologically very similar. Now, the devil lies in the details, and the prefixed possessive pronouns of Puquina are what really separate it from the Andean languages. As you can see, there are resemblances with the pronominal prefixes of the Arawak languages, which led to the hypothesis that they could be related, an interesting insight given that an origin of the Arawak family in western Amazonia is becoming more likely. In the rest of its morphology, as can be seen in the other examples above, Puquina is actually very close to the Andean sphere, which might be due to centuries if not millennia of coexistence.

That naturally leads to the question of what Puquina really represents in terms of the linguistic history of the Andes. Puquina, Uru and Chipaya are all languages of the Bolivian altiplano surrounding Lake Titicaca, where the Tiwanaku civilisation emerged. But were any of these the languages spoken by those ancient people? The Andean case is one that really brings together linguistics and archaeology. For example, Aymara figures in most people’s imagination as the language of Tiwanaku, in the same way as Quechua is associated with all things Inca and thought to have been spread by them. In reality, Aymara came to the altiplano only relatively recently, and Quechua was already widespread when the Incas chose it as their language (the Inca élites were said to speak their own secret language… given their highland origin, this could be Aymara or even some language related to Uru, Chipaya and Puquina).

Was Aymara the language spread by the Chavín horizon?

That Quechua and Aymara were once thought to be related in a single family is understandable, taking into account the broad similarities, but we can be sure now that these are due to intensive language contact over the centuries. They were both widely used as linguae francae, but who spread them, and when? There is still debate on that, but I am inclined to believe that Aymara was widespread before Quechua was (there are “islands” of Aymara in the Central Andes, typical enclaves surviving after later expansions), and that Quechua must have been disseminated by a cultural horizon before the Incas. Following Paul Heggarty, that leaves us (somewhat counter-intuitively) with Quechua as the language of the Wari-Tiwanaku period (around 1,500 years ago) whereas Aymara would have been spread as part of the Chavín sphere of influence ca. 3,000 years before present (in fact, Jaqaru, a branch of the Aymaran languages, is spoken in central Peru, pushing the potential homeland of the family further to the north). The logic of associating the expansion of Quechua and Aymara with the Wari-Tiwanaku and Chavín periods is that widespread language diffusion must have cultural counterparts – in this case, the best matches are the geographically extensive horizons of Andean prehistory. If this is correct, Puquina and its isolate neighbours could be the remnants of even older times, perhaps going as far back as the Formative period – around 3,500 years before present – as part of the Chiripa culture. Genetically, modern remnants of the Uru appear to be quite isolated and divergent from other Andean populations, although with gene flow from later expansions. In any case, the isolated languages of the altiplano are witnesses to the long cultural history of the region.

Language diversity and cultural “hot spots”

So, perhaps South America does not conform to the typical Eurasian situation, where a few isolates are restricted to inaccessible regions and constitute linguistic “aberrations” in their respective zones, but we can identify some patterns. It seems that a lot of isolates in South America are found in areas of ancient cultural developments and enormous cultural diversity. We have just seen that in the case of Puquina, but let’s now go down the Andes towards the Amazon basin.

A cultural mosaic: example of a ring ditch (left) and platform ridges (right) in the Llanos de Moxos.

Here we find the Llanos de Moxos, a seasonally-flooded savanna that is roughly the size of England. The region concentrates quite a few language isolates, namely Cayuvava, Itonama and Movima (number 9 in the map). Archaeologically, we can divide the area into as many as seven distinct culture areas. For example, in some parts of the Moxos we find a “ring ditch” culture that built fortified villages in forest islands; in others, we find monumental mounds surrounded by other types of earthworks; in others, still, we have series of parallel raised fields used for cultivation. These diversified traditions of the Llanos were well established only some 2,000 years ago, but it seems that the cultural history in the region is a very old one. If you are interested in reading more about the Llanos, you should check Umberto Lombardo’s blog. We must also remember that the Brazilian Pantanal, where the isolate Guato is spoken, has a long history of human occupation. The same is true of southwestern Amazonia, especially the Guapore region in the state of Rondonia, where many isolates are found (e.g. Kanoe, Aikana). This area is home to some of the oldest evidence of agriculture and is thought to be the centre from which the Tupi languages spread, as well as possibly the Macro-Je (given some recent evidence of high diversity within the stock). In fact, the whole southern and southwestern periphery of the Amazon has a ‘belt’ of isolated languages and small families, indicating a high cultural diversity that, coincidentally or not, is paralleled by some of the most surprising findings in the region’s archaeology, such as the massive earthworks known as Geoglyphs in Brazil, and the fortified villages of the Upper Xingu.

The polychrome tradition spread over more than 2,000 km across the Amazonas floodplain, but was it associated with a single language?

I believe the Llanos, Pantanal, Guapore and other areas where several small families and many language isolates appear (see the map in the beginning of the post) are “cultural hot spots” where similarly long archaeological sequences and a lot of ethnic diversity will potentially be found. They are predominantly located in headwaters (this idea, of course, goes back to Greg Urban), show early cultural developments as seen through archaeology, and fit within the concept of ‘mosaic zones’. They contrast, for instance, with the huge expanses of Central Brazilian savannas, typical ‘spread zones’ (like the Eurasian steppe!) where an ocean of Macro-Je languages dominates. Another spread zone is the Amazon floodplain, a major waterway where, in prehistory, many material culture traits tended to be shared – for example, the famous Amazon Polychrome Tradition. This tradition arose not long before the year A.D. 1000 and was adopted for more than 2,000 km along the Amazonas floodplain, but we don’t know if that widespread ceramic style was associated with a single language. In other major river systems of lowland South America, such as the Paraná (and, in fact, all of the Atlantic coast of Brazil), similar ceramics (disseminated a lot earlier) were associated with Tupi speakers, but in the Amazon basin groups of different language families still make similar pottery, such as the Shipibo (Pano) or the Jivaro. On the other hand, some people have been associating the spread of polychrome ceramics with the Tupi languages even within the Amazonas floodplain. That’s a tempting idea, especially because of the historical evidence of the Omaguas and Kokamas, groups that were described in 16th century accounts as artisans of elaborately painted pottery. They spoke a Tupi language, but it seems that it is some sort of creole, using Tupi vocabulary with a different (possibly Arawak) syntax. That, of course, does not contradict the hypothesis that the language contact happened in prehistoric times.

