Any map depicting a phenomenon at a global scale is doomed to lack detail. Some subtleties that might be important at a finer spatial scale simply cannot be represented, or are not so relevant when we want to see the ‘big picture’. Unfortunately, that tends to be the case for world language maps: the Americas are very frequently represented with the classification proposed by Joseph Greenberg – either in the three large groups, Eskimo, Na-Dené and Amerind, or breaking them down into the lower-level macro-families (such as Equatorial-Tucanoan, Gê-Pano-Carib etc.). Now, it is easy to understand why this is done: just look at a linguistic map of Eurasia using a non-controversial classification (e.g. Turkic and Mongolic instead of Altaic) and you will see that a little over ten families cover the whole continent. If one is to do the same with the Americas, nearly 100 families are necessary. Put this side by side with the Old World and there will be a huge disparity in the map. We can create the illusion of balance between Old and New World by lumping the American families into larger groups, and that’s where Greenberg’s classification comes in handy. Even the venerable Cambridge Encyclopedia has adhered to that scheme in its maps, as have geneticists, archaeologists, and others when looking for a ‘minimalist’ classification of the American languages.
Needless to say, Greenberg’s scheme is not taken seriously by the majority of linguists working in the Americas. Greenberg built his reputation by classifying the African languages into only a few families, reducing the previous myriad of families. However, even his African ‘families’ were based on shared features like classifiers and their genetic validity has been questioned (why couldn’t they result from areal diffusion, for instance?). The problems with the Amerind classification are much worse, as Lyle Campbell has summarised well in a number of papers. Here, I will only mention that the etymological dictionary is full of errors, weird segmentation and words from dubious sources (wouldn’t you expect so in a survey of hundreds of languages carried out by one man?) – but it must also be pointed out that Greenberg’s notebooks make clear that he had already devised his classification before looking at the evidence, which was then used just to fill in his preconceived scheme. Let’s look at the longest etymology, one that has been presented as the definitive proof of Amerind: *t’ina ~ *t’ana ~ *t’una “son / child / daughter”.
Reflexes of this purported root are found everywhere in the Americas. The ablaut i / a / u supposedly indicates male, neutral and female genders, and the fact that words with “i” tend to refer to male relatives and those in “u” to female ones was used as evidence of this system having been transmitted intact during the colonisation of the Americas. I like this etymology and I actually find it somewhat convincing. Unfortunately, there are many problems – as you can see in the few examples above, it seems that in terms of meaning anything goes, from son, to brother, to wife. This is typical of mass comparison. Combine this with the fact that anything like T-N, TS-N, Z-N etc. counts, and it will be relatively easy to find cognates.
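To get a feel for how much the loose criteria matter, here is a back-of-the-envelope sketch in Python. Every number in it is an assumption made up for illustration, not something taken from Greenberg's data: an inventory of 15 equally likely consonants, four first consonants accepted under the loose reading (standing in for T, TS, Z and the like), and ten kin terms accepted instead of three. It also treats sounds as independent random draws, which real languages are not.

```python
def template_match_chance(first_slot_options, second_slot_options, inventory_size):
    """Chance that a random C-V-C word fits the template, assuming each
    consonant is drawn uniformly and independently (a strong simplification)."""
    return (first_slot_options / inventory_size) * (second_slot_options / inventory_size)


def chance_of_any_hit(p_single, n_candidate_words):
    """Chance that at least one of n candidate words in a language fits."""
    return 1 - (1 - p_single) ** n_candidate_words


INVENTORY_SIZE = 15  # hypothetical number of consonants per language

# Strict reading: only t-n counts, and only 'son', 'child', 'daughter' qualify.
strict_p = template_match_chance(1, 1, INVENTORY_SIZE)
strict_hit = chance_of_any_hit(strict_p, 3)

# Loose reading: four first consonants count (hypothetical), and any of
# ten kin terms (hypothetical) is accepted as a semantic match.
loose_p = template_match_chance(4, 1, INVENTORY_SIZE)
loose_hit = chance_of_any_hit(loose_p, 10)

print(f"strict: {strict_hit:.1%} of languages match by chance")
print(f"loose:  {loose_hit:.1%} of languages match by chance")
```

Under these made-up assumptions, the chance of an accidental match per language goes from roughly 1% to around 16%. Over a survey of hundreds of languages, that is the difference between a handful of spurious 'reflexes' and dozens of them.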
The problems with Greenberg’s method should not discourage us from seeking deep relationships between the major American families. I believe many of them will ultimately be proven to be genetically related. One of the main criticisms of long-range comparisons is that very few, if any, cognates would have survived after so many millennia, and that any survivors would not be recognisable as such due to heavy semantic and phonetic shifts (English two and Armenian erku are usually cited). But is that so? What about French dent and Hindi daant? nom and naam? mort and mrta? Perhaps these languages have extremely conservative phonologies… but what about Greek δόντι [though it’s pronounced /ðodi/] and Russian мёртвый /mertvij/? All these words are separated by 7,000 km and at least six millennia, yet they are still recognisable as cognates and, in some cases, have almost identical phonology. My favourite example, however, is Afro-Asiatic: by the time the oldest languages in this family were recorded (Egyptian and Akkadian), over 4,000 years ago, they were already very different languages, so the proto-language must have been spoken several millennia before that, maybe 10,000 years before present or even more. Yet, we could still reconstruct proto-Afro-Asiatic based on languages spoken today (Arabic, Somali, Hausa…). So, why wouldn’t it be the same with Amerind?
Let me give one example of deep relationships that are becoming quite obvious. Language families that Greenberg and Ruhlen subsumed under the ‘Equatorial-Tucanoan’ and ‘Gê-Pano-Carib’ groups have long been thought to be related. There are a number of potential cognates between Macro-Jê and Tupi that seem quite convincing, some of which are shown below. It is extremely important to compare the forms in the oldest possible reconstruction, not between the individual modern languages of each family; otherwise false cognates may be identified, or true cognates may not seem so close (as can be seen in the examples below). This is why many entries in the Amerind Etymological Dictionary are so embarrassing to look at: for example, the words for ‘eye’ in three different branches of Macro-Jê appear scattered over three different etymologies (supposedly deriving from Proto-Amerind roots 248 *kad, 250 *ere, 252 *hin) when, in reality, we know that they all go back to the same proto-Macro-Jê root!
Even more intriguing than the shared vocabulary are some shared grammatical subtleties. For example, many Macro-Jê languages have a relational prefix: e.g. Suyá kʌt irεyε y-aykwa ‘the boy’s mouth’ vs. s-aykwa ‘his mouth’. Exactly the same happens in the Tupi-Guarani languages, e.g. Tupinambá aβa r-oβa ‘the man’s face’ vs. s-oβa ‘his face’. These shared ‘irregularities’ are some of the most promising evidence of deep genetic relationships.
Furthermore, there are similarities in the pronoun system. Most Macro-Jê languages use a variation of ĩ- / a- / i- (1st, 2nd and 3rd person) as prefixes. Although modern Tupi-Guarani languages use a different set of prefixes (a- / εrε- / o-), when we go back to the proto-languages, we see some reconstructed pronoun prefixes that look a lot like Macro-Jê (such as PMG *uj- / *e- / *i-, or Karitiana i- and a-). If we wanted to go really long-range in the comparisons, we could mention that the ĩ- / a- / i- series appears even in the Maya languages.
In summary, before changing topics, there is good evidence for a deep genetic relationship between Macro-Jê and Tupi, to which we must add the Karib family. The closeness of these three families has been advocated by the Brazilian linguist Aryon Rodrigues and by others working in the field. This means that Greenberg and Ruhlen’s ‘Gê-Pano-Carib’ should rather be something like ‘Jê-Tupi-Karib’. They erroneously consider Tupi as part of an ‘Equatorial’ group that includes Pano and Arawak (these families seem rather distant from the other three). I would say that the core idea of Amerind is right, i.e. that many of the large families of the Americas will be proven to be genetically related, but that it was substantiated by the wrong evidence. As the Jê-Tupi-Karib hypothesis has shown, work at the level of the proto-languages can yield more reliable sets of cognates.
More on pronouns: N / M vs. M / T
Since I mentioned personal pronouns, it must be said that these were long thought to prove the genetic unity of the Amerind languages. Sapir, and later Greenberg and Nichols, already noticed that many Amerind languages used some form of n- for the 1st person and m- for the 2nd person. This would be present, for example (from North to South), in Sahaptin in / im, Yokuts na: / ma:, Nahuatl no- / mo-, Quechua noqa / qam and Mapudungun iñche / eymi. Interestingly, the n : m pattern is particularly frequent in the western halves of both North and South America, coinciding with other areal features. This is an interesting distribution, and it even appears as one of the thematic maps of WALS. We can contrast this with the pair m : t that is prevalent in Eurasia, as I mentioned in a previous post. There has been criticism of the pronoun mass comparison, especially because consonants like n, m or t are unmarked (easy to pronounce) and tend to appear as grammatical particles in virtually every language – so it would be easy to arrive at the n : m pattern by coincidence. Still, I don’t think that explains why, by sheer coincidence, American languages would have chosen this specific pair, whereas Eurasian ones would have preferred m : t.
Agents and Patients across the Americas
I would say that the n : m evidence is compelling, but even more interesting is the shared morphology of the American languages. In fact, it has been proposed that language structure changes more slowly than vocabulary and may provide a better classification when the latter fails. Attempts at classification using structural elements have even been made at a global scale. Of course, there is much diversity in the Americas, but let’s take as an example the languages in the eastern half of the continent – those we can describe as 1) more prefixing than suffixing and 2) with an ergative alignment when it comes to marking pronouns in the verbs:
The examples above encompass a few of the major language families of the Americas. Maya is not exactly large, but it is historically important, and Maya glyphs are always nice to look at. However, it is very difficult to find glyphic examples with verbs in the 1st and 2nd persons. The sentence a-winak-e:n (“I am your man”) appears in a panel in Piedras Negras, spoken by a member of the court to his sovereign. The sentences from a modern Maya language, Ch’ol, provide a fuller range of examples.
The intention of the comparison above is to show how personal pronouns are marked as 1) possessives (“my wife”); 2) subjects of intransitive verbs (“I came”); 3) agents or subjects of transitive verbs (“I saw it”); 4) patients or objects of transitive verbs (“you hit me”). As you can see above, the adjectives in many Amerind languages really function as intransitive verbs (“to be happy”, “to be hot”…). Even more interestingly, noun predicates (“I am your man”) can be constructed in the same way as intransitive verbs. In general, Amerind languages that follow the pattern above use two sets of pronoun affixes for marking possessives, subjects, agents and patients. Most often, the possessive, subject and patient are marked with the same prefixes, whereas the agent is marked differently (e.g. Lakhota, Tupinambá). In other cases, like Cree and Ashaninka, it is only the patient that is marked differently. Curiously, in these languages, as in Ch’ol (and Classic Maya), the patient is suffixed (-in, -on, -na “me”). This pattern is extremely important: as we will see, it is present in “islands” across Eurasia. I encourage you to explore the global distribution of such features with the World Atlas of Language Structures.
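For readers who prefer to see the two-set pattern as data, here is a minimal Python sketch. It only encodes what is said in the paragraph above: each language uses two sets of pronoun affixes, and in each case one of the four roles stands apart from the other three. The set labels "I" and "II" are placeholders of my own, not the actual affixes, and the function name is purely illustrative.

```python
ROLES = ("possessive", "subject", "agent", "patient")

# Which affix set marks each role, as described in the text above.
# "I" and "II" are placeholder labels, not real morphemes.
AGENT_APART = {"possessive": "I", "subject": "I", "agent": "II", "patient": "I"}
PATIENT_APART = {"possessive": "I", "subject": "I", "agent": "I", "patient": "II"}

PATTERNS = {
    "Lakhota": AGENT_APART,
    "Tupinambá": AGENT_APART,
    "Cree": PATIENT_APART,
    "Ashaninka": PATIENT_APART,
}


def odd_role_out(marking):
    """Return the one role marked with a different affix set from the
    other three, or None if no single role stands apart."""
    for role in ROLES:
        other_sets = {marking[r] for r in ROLES if r != role}
        if len(other_sets) == 1 and marking[role] not in other_sets:
            return role
    return None


for language, marking in PATTERNS.items():
    print(f"{language}: the {odd_role_out(marking)} is marked differently")
```

A language that used a single set for everything, or split the four roles two against two, would come back as None, which is a quick way to check whether a new language fits the pattern discussed here.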