Centre for Linguistic History and Diversity

The Centre for Linguistic History and Diversity is an international research centre founded to develop a unified and integrated approach to language diversity driven by these three fundamental questions:

  • What is the potential scope of language diversity (in the combinatoric systems of language)?
  • How can the mechanisms of language variation through time be investigated?
  • How did the languages of the world actually diversify from their (pre-)historical ancestors?

Current partners beyond York are:

The initial core research programme began in January 2014.

Centre events

Other major projects affiliated with the Centre

Combining Gender and Classifiers in Natural Language

An AHRC project on gender and classifiers, involving collaboration with the University of Surrey. Genders and classifiers are two different types of system that do a similar thing, namely categorize nouns, so it is reasonable to assume that they would be mutually exclusive. If a language has a classifier system, we don't normally expect it to have a gender system, and vice versa. However, a few languages have both ('dual categorization'). This project investigates what happens when languages have such dual systems and compares them with languages that have only one such system, or none.

Endangered Complexity

A joint AHRC/ESRC project on the Oto-Manguean languages of Mexico, involving collaboration with the University of Surrey. There are about 200 of these languages, and many of them are severely threatened or endangered. The Oto-Manguean languages have complex inflectional morphology (the system for encoding grammatical information on words). They combine suffixes, prefixes, complex tonal patterns and stem alternations into many different inflectional classes. Understanding how the Oto-Manguean languages work provides important evidence as to the possible limits of inflectional complexity.

From Competing Theories to Fieldwork: The Challenge of an Extreme Agreement System

An AHRC-funded project on the Archi agreement system, involving collaboration with Essex, Harvard, and Surrey. The Nakh-Daghestanian language Archi provides a rich source of data on the interaction between morphology and syntax, particularly in relation to the role of both components in agreement. A wide variety of domains and constructions in Archi manifest agreement. This makes Archi a particularly valuable language for investigating the mechanisms of, and constraints on, this important part of the grammatical system.


LanGeLin

LanGeLin (Language and Gene Lineages) is the short name of the ERC-funded research project 'Meeting Darwin's last challenge: toward a global tree of human languages and genes', running from December 2012 to November 2017.

Morphological Complexity: Typology as a Tool for Delineating Cognitive Organization

This ERC-funded project is a comprehensive typological investigation of morphological complexity and involves collaboration with colleagues at Surrey and Brighton. Work at York focuses on two research strands. The first strand, Discovering Complexity, with Roger Evans (Brighton), concentrates on the machine learning of inflectional classes, where we investigate how much can be learned without building language-specific knowledge into the system. The second strand, with colleagues at Surrey, uses the Network Morphology theoretical framework to investigate defaults and irregularity in morphological systems.

The Oxford Corpus of Old Japanese

The Oxford Corpus of Old Japanese (OCOJ) is a long-term research project which will develop a comprehensive annotated digital corpus of all extant texts from the Old Japanese period, with an associated dictionary and translations. Old Japanese is the earliest attested stage of Japanese, dating from the Asuka and Nara periods of Japanese history (7th-8th centuries AD), the formative literate period of Japan. These texts are therefore of paramount importance for the study and understanding of the origins and development of civilization in Japan, including language, writing, literature, religion, history, and culture. The corpus is designed to support research in any of these areas.