
Centre for Linguistic History and Diversity

The Centre for Linguistic History and Diversity is an international research centre founded to develop a unified and integrated approach to language diversity driven by these three fundamental questions:

  • What is the potential scope of language diversity (in the combinatoric systems of language)?
  • How can the mechanisms of language variation through time be investigated?
  • How did the languages of the world actually diversify from their (pre-)historical ancestors?

Current partners beyond York are:

The initial core research programme began in January 2014.

Centre events

Major projects affiliated with the Centre

Feast & Famine: Confronting overabundance and defectivity in language

How do people acquire and make sense of ‘messy’ linguistic data, when there are too many or too few forms available?

Our international team is examining this question from multiple angles using data from the languages of central and eastern Europe. Funded by the UK’s Arts and Humanities Research Council, this project will investigate two puzzling language phenomena, overabundance (too many forms) and defectivity (too few forms), as reflected in a variety of linguistic data, and will examine how we describe them when writing reference works for public use.

Combining Gender and Classifiers in Natural Language

An AHRC project on gender and classifiers, involving collaboration with the University of Surrey. Genders and classifiers are two different types of system which do a similar thing: they categorize nouns. It is therefore reasonable to assume that they would be mutually exclusive. If a language has a classifier system, we don't normally expect it to have a gender system, and similarly if it has a gender system, we don't normally expect it to have a classifier system. However, a few languages have both ('dual categorization'). This project investigates what happens when languages have such dual systems and compares them with languages which have only one such system, or none.

Endangered Complexity

A joint AHRC/ESRC project on the Oto-Manguean languages of Mexico, involving collaboration with the University of Surrey. There are about 200 of these languages, and many of them are severely threatened or endangered. The Oto-Manguean languages have complex inflectional morphology (the system for encoding grammatical information on words). They combine suffixes, prefixes, complex tonal patterns and stem alternations into many different inflectional classes. Understanding how the Oto-Manguean languages work provides important evidence as to the possible limits of inflectional complexity.

From Competing Theories to Fieldwork: The Challenge of an Extreme Agreement System

An AHRC-funded project on the Archi agreement system, involving collaboration with Essex, Harvard, and Surrey. The Nakh-Daghestanian language Archi provides a rich source of data on the interaction between morphology and syntax, particularly in relation to the role of both components in agreement. A wide variety of domains and constructions in Archi manifest agreement. This makes Archi a particularly valuable language for investigating the mechanisms of, and constraints on, this important part of the grammatical system.


LanGeLin

LanGeLin (Language and Gene Lineages) is the acronym for the ERC-funded research project 'Meeting Darwin's last challenge: toward a global tree of human languages and genes', running from December 2012 to November 2018.

Matches and Mismatches in Nominal Morphology and Agreement: Learning from the Acquisition of Eegimaa

Theoretical accounts of the strategies used by children to learn the structures of words and grammatical features of languages differ considerably, but our knowledge of what is possible is limited by the existing focus on a relatively small number of languages associated with industrialised nations. Here, we investigate grammatical features and structures that may be expressed in a variety of different ways. Examples of grammatical features include number (e.g. the distinction between singular and plural) and gender (e.g. distinguishing masculine and feminine in languages like French), features expressed within the shape of the word and associated items. Grammatical structure may be manifested in agreement across the separate words of a noun phrase. This project investigates the acquisition of inflectional morphology, i.e. grammatical features and structures as reflected in the word forms and associated agreement, in Gújjolaay Eegimaa, a language of the Atlantic family of the Niger-Congo phylum spoken in southern Senegal. This language has a gender system of the type traditionally known as a noun class system. Noun class systems with complex gender agreement are characteristic of the Niger-Congo languages.

Morphological Complexity: Typology as a Tool for Delineating Cognitive Organization

This ERC-funded project is a comprehensive typological investigation of morphological complexity and involves collaboration with colleagues at Surrey and Brighton. Work at York focuses on two research strands. The first strand, Discovering Complexity, with Roger Evans (Brighton), concentrates on the machine learning of inflectional classes, where we investigate how much can be learned without building language-specific knowledge into the system. The second strand, with colleagues at Surrey, uses the Network Morphology theoretical framework to investigate defaults and irregularity in morphological systems.

The Oxford Corpus of Old Japanese

The Oxford Corpus of Old Japanese (OCOJ) project developed a comprehensive annotated digital corpus of all extant texts from the Old Japanese period, with an associated dictionary and translations. Old Japanese is the earliest attested stage of Japanese, dating from the Asuka and Nara periods of Japanese history (7th to 8th centuries AD), the formative literate period of Japan. These texts are therefore of paramount importance for the study and understanding of the origins and development of civilization in Japan, including language, writing, literature, religion, history, and culture. The corpus is now maintained and hosted by NINJAL in Japan, and is designed to support research in any of these areas.