Técnicas estadísticas en geolingüística: modelización onomástica

  1. Ginzo Villamayor, María José
Supervised by:
  1. Rosa M. Crujeiras Casais Supervisor

University of defense: Universidade de Santiago de Compostela

Defense date: 20 May 2022

Committee:
  1. Ana Pérez González Chair
  2. Francisco Dubert García Secretary
  3. Raquel Menezes Member
Department:
  1. Departamento de Estatística, Análise Matemática e Optimización

Type: Thesis

Abstract

This dissertation introduces new statistical methods for data processing and modeling in geolinguistics, specifically for surnames in Galicia. It addresses two main problems: (i) constructing surname regions in Galicia and (ii) modeling spatial and spatio-temporal surname patterns in this region. Surnames can be used as a source of information to characterize the population of a region, since the analysis of surname distributions provides information about population movements.
The identification of surname patterns through isonymy measures has been studied by different authors. Although there is a broad literature on the construction of surname regions by isonymy measures, methodological advances are scarce in this setting, and most proposals are based on classical measures (Lasker, Nei and isonymy between zones are the most usual). Since the traditional isonymy measures arise as adaptations of classical biodiversity indexes, the first objective of this project was the adaptation and proposal of new biodiversity measures in onomastics. Suitable extensions were analyzed through simulation studies that evaluate their performance in different scenarios. Other biodiversity indices were also reviewed and adapted.
In addition, the different research lines within the onomastic context have not taken into account the spatial and spatio-temporal dimension of surname evolution. By fixing administrative regions, for example municipalities, spatial and spatio-temporal methods for count data can be applied in this setting. These methods are useful for modeling the evolution patterns of surnames. Hierarchical models were used to meet this goal, and Integrated Nested Laplace Approximation (INLA) was explored to fit this type of model in practice.
Finally, the third objective of this thesis is to produce an open-source statistical library, so that the different methods are available to other practitioners.
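
As a rough illustration of the classical measures named in the abstract (isonymy between zones, Lasker, Nei), the following minimal Python sketch computes them from surname lists for two zones. It assumes the definitions commonly used in the isonymy literature: isonymy between zones as the sum over surnames of the product of relative frequencies, Lasker's coefficient as half that sum, Lasker's distance as its negative logarithm, and Nei's distance as the negative logarithm of the normalized isonymy. The function names and the toy surname lists are hypothetical; they are not taken from the thesis or from its statistical library, whose definitions and interface may differ.

```python
from collections import Counter
from math import log, sqrt


def relative_frequencies(surnames):
    """Map each surname to its relative frequency within one zone."""
    counts = Counter(surnames)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def isonymy_between(p_i, p_j):
    """Isonymy between zones: sum over surnames k of p_ki * p_kj."""
    return sum(p * p_j.get(name, 0.0) for name, p in p_i.items())


def lasker_coefficient(p_i, p_j):
    """Lasker's coefficient of relationship (assumed form: isonymy / 2)."""
    return isonymy_between(p_i, p_j) / 2.0


def lasker_distance(p_i, p_j):
    """Lasker's distance (assumed form: -log of isonymy between zones).

    Raises ValueError when the two zones share no surname (log of zero).
    """
    return -log(isonymy_between(p_i, p_j))


def nei_distance(p_i, p_j):
    """Nei's distance (assumed form: -log of the normalized isonymy)."""
    i_ij = isonymy_between(p_i, p_j)
    i_ii = isonymy_between(p_i, p_i)
    i_jj = isonymy_between(p_j, p_j)
    return -log(i_ij / sqrt(i_ii * i_jj))


if __name__ == "__main__":
    # Hypothetical toy data: surnames recorded in two municipalities.
    zone_a = relative_frequencies(["Garcia", "Garcia", "Lopez", "Perez"])
    zone_b = relative_frequencies(["Garcia", "Lopez", "Lopez", "Castro"])
    print("isonymy between zones:", isonymy_between(zone_a, zone_b))
    print("Lasker coefficient:", lasker_coefficient(zone_a, zone_b))
    print("Lasker distance:", lasker_distance(zone_a, zone_b))
    print("Nei distance:", nei_distance(zone_a, zone_b))
```

Computing such distances for every pair of municipalities yields a distance matrix, which is the usual input for building surname regions by clustering.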