New covariates selection approaches in high dimensional or functional regression models
- Laura Freijeiro González
- Wenceslao González Manteiga Supervisor
- Manuel Febrero Bande Supervisor
University of defense: Universidade de Santiago de Compostela
Date of defense: 30 June 2023
- Rosa Elvira Lillo Rodríguez Chair
- Beatriz Pateiro López Secretary
- Christophe Ley Committee member
Type: Thesis
Abstract
In a Big Data context, the number of covariates used to explain a variable of interest, p, is likely to be high, sometimes even higher than the available sample size (p > n). Ordinary procedures for fitting regression models start to perform poorly in this situation, so other approaches are needed. A preliminary covariate selection step is of interest in order to retain only the relevant terms and to reduce the dimensionality of the problem. The purpose of this thesis is the study and development of covariate selection techniques for regression models in complex settings. In particular, we focus on high dimensional or functional data contexts of recent interest. When some model structure is assumed, regularization techniques are widely employed because they perform model estimation and covariate selection simultaneously. Specifically, an extensive and critical review of penalization techniques for covariate selection is carried out in the context of the high dimensional linear model in the vectorial framework. Conversely, if no model structure is assumed, state-of-the-art distance-based dependence measures are an attractive option for covariate selection. New specification tests using these ideas are proposed for the functional concurrent model. Both versions are considered separately: the synchronous and the asynchronous case. These approaches are based on novel dependence measures derived from the distance covariance coefficient.
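As a rough illustration of the model-free idea mentioned in the abstract, the following minimal sketch computes the classical sample distance correlation (the V-statistic form based on double-centered distance matrices) and uses it to rank scalar covariates. It is not the thesis's method: the novel dependence measures and the specification tests for the functional concurrent model are not reproduced here, and the function names `distance_correlation` and `rank_covariates` are hypothetical, chosen only for this example.

```python
import numpy as np


def _double_centered_distances(x):
    """Pairwise Euclidean distance matrix of x, double-centered."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D input as n observations of one variable
    diff = x[:, None, :] - x[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    row_mean = d.mean(axis=1, keepdims=True)
    col_mean = d.mean(axis=0, keepdims=True)
    return d - row_mean - col_mean + d.mean()


def distance_correlation(x, y):
    """Classical sample distance correlation between x and y (V-statistic)."""
    a = _double_centered_distances(x)
    b = _double_centered_distances(y)
    dcov2 = (a * b).mean()    # squared sample distance covariance
    dvar_x = (a * a).mean()   # squared sample distance variance of x
    dvar_y = (b * b).mean()   # squared sample distance variance of y
    denom = np.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0            # a constant sample carries no dependence information
    # Clamp at zero to guard against tiny negative values from rounding.
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def rank_covariates(X, y):
    """Return column indices of X sorted by decreasing distance correlation with y.

    Hypothetical helper for illustration: a simple marginal screening step,
    not a formal test with calibrated p-values.
    """
    X = np.asarray(X, dtype=float)
    scores = np.array([distance_correlation(X[:, j], y) for j in range(X.shape[1])])
    order = np.argsort(-scores)
    return order, scores
```

Because distance correlation is sensitive to nonlinear dependence, a covariate that affects the response only through, say, its square can still rank highly, which a screening rule based on Pearson correlation could miss. In practice a permutation or bootstrap calibration would be needed to turn these scores into a test.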