Towards a FAIR Dataset for Spanish Non-Functional Requirements

  1. María Isabel Limaylla Lunarejo
  2. Nelly Condori Fernandez 1
  3. Miguel R. Luaces 2
  1. Universidade de Santiago de Compostela, Santiago de Compostela, Spain. ROR https://ror.org/030eybx10
  2. Universidade da Coruña, La Coruña, Spain. ROR https://ror.org/01qckj285

Book:
VI Congreso XoveTIC: impulsando el talento científico
  1. Manuel Lagos Rodríguez (ed.)
  2. Álvaro Leitao Rodríguez (ed.)
  3. Tirso Varela Rodeiro (ed.)
  4. Javier Pereira Loureiro (coord.)
  5. Manuel Francisco González Penedo (coord.)

Publisher: Servizo de Publicacións ; Universidade da Coruña

Year of publication: 2023

Conference: XoveTIC (6. 2023. A Coruña)

Type: Conference contribution

Abstract

Supervised Machine Learning (ML) algorithms have improved the performance of automatic non-functional requirement (NFR) classification in the Requirements Engineering domain. However, the lack of public datasets, class imbalance, and reproducibility remain concerns in ML experiments. We conducted a quasi-experiment to generate a dataset of NFRs in Spanish, following the FAIR Principles. We collected 109 requirements from an open-access repository of the University of A Coruña and labeled them using the categories and subcategories of the ISO/IEC 25010 quality model. Using Fleiss' Kappa, we obtained substantial agreement (0.78) at the category level and moderate agreement (0.48) at the subcategory level.
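
As an illustration of the agreement statistic mentioned in the abstract, the following is a minimal sketch of how Fleiss' Kappa can be computed from a table of label counts. It is not the authors' code. The function name fleiss_kappa, the number of annotators (three), the category names, and the toy counts are all assumptions made for this example; the abstract does not state how many annotators took part. The verbal labels "moderate" and "substantial" are consistent with the commonly used Landis and Koch scale (0.41 to 0.60 and 0.61 to 0.80, respectively).

```python
from typing import List


def fleiss_kappa(counts: List[List[int]]) -> float:
    """Fleiss' Kappa for a table where counts[i][j] is the number of
    annotators who assigned item i to category j.

    Every item must be rated by the same number of annotators (at least 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Observed agreement: for each item, the share of annotator pairs that agree.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement: based on the overall proportion of each category.
    proportions = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in proportions)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical toy data: 4 requirements, 3 annotators, and 3 illustrative
    # ISO/IEC 25010 categories (Performance efficiency, Security, Usability).
    toy_counts = [
        [3, 0, 0],  # all three annotators chose "Performance efficiency"
        [0, 3, 0],  # all three chose "Security"
        [1, 0, 2],  # split between two categories
        [0, 1, 2],
    ]
    print(f"Fleiss' Kappa: {fleiss_kappa(toy_counts):.2f}")
```

The same function could be applied once to category-level counts and once to subcategory-level counts; with more label options, annotators tend to disagree more often, which fits the lower subcategory-level value reported in the abstract.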