Evaluating Contextualized Vectors from both Large Language Models and Compositional Strategies

  1. Gamallo Otero, Pablo
  2. García González, Marcos
  3. de-Dios-Flores, Iria
Journal: Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2022

Issue: 69

Pages: 153-164

Type: Article


Abstract

In this paper, we compare contextualized vectors derived from large language models with those generated by compositional techniques based on syntactic dependencies. To do so, we rely on a word similarity task in controlled contexts. As the experiments are oriented to the Galician language, we created a new Galician evaluation dataset for this specific semantic task. The results show that compositional vectors derived from syntactic approaches based on selectional restrictions are competitive with the contextual embeddings derived from large neural language models.
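To make the evaluation setting concrete, the sketch below (ours, not the authors' code) shows one common way to obtain a contextualized vector for a target word from a BERT-style model and to compare its two occurrences in controlled contexts with cosine similarity. The Hugging Face checkpoint id for Bertinho (a Galician BERT) and the example sentences are assumptions for illustration.

```python
# A minimal sketch of a word-similarity-in-context comparison, assuming a
# BERT-style Galician model. Not the paper's exact pipeline.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "dvilares/bertinho-gl-base-cased"  # assumed Hugging Face model id
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID).eval()

def word_vector(sentence: str, target: str) -> torch.Tensor:
    """Average the last-layer states of the subword tokens covering `target`."""
    words = sentence.split()
    target_idx = words.index(target)
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        states = model(**enc).last_hidden_state[0]  # (seq_len, hidden_dim)
    # word_ids() maps each subword position back to its source word index
    positions = [i for i, w in enumerate(enc.word_ids()) if w == target_idx]
    return states[positions].mean(dim=0)

# "banco" (bench vs. bank): different senses should score lower than
# two occurrences of the same sense in comparable contexts.
v1 = word_vector("O banco da praza estaba pintado de verde", "banco")
v2 = word_vector("O banco denegoulle o préstamo á empresa", "banco")
print(torch.cosine_similarity(v1, v2, dim=0).item())
```

The compositional alternative evaluated in the paper builds the contextualized vector instead by combining static dependency-based vectors of the target word and its syntactic heads and dependents, constrained by selectional preferences, rather than by running a neural encoder.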
