A computational psycholinguistic evaluation of the syntactic abilities of Galician BERT models at the interface of dependency resolution and training time

  1. de-Dios-Flores, Iria
  2. García González, Marcos
Journal: Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2022

Issue: 69

Pages: 15-26

Type: Article


Abstract

This work analyzes the ability of Transformer models to capture subject-verb and noun-adjective agreement dependencies in Galician. We carry out a series of word prediction experiments in which we manipulate the length of the dependency together with the presence of an intervening noun that acts as a distractor. First, we evaluate the overall performance of the existing monolingual and multilingual models for Galician. Second, to observe the effects of the training process, we compare the degrees of achievement of two monolingual BERT models at different points of training. In addition, we release their checkpoints and propose an alternative evaluation metric. Our results confirm previous findings from similar work using the agreement prediction task and provide interesting insight into the number of training steps a Transformer model needs to resolve long-distance dependencies.
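To make the evaluation setup concrete, the sketch below illustrates a minimal agreement-prediction probe with a masked language model, using the Transformers library (Wolf et al., 2020): the verb position is masked and the probabilities assigned to the number-matching and number-mismatching forms are compared. The checkpoint identifier, the example sentence, and the candidate verb forms are illustrative assumptions, not the authors' exact materials or their proposed metric.

```python
# Minimal sketch of an agreement-prediction probe for a Galician masked LM.
# Model name, sentence, and verb forms are illustrative assumptions.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL_NAME = "dvilares/bertinho-gl-base-cased"  # assumed Hugging Face identifier

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)
model.eval()


def agreement_probs(template: str, correct: str, attractor_match: str):
    """Return the MLM probability of each candidate verb form at the mask.

    Both candidates are assumed to be single tokens in the model vocabulary
    (items not meeting this condition are typically filtered out).
    """
    text = template.replace("[MASK]", tokenizer.mask_token)
    inputs = tokenizer(text, return_tensors="pt")
    mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_index]
    probs = torch.softmax(logits, dim=-1)
    ids = tokenizer.convert_tokens_to_ids([correct, attractor_match])
    return probs[ids[0]].item(), probs[ids[1]].item()


# Long subject-verb dependency with an intervening singular noun ("armario")
# that mismatches the plural subject ("as chaves"): the model should still
# assign a higher probability to the plural verb form.
p_correct, p_attractor = agreement_probs(
    "As chaves do armario [MASK] enriba da mesa.", "están", "está"
)
print(f"P(están) = {p_correct:.4f}  P(está) = {p_attractor:.4f}")
```

Under this setup, accuracy can be computed as the proportion of items in which the number-matching form receives the higher probability, and the same probe can be re-run on intermediate checkpoints to track how performance on long-distance dependencies evolves with the number of training steps.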

References

  • Agerri, R., X. Gómez Guinovart, G. Rigau, and M. A. Solla Portela. 2018. Developing new linguistic resources and tools for the Galician language. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, May. European Language Resources Association (ELRA).
  • Bernardy, J.-P. and S. Lappin. 2017. Using deep neural networks to learn syntactic agreement. In Linguistic Issues in Language Technology, Volume 15, 2017. CSLI Publications.
  • Devlin, J., M.-W. Chang, K. Lee, and K. Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1, pages 4171–4186, Minneapolis, Minnesota. Association for Computational Linguistics.
  • Garcia, M. 2021. Exploring the representation of word meanings in context: A case study on homonymy and synonymy. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 3625–3640, Online, August. Association for Computational Linguistics.
  • Garcia, M. and A. Crespo-Otero. 2022. A Targeted Assessment of the Syntactic Abilities of Transformer Models for Galician-Portuguese. In International Conference on Computational Processing of the Portuguese Language (PROPOR 2022), pages 46–56. Springer.
  • Gauthier, J., J. Hu, E. Wilcox, P. Qian, and R. Levy. 2020. SyntaxGym: An online platform for targeted evaluation of language models. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 70–76, Online, July. Association for Computational Linguistics.
  • Goldberg, Y. 2019. Assessing BERT’s Syntactic Abilities. arXiv preprint arXiv:1901.05287.
  • Gulordava, K., P. Bojanowski, E. Grave, T. Linzen, and M. Baroni. 2018. Colorless green recurrent networks dream hierarchically. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 1195–1205, New Orleans, Louisiana, June. Association for Computational Linguistics.
  • Henderson, J. 2020. The unstoppable rise of computational linguistics in deep learning. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 6294–6306, Online, July. Association for Computational Linguistics.
  • Hewitt, J. and C. D. Manning. 2019. A structural probe for finding syntax in word representations. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4129–4138, Minneapolis, Minnesota, June. Association for Computational Linguistics.
  • Kuncoro, A., C. Dyer, J. Hale, and P. Blunsom. 2018a. The perils of natural behaviour tests for unnatural models: the case of number agreement. Learning Language in Humans and in Machines, 5(6). https://osf.io/9usyt/.
  • Kuncoro, A., C. Dyer, J. Hale, D. Yogatama, S. Clark, and P. Blunsom. 2018b. LSTMs can learn syntax-sensitive dependencies well, but modeling structure makes them better. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1426–1436, Melbourne, Australia, July. Association for Computational Linguistics.
  • Lakretz, Y., G. Kruszewski, T. Desbordes, D. Hupkes, S. Dehaene, and M. Baroni. 2019. The emergence of number and syntax units in LSTM language models. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 11–20, Minneapolis, Minnesota, June. Association for Computational Linguistics.
  • Lin, Y., Y. C. Tan, and R. Frank. 2019. Open sesame: Getting inside BERT’s linguistic knowledge. In Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pages 241–253, Florence, Italy, August. Association for Computational Linguistics.
  • Linzen, T., E. Dupoux, and Y. Goldberg. 2016. Assessing the ability of LSTMs to learn syntax-sensitive dependencies. Transactions of the Association for Computational Linguistics, 4:521–535.
  • Linzen, T. and B. Leonard. 2018. Distinct patterns of syntactic agreement errors in recurrent networks and humans. In Proceedings of the 40th Annual Conference of the Cognitive Science Society. arXiv preprint arXiv:1807.06882.
  • Marvin, R. and T. Linzen. 2018. Targeted syntactic evaluation of language models. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1192–1202, Brussels, Belgium, October-November. Association for Computational Linguistics.
  • Mueller, A., G. Nicolai, P. Petrou-Zeniou, N. Talmina, and T. Linzen. 2020. Crosslinguistic syntactic evaluation of word prediction models. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 5523–5539, Online, July. Association for Computational Linguistics.
  • Newman, B., K.-S. Ang, J. Gong, and J. Hewitt. 2021. Refining targeted syntactic evaluation of language models. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 3710–3723, Online, June. Association for Computational Linguistics.
  • Pérez-Mayos, L., M. Ballesteros, and L. Wanner. 2021. How much pretraining data do language models need to learn syntax? In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 1571–1582, Online and Punta Cana, Dominican Republic, November. Association for Computational Linguistics.
  • Pérez-Mayos, L., A. Táboas García, S. Mille, and L. Wanner. 2021. Assessing the syntactic capabilities of transformer-based multilingual language models. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 3799–3812, Online, August. Association for Computational Linguistics.
  • Sellam, T., S. Yadlowsky, J. Wei, N. Saphra, A. D’Amour, T. Linzen, J. Bastings, I. Turc, J. Eisenstein, D. Das, I. Tenney, and E. Pavlick. 2022. The MultiBERTs: BERT Reproductions for Robustness Analysis. In The Tenth International Conference on Learning Representations (ICLR 2022). arXiv preprint arXiv:2106.16163.
  • Tran, K., A. Bisazza, and C. Monz. 2018. The importance of being recurrent for modeling hierarchical structure. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 4731–4736, Brussels, Belgium, October-November. Association for Computational Linguistics.
  • Vaswani, A., N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin. 2017. Attention Is All You Need. arXiv preprint arXiv:1706.03762.
  • Vilares, D., M. Garcia, and C. Gómez-Rodríguez. 2021. Bertinho: Galician BERT Representations. Procesamiento del Lenguaje Natural, 66:13–26.
  • Wei, J., D. Garrette, T. Linzen, and E. Pavlick. 2021. Frequency effects on syntactic rule learning in transformers. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 932–948, Online and Punta Cana, Dominican Republic, November. Association for Computational Linguistics.
  • Wenzek, G., M.-A. Lachaux, A. Conneau, V. Chaudhary, F. Guzmán, A. Joulin, and E. Grave. 2020. CCNet: Extracting high quality monolingual datasets from web crawl data. In Proceedings of the 12th Language Resources and Evaluation Conference, pages 4003–4012, Marseille, France, May. European Language Resources Association.
  • Wolf, T., L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz, J. Davison, S. Shleifer, P. von Platen, C. Ma, Y. Jernite, J. Plu, C. Xu, T. Le Scao, S. Gugger, M. Drame, Q. Lhoest, and A. Rush. 2020. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 38–45, Online, October. Association for Computational Linguistics.