Evaluating the Performance of Large Language Models for Spanish Language in Undergraduate Admissions Exams

Miranda, Sabino; Pichardo-Lagunas, Obdulia; Martínez-Seis, Bella; Baldi, Pierre

doi:10.13053/cys-27-4-4790


Computación y Sistemas

Online version ISSN 2007-9737; print version ISSN 1405-5546

Abstract

MIRANDA, Sabino; PICHARDO-LAGUNAS, Obdulia; MARTÍNEZ-SEIS, Bella and BALDI, Pierre. Evaluating the Performance of Large Language Models for Spanish Language in Undergraduate Admissions Exams. Comp. y Sist. [online]. 2023, vol.27, n.4, pp.1241-1248. Epub May 17, 2024. ISSN 2007-9737. https://doi.org/10.13053/cys-27-4-4790.

This study evaluates the performance of large language models, specifically GPT-3.5 and BARD (supported by the Gemini Pro model), on undergraduate admissions exams proposed by the National Polytechnic Institute in Mexico. The exams cover three areas: Engineering/Mathematical and Physical Sciences, Biological and Medical Sciences, and Social and Administrative Sciences. Both models demonstrated proficiency, exceeding the minimum acceptance scores for the respective academic programs (up to 75% for some programs). GPT-3.5 outperformed BARD in Mathematics and Physics, while BARD performed better in History and on questions related to factual information. Overall, GPT-3.5 marginally surpassed BARD, with scores of 60.94% and 60.42%, respectively.

Keywords: Large Language Models; ChatGPT; BARD; Undergraduate Admissions Exams.
