Image illustrating the kind of scenarios used in emotional intelligence tests, together with brief explanations that consider the emotional reasoning behind each response. Credit: Katja Schlegel.
Throughout the course of their lives, people can establish meaningful social connections with others, empathizing with them and sharing their experiences. People's ability to manage, perceive and understand the emotions experienced by both themselves and others is broadly known as emotional intelligence (EI).
Over the past decades, psychologists have developed numerous tests designed to measure EI, which typically assess people's ability to solve emotion-related problems they might encounter in their everyday lives. These tests can be incorporated into various psychological assessments employed in research, clinical, professional and educational settings.
Researchers at the University of Bern and the University of Geneva recently carried out a study assessing the ability of large language models (LLMs), the machine learning systems underpinning the functionality of conversational agents like ChatGPT, to solve and create EI tests. Their findings, published in Communications Psychology, suggest that LLMs can solve these tests at least as well as humans and could be promising tools for creating future psychometric EI tests.
“I’ve been researching EI for many years and developed several performance-based tests to measure people’s ability to accurately recognize, understand, and regulate emotions in themselves and others,” Katja Schlegel, first author of the paper, told Medical Xpress.
“When ChatGPT and other large language models became widely available and many of my colleagues and I began testing them in our work, it felt natural to ask: how would these models perform on the very EI tests we had created for humans? At the same time, a lively scientific debate is unfolding around whether AI can truly possess empathy—the capacity to understand, share, and respond to others’ emotions.”
EI and empathy are two closely linked concepts, as they are both related to the ability to understand the emotional experiences of others. Schlegel and her colleagues Nils R. Sommer and Marcello Mortillaro set out to explore the extent to which LLMs could solve and create emotion-related problems in EI tests, as this could also offer some indication of the level of empathy they possess.
To achieve this, they first asked six widely used LLMs to complete five EI tests that were originally designed for humans as part of psychological evaluations. The models they tested were ChatGPT-4, ChatGPT-o1, Gemini 1.5 Flash, Copilot 365, Claude 3.5 Haiku and DeepSeek V3.
“The EI tests we used present short emotional scenarios and ask for the most emotionally intelligent response, such as identifying what someone is likely feeling or how best to manage an emotional situation,” explained Schlegel. “We then compared the models’ scores to human averages from previous studies.”
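As a rough illustration of what that kind of scoring involves, the Python sketch below checks answers to multiple-choice scenario items against a scoring key and compares the resulting accuracy with a human baseline. The example item, the answers and all names in the code are hypothetical and are not taken from the study; only the 56% human average comes from the article.

```python
# Illustrative sketch (not the authors' code): scoring answers to
# multiple-choice EI items against a key and comparing to a human baseline.

from dataclasses import dataclass

@dataclass
class EIItem:
    scenario: str        # short emotional scenario shown to the model
    options: list[str]   # candidate responses, labeled A, B, ...
    correct: str         # letter of the keyed "most emotionally intelligent" option

def score_responses(items: list[EIItem], answers: list[str]) -> float:
    """Return the proportion of items answered with the keyed option."""
    hits = sum(ans.strip().upper() == item.correct
               for item, ans in zip(items, answers))
    return hits / len(items)

# Hypothetical item and model answer, purely for illustration.
items = [
    EIItem(
        scenario="A colleague learns their project was cancelled after months of work.",
        options=["A) Tell them to move on",
                 "B) Acknowledge their disappointment and ask how you can help"],
        correct="B",
    ),
]
model_answers = ["B"]

llm_accuracy = score_responses(items, model_answers)
human_average = 0.56  # average human accuracy reported in the article
print(f"LLM accuracy: {llm_accuracy:.0%} vs. human average: {human_average:.0%}")
```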
Image showing the percentage of correct responses across the five EI tests for each of the tested LLMs. Credit: Katja Schlegel.
In the second part of their experiment, the researchers asked ChatGPT-4, one of the most recent versions of ChatGPT released to the public, to create entirely new versions of the EI tests used in their experiments. These tests were to include different emotional scenarios, questions and answer options while also specifying the correct responses to the questions.
“We then gave both the original and AI-generated tests to over 460 human participants to see how both versions compared in terms of difficulty, clarity, realism, and how well they correlated with other EI tests and a measure of traditional cognitive intelligence,” said Schlegel.
“This allowed us to test not just whether LLMs can solve EI tests, but whether they can reason about emotions deeply enough to build valid tests themselves, which we believe is an important step toward applying such reasoning in more open-ended, real-world settings.”
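To give a concrete, if simplified, picture of that kind of psychometric comparison, the sketch below uses simulated participant scores to compare an original test with a generated parallel version on difficulty (mean proportion correct) and on how strongly the two sets of scores correlate. All numbers are invented for illustration and do not reproduce the study's data or analyses.

```python
# Illustrative sketch (hypothetical data): comparing an original EI test and an
# AI-generated parallel version on difficulty and on score correlation across
# participants, as one facet of psychometric equivalence.

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Hypothetical per-participant proportion-correct scores on each test version.
original_scores = rng.normal(loc=0.56, scale=0.12, size=460).clip(0, 1)
generated_scores = (0.7 * original_scores
                    + rng.normal(loc=0.17, scale=0.08, size=460)).clip(0, 1)

# Difficulty: mean proportion correct per version (higher = easier).
print(f"Original test mean score:  {original_scores.mean():.2f}")
print(f"Generated test mean score: {generated_scores.mean():.2f}")

# Convergence: correlation between the two versions across participants.
r, p = pearsonr(original_scores, generated_scores)
print(f"Correlation between versions: r = {r:.2f} (p = {p:.3g})")
```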
Notably, Schlegel and her colleagues found that the LLMs they tested performed very well on all EI tests, achieving an average accuracy of 81%, which is higher than the average accuracy achieved by human respondents (56%). Their results suggest that current LLMs are already remarkably good at understanding what people might feel in different contexts, at least when it comes to structured situations like those outlined in EI tests.
“Even more impressively, ChatGPT-4 was able to generate entirely new EI test items that were rated by human participants as similarly clear and realistic as the original items and showed comparable psychometric quality,” said Schlegel. “In our view, the ability to both solve and construct such tests reflects a high level of conceptual understanding of emotions.”
The results of this recent study could encourage psychologists to use LLMs to develop EI tests and training materials, which are currently created manually and can be fairly time-consuming. In addition, they could encourage the use of LLMs for generating tailored role-play scenarios and other content for training social workers.
“Our findings are also relevant for the development of social agents such as mental health chatbots, educational tutors, and customer service avatars, which often operate in emotionally sensitive contexts where understanding human emotions is essential,” added Schlegel.
“Our results suggest that LLMs, at the very least, can emulate the emotional reasoning skills that serve as a prerequisite for such interactions. In our next studies, we plan to test how well LLMs perform in less structured, real-life emotional conversations beyond the controlled format of test items. We also want to explore how culturally sensitive their emotional reasoning is since current models are primarily trained on Western-centric data.”
More information:
Katja Schlegel et al, Large language models are proficient in solving and creating emotional intelligence tests, Communications Psychology (2025). DOI: 10.1038/s44271-025-00258-x.
© 2025 Science X Network
Citation:
Large language models excel at creating and solving emotional intelligence tests, study finds (2025, June 4)
retrieved 4 June 2025
from https://medicalxpress.com/news/2025-06-large-language-excel-emotional-intelligence.html
This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.