Another score for artificial intelligence. In a recent study, 151 human volunteers were pitted against ChatGPT-4 in three tests designed to measure divergent thinking, which is considered an indicator of creative thought.
Divergent thinking is defined as the ability to develop a unique response to a question with no expected answer, such as “What is the best way to avoid talking about politics with my parents?” In the study, GPT-4 produced more unique and elaborate responses than human volunteers.
The study, “The current state of artificial intelligence generative language models is more creative than humans on divergent thinking tasks,” was published in Scientific Reports and authored by U of A Ph.D. students in psychological science Kent F. Hubert and Kim N. Awa, as well as Darya L. Zabelina, an assistant professor of psychological science at the U of A and director of the Mechanisms of Creative Cognition and Attention Lab.
The three tests used were the Alternative Use Task, which asks participants to come up with creative uses for everyday objects like a rope or a fork; the Consequences Task, which asks participants to imagine possible outcomes of hypothetical situations, such as “what if humans no longer needed sleep?”; and the Divergent Associations Task, which asks participants to generate ten nouns that are as semantically distant as possible. For example, there is little semantic distance between “dog” and “cat,” but a great deal between words like “cat” and “ontology.”
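The article does not spell out how semantic distance was scored, but a common approach for tasks like the Divergent Associations Task is to represent each word as a vector and take the cosine distance between vectors, averaged over all word pairs. The sketch below illustrates that idea with made-up toy vectors; the embeddings, vocabulary, and scale are illustrative assumptions, not the study’s actual procedure.

```python
from itertools import combinations
import math

# Toy word vectors for illustration only; a real scorer would use pretrained
# embeddings (e.g., GloVe or word2vec) with hundreds of dimensions.
TOY_EMBEDDINGS = {
    "dog":      [0.90, 0.80, 0.10],
    "cat":      [0.85, 0.82, 0.12],
    "ontology": [0.05, 0.10, 0.95],
}

def cosine_distance(u, v):
    """1 minus cosine similarity: near 0 for similar words, larger for unrelated ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def average_pairwise_distance(words, embeddings=TOY_EMBEDDINGS):
    """Mean cosine distance over all word pairs: higher means the word set
    is more semantically spread out, i.e., more 'divergent'."""
    vectors = [embeddings[w] for w in words]
    pairs = list(combinations(vectors, 2))
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)

if __name__ == "__main__":
    # Closely related words score low; unrelated words score high.
    print(average_pairwise_distance(["dog", "cat"]))       # ~0.001, very close
    print(average_pairwise_distance(["cat", "ontology"]))  # ~0.79, far apart
```

A real divergent-thinking scorer would add steps such as spell-checking the responses and rescaling the averaged distance, but the core measure is this pairwise comparison of word vectors.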
Answers were evaluated on the number of responses, response length, and the semantic distance between terms. Ultimately, the authors found that, “Overall, GPT-4 was more original and elaborate than humans on each of the divergent thinking tasks, even after controlling for response fluency.” In other words, GPT-4 demonstrated greater creative potential across an entire battery of divergent thinking tasks.
There are some caveats to this finding. The researchers write: “It is important to note that the measures used in this study are all measures of creative potential, but the involvement in creative activities or achievements are another aspect of measuring a person’s creativity.” The study’s goal was to examine human-level creative potential, not necessarily people with established creative credentials.
Hubert and Awa further point out that “AI, unlike humans, does not have agency” and is “dependent on the assistance of a human user.” As a result, unless prompted, AI’s creative potential remains untapped.
Furthermore, the researchers did not assess the appropriateness of GPT-4’s responses. So while the AI may have produced more responses, and more original ones, human participants may have felt constrained by a sense that their answers needed to be grounded in the real world.
Awa also acknowledged that the human motivation to write elaborate answers may not have been high, and said there are additional questions about “how do you operationalize creativity? Can we really say that using these tests for humans is generalizable to different people? Is it assessing a broad array of creative thinking? So I think it has us critically examining what are the most popular measures of divergent thinking.”
Whether the tests are perfect measures of human creative potential is not really the point. The takeaway is that large language models are evolving rapidly and outperforming humans in ways they have not before. Whether they pose a threat to human creativity remains to be seen. For now, the authors say, “Moving forward, future possibilities of AI acting as a tool of inspiration, as an aid in a person’s creative process or to overcome fixedness is promising.”