Presentation

Benchmarking Economic Reasoning in Artificial Intelligence Models
Description
A theory-informed test of reasoning in artificial intelligence (AI) combines three sequential steps to establish that correct answers result from a reasoning process rather than from luck or probabilistic word matching. The first step is information filtering: an AI model that reasons must distinguish the relevant information in a prompt from trivia. In the second step, knowledge association, the AI combines implicit or explicit knowledge with the relevant prompt information. In the third step, logic attribution, a reasoning AI applies the correct logical operations, whether deductive, inductive, or of another type, to arrive at the correct answer. In economic settings, the logic steps involve different levels of counterfactual consideration and policy-relevant thought experiments. This paper leverages insights from the large language model benchmarking literature and the social economics literature to inform the design of benchmarking tests that are challenging, robust, evolving over time, and informative about any type of reasoning shortcoming. The benchmarking process can be adapted to other sciences. An accompanying training dataset is available to help AI developers improve reasoning in their models, and interested users can submit proposals for material to create questions.
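The three-step structure lends itself to a per-step scoring rubric. The following is a minimal sketch, not the authors' implementation: the item schema, the function name score_reasoning, and the substring-matching heuristic for checking a model's rationale are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkItem:
    """One theory-informed test item (hypothetical schema)."""
    prompt: str                    # full prompt, including distractor trivia
    relevant_facts: list[str]      # step 1: information to filter out of the prompt
    required_knowledge: list[str]  # step 2: implicit/explicit knowledge to associate
    logic_type: str                # step 3: e.g. "deductive", "inductive", "counterfactual"
    correct_answer: str


def score_reasoning(item: BenchmarkItem, model_answer: str,
                    model_rationale: str) -> dict[str, bool]:
    """Grade each step separately (crude substring check as a placeholder
    for a real judge), so a correct final answer reached without the
    intermediate steps can be flagged as likely luck or word matching."""
    rationale = model_rationale.lower()
    filtered = all(f.lower() in rationale for f in item.relevant_facts)
    associated = all(k.lower() in rationale for k in item.required_knowledge)
    correct = model_answer.strip().lower() == item.correct_answer.strip().lower()
    return {
        "information_filtering": filtered,
        "knowledge_association": associated,
        "logic_attribution": correct and filtered and associated,
    }
```

Scoring each step separately means a model that guesses the right answer without citing the relevant facts or knowledge still fails the logic-attribution check, which is the distinction the test design is after.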
Time
Wednesday, June 5, 11:30 - 12:00 CEST
Location
HG D 1.2
Event Type
Minisymposium
Domains
Applied Social Sciences and Humanities
Computational Methods and Applied Mathematics