AI Psychometrics: Evaluating the Psychological Reasoning of Large Language Models with Psychometric Validities
arXiv cs.AI / 3/13/2026
Key Points
- AI Psychometrics applies psychometric validity frameworks to evaluate the psychological reasoning of large language models, proposing a systematic evaluation approach.
- The study assesses GPT-3.5, GPT-4, LLaMA-2, and LLaMA-3 using the Technology Acceptance Model (TAM) to test convergent, discriminant, predictive, and external validity.
- All four models meet the validity criteria, with GPT-4 and LLaMA-3 showing higher psychometric validity than GPT-3.5 and LLaMA-2.
- The findings support the viability of applying AI Psychometrics to interpret LLMs and enable cross-model comparisons of psychological traits.
- The work contributes to AI evaluation methodology by linking model performance with psychometric validity, suggesting new directions for model assessment.
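To make the validity criteria above concrete, here is a minimal sketch of the correlation-based logic behind convergent and discriminant validity checks. All data and item names (TAM-style "perceived usefulness" and "perceived ease of use" indicators) are simulated for illustration; this is not the paper's actual evaluation code or dataset.

```python
import numpy as np

# Hypothetical sketch of convergent/discriminant validity checks on
# TAM-style questionnaire items. Responses are simulated; in the paper's
# setting they would come from repeated LLM queries.

rng = np.random.default_rng(0)
n = 200  # number of simulated respondents (e.g., repeated model queries)

# Latent construct scores; TAM posits the constructs are related but distinct.
usefulness = rng.normal(size=n)
ease_of_use = 0.3 * usefulness + rng.normal(size=n)

# Each construct is measured by two noisy Likert-style items.
items = {
    "PU1": usefulness + 0.4 * rng.normal(size=n),
    "PU2": usefulness + 0.4 * rng.normal(size=n),
    "PEOU1": ease_of_use + 0.4 * rng.normal(size=n),
    "PEOU2": ease_of_use + 0.4 * rng.normal(size=n),
}

def corr(a: str, b: str) -> float:
    """Pearson correlation between two named items."""
    return float(np.corrcoef(items[a], items[b])[0, 1])

# Convergent validity: items of the same construct should correlate highly.
convergent = corr("PU1", "PU2")
# Discriminant validity: items of different constructs should correlate less.
discriminant = corr("PU1", "PEOU1")

print(f"convergent r = {convergent:.2f}, discriminant r = {discriminant:.2f}")
```

A model "passes" this basic pattern when the convergent correlation exceeds the discriminant one; the paper's full evaluation additionally covers predictive and external validity against TAM's predicted usage outcomes.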