This exploratory study applies generative AI-based automated assessment to public library performance evaluation and examines its feasibility for adoption. To this end, we compared the evaluation results produced by a human expert in library and information science with those produced by a generative AI system. The comparison focused on the four domains of the current evaluation indicators that are scored by humans on the basis of submitted documents (space, collaboration, management planning, and best practices) and examined how reliability changed under different prompt-engineering techniques. Using ChatGPT 5.1, we conducted automated evaluations of the documents submitted by 164 public libraries in Seoul for the 2024 public library performance evaluation. The results indicated that for domains with relatively simple content and clearly defined rating scales (space, collaboration, and management planning), agreement between expert and AI scores was high. In contrast, in the best practices domain, which requires qualitative judgment, the discrepancy between expert and AI results was substantial. Furthermore, reliability between expert and AI scores was highest under the condition that combined Task Information (TI) prompts, which present the information required for evaluation in a structured form, with Demonstration Information (DI) prompts, which supply illustrative examples. In particular, in the qualitative assessment domain, reliability improved significantly when DI prompts were added.
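To illustrate the distinction between the two prompt types, the sketch below shows one possible way a combined TI + DI prompt could be assembled. It is only an illustration under assumed conditions: the indicator wording, rating scale, example document, JSON output format, and the function name build_prompt are hypothetical placeholders and are not the prompts or indicators actually used in the study.

```python
# Hypothetical sketch of the two prompt components described above.
# All rubric text, examples, and names are placeholders, not the study's materials.

# Task Information (TI): a structured statement of the evaluation task,
# the rating scale, and the evidence the model may use.
TASK_INFO = """You are scoring a public library's submitted documents
for the 'management planning' indicator.
Rating scale: 1 (inadequate) to 5 (excellent).
Base the score only on the evidence in the document below.
Return JSON: {"score": <1-5>, "rationale": "<one sentence>"}."""

# Demonstration Information (DI): a worked example (few-shot demonstration)
# showing how a document maps to a score and rationale.
DEMONSTRATION = """Example document: "The library set three annual goals,
allocated a budget to each, and reported quarterly progress."
Example answer: {"score": 4, "rationale": "Goals, budget, and monitoring are
documented, but outcomes are not yet evaluated."}"""


def build_prompt(submitted_document: str, use_demonstration: bool = True) -> str:
    """Compose the evaluation prompt: TI only, or TI combined with DI."""
    parts = [TASK_INFO]
    if use_demonstration:
        parts.append(DEMONSTRATION)
    parts.append(f"Document to evaluate:\n{submitted_document}")
    return "\n\n".join(parts)


if __name__ == "__main__":
    doc = "The library drafted a three-year plan but reported no progress reviews."
    # The composed string would then be sent to the chosen language model.
    print(build_prompt(doc, use_demonstration=True))
```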
