LLM-Based Content Analysis of Sampling Methodology in Library and Information Science Research: A Cross-Model Comparison of Coding Performance by Task Type

Min Sein; 민세인; Kim Eungi; 김은기

doi:10.16981/kliss.57.1.202603.413

P-ISSN: 2466-2542
KCI

Vol.57 No.1

문헌정보학 연구에서의 표집 방법론에 대한 대규모 언어모델 기반 내용 분석- 과업 유형에 따른 모델 간 코딩 수행 비교 -

Journal of Korean Library and Information Science Society (한국도서관·정보학회지), P-ISSN 2466-2542

2026, v.57 no.1, pp.413-438

https://doi.org/10.16981/kliss.57.1.202603.413

Sein Min (민세인), Keimyung University
Eungi Kim (김은기), Keimyung University

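Min, S., & Kim, E. (2026). LLM-Based Content Analysis of Sampling Methodology in Library and Information Science Research: A Cross-Model Comparison of Coding Performance by Task Type. Journal of Korean Library and Information Science Society, 57(1), 413-438. https://doi.org/10.16981/kliss.57.1.202603.413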

Abstract

This study examines, dimension by dimension, the conditions under which large language model (LLM)-based content analysis can be applied to different task types in the analysis of research methods in library and information science. A total of 100 survey and interview studies published between 2020 and 2024 in four major Korean library and information science journals were selected by stratified random sampling. The codes assigned by one human coder and by four LLMs (Claude-3.5-Haiku, GPT-4o-Mini, Gemini-2.0-Flash, and Grok-4-Latest) were then compared across the twelve dimensions that make up sampling methodology. Agreement was relatively high in dimensions that could be classified using explicit criteria and consistently lower in dimensions requiring inferential or evaluative judgment. These findings suggest that the performance of LLM-based automated coding depends more on the judgment structure of the task and the explicitness of the available information than on model capability itself. The scope of LLM use should therefore be examined more closely in terms of task type and judgment characteristics, and human-AI hybrid validation strategies need to be designed systematically.

keywords: Large Language Models, Automated Content Analysis, Coding Reliability, Library and Information Science Research Methods, Sampling Methodology
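
The abstract compares human and LLM codes dimension by dimension but does not name the agreement statistic used. As a minimal sketch only, the following assumes two common choices, simple percent agreement and Cohen's kappa, and applies them per dimension. The dimension names (`sampling_method`, `sample_adequacy`), the category labels, and the toy codes are hypothetical illustrations, not data or variable names from the study.

```python
from collections import Counter


def percent_agreement(human, model):
    """Share of items on which the two coders assigned the same code."""
    if len(human) != len(model) or not human:
        raise ValueError("code lists must be non-empty and equally long")
    return sum(h == m for h, m in zip(human, model)) / len(human)


def cohens_kappa(human, model):
    """Chance-corrected agreement between two coders on nominal codes."""
    n = len(human)
    p_o = percent_agreement(human, model)
    h_counts, m_counts = Counter(human), Counter(model)
    # Expected agreement if both coders assigned codes independently
    # with their observed category frequencies.
    p_e = sum(h_counts[c] * m_counts[c] for c in h_counts) / (n * n)
    if p_e == 1.0:
        return 1.0  # both coders used one and the same category throughout
    return (p_o - p_e) / (1 - p_e)


def agreement_by_dimension(human_codes, model_codes):
    """Map each dimension name to (percent agreement, Cohen's kappa)."""
    return {
        dim: (
            percent_agreement(human_codes[dim], model_codes[dim]),
            cohens_kappa(human_codes[dim], model_codes[dim]),
        )
        for dim in human_codes
    }


# Toy, made-up codes for two hypothetical dimensions (not the study's data).
human = {
    "sampling_method": ["convenience", "random", "purposive", "random"],
    "sample_adequacy": ["adequate", "inadequate", "adequate", "inadequate"],
}
model = {
    "sampling_method": ["convenience", "random", "purposive", "random"],
    "sample_adequacy": ["adequate", "adequate", "inadequate", "inadequate"],
}

results = agreement_by_dimension(human, model)
for dim, (pa, kappa) in results.items():
    print(f"{dim}: percent agreement = {pa:.2f}, kappa = {kappa:.2f}")
```

Reporting a chance-corrected statistic next to raw agreement matters most for dimensions with few categories, where two coders can match on half the items by chance alone, as in the second toy dimension.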

Received: 2026-02-21

Accepted: 2026-03-07

Published: 2026-03-30
