  • P-ISSN: 2287-9099
  • E-ISSN: 2287-4577
  • Indexed in: SCOPUS, KCI

When Does Data Integration Enhance Predictive Performance? An Empirical Analysis of Open Government Data

JOURNAL OF INFORMATION SCIENCE THEORY AND PRACTICE, (P)2287-9099; (E)2287-4577
2025, v.13 no.4, pp.83-99
https://doi.org/10.1633/JISTaP.2025.13.4.6
Junyoung Jeong (National Information Society Agency, Daegu, Korea)

Abstract

While open government data (OGD) is increasingly recognized as a critical resource for economic growth and data-driven innovation, methods for proactively evaluating the potential utilization of these datasets remain underdeveloped. This study addresses this gap by investigating two key methodological questions: first, whether automated machine learning (AutoML) is an appropriate tool for measuring and evaluating OGD utilization, and second, how the composition of training data affects the performance of models designed to predict such utilization. This research specifically compares the efficacy of two distinct data strategies: models trained on integrated datasets spanning multiple domains versus those trained on domain-specific datasets. Using metadata from the South Korean government’s extensive OGD portal, this study employs AutoML to systematically build and evaluate predictive models under these different training conditions. The findings reveal that the training data strategy is a critical determinant of predictive accuracy, with the integrated-domain approach frequently yielding superior performance over domain-specific models. This research provides empirical evidence on the impact of data integration strategies in this context and establishes a methodological framework for the prospective assessment of OGD value, offering a more robust alternative to traditional retrospective evaluation metrics.

Keywords
data integration strategy, prediction model, predictive performance, automated machine learning, open government data, open government data utilization

Received: 2025-07-10
Revised: 2025-10-20
Accepted: 2025-10-31
Published: 2025-12-30
