ISSN : 1229-2435
Author Name Disambiguation (AND) is a critical task in scholarly information systems; however, the applicability of the English-centric OpenAlex model to the Korean academic ecosystem has yet to be fully validated. This study evaluates the OpenAlex model’s performance on 54,049 papers (2023-2024) from KISTI’s OCEAN database and optimizes seven features tailored to Korean linguistic characteristics. In stepwise experiments, the F1-score improved from 0.852 (v1-1) to 0.860 (v2-2), and after ground-truth refinement the model reached an accuracy of 0.930 and an F1-score of 0.931. Cross-validation against ORCID yielded an F1-score of 0.892, supporting the model’s reliability. We also propose an optimization process that combines incremental processing with manual verification to manage large-scale data efficiently. Finally, the study validates a pipeline that clusters 183,105 author records into 109,205 unique author identifiers, demonstrating its practical feasibility and scalability for Korean scholarly metadata.