ISSN : 1229-2435
Assessing the credibility of online health information has become increasingly complex as the volume of user-generated content (UGC) grows. This study investigates predictive modeling of credibility on two distinct types of UGC platforms, Yahoo! Answers and Yelp, examining the impact of feature categories and the role of assessors' prior knowledge. A total of 2,000 labeled instances were collected through crowdsourcing, using a rigorously validated credibility instrument and qualification process. Eighty-four features were developed and grouped into categories informed by the Elaboration Likelihood Model (ELM), and feature ablation studies were conducted independently on each dataset. Results indicate that content informativeness was the most discriminative factor for Yahoo! Answers, whereas both sentiment and content informativeness were significant for Yelp. Notably, prior knowledge had a platform-dependent effect: it reduced model performance on Yahoo! Answers, likely due to assessor overconfidence and limited domain expertise, but improved performance on Yelp, where lived experience aligned with the subjective nature of the content. These findings underscore the importance of tailoring credibility assessments and feature sets to the platform type and the nature of its content.
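The category-level feature ablation described above can be sketched as follows. This is a minimal illustration only: the category names, feature names, and the `evaluate` function are hypothetical placeholders, not the study's actual 84-feature set or models; the study's ELM-informed categories would take their place.

```python
# Hypothetical feature categories loosely inspired by the ELM-based grouping.
# These names are illustrative assumptions, not the paper's actual features.
FEATURE_CATEGORIES = {
    "content_informativeness": ["word_count", "num_facts"],
    "sentiment": ["polarity", "subjectivity"],
    "surface": ["num_typos", "readability"],
}

def ablation_scores(evaluate, categories):
    """Score the model with each feature category removed in turn.

    `evaluate` maps a list of feature names to a performance score
    (e.g., cross-validated accuracy). A larger drop relative to the
    full feature set marks a more discriminative category.
    """
    all_feats = [f for feats in categories.values() for f in feats]
    baseline = evaluate(all_feats)
    drops = {}
    for name, feats in categories.items():
        kept = [f for f in all_feats if f not in feats]
        drops[name] = baseline - evaluate(kept)
    return baseline, drops

# Toy evaluator: score is just the fraction of features kept,
# standing in for a real train-and-validate pipeline.
toy_eval = lambda feats: len(feats) / 6
baseline, drops = ablation_scores(toy_eval, FEATURE_CATEGORIES)
```

Run once per platform's dataset, this yields a per-category performance drop, which is how a category such as content informativeness would surface as the most discriminative factor.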
