This dataset is derived from AMiner's publicly available disambiguation dataset (https://open.aminer.cn/article?id=55af4228dabfae1ce3ed1253). It is used to explore how the characteristics of research collaborations affect the novelty and impact of knowledge outcomes.