Contributor Identification Is a Core Challenge in Data Publication

Gudmundur Thorisson

University of Iceland

As in scholarly communication more generally, non-unique personal names and the current lack of a global identification infrastructure for producers of scholarly content make it difficult to establish the identity of authors and other contributors. This in turn makes it difficult to accurately attribute datasets published via online digital repositories to their creators, which is one of several key requirements for including these important outputs in the scholarly record.

Recently launched international initiatives will soon make it possible to start tackling these challenges in earnest at a global scale. An emerging global framework for data citation (DataCite: http://www.datacite.org) and contributor identification (ORCID: http://www.orcid.org) will provide the technical infrastructure needed to enable any research output to be discovered, cited in a scholarly context and unambiguously attributed, thus enabling data creators to be recognized and rewarded for sharing these outputs. Along with other measures, such an incentive-based approach will be key to motivating the sharing of data and other types of digital research outputs, especially in the life sciences, where sharing remains a perennial problem.

The presentation will introduce some of the key concepts, initiatives, opportunities and challenges relating to contributor identification and attribution, and will also summarize related informatics work by our group and our collaborators in the GEN2PHEN project (http://www.gen2phen.org), which focuses on online dissemination of genetic variation and other research data.