Text Mining meets Crowd Sourcing: author disambiguation in High-Energy Physics

Salvatore Mele
Head of Open Access, CERN

Author disambiguation is a systemic issue in scholarly publishing, and high-energy physics is an interesting example of the challenges of managing authors: some recent articles from the LHC project have over 2,500 authors (yes, you read that right, two thousand five hundred).

A hybrid experiment in author disambiguation in the field, combining text mining and crowd-sourcing, has been running in INSPIRE for the past year.