New paper: Overton: A Bibliometric database of policy document citations

For the past few months, I have been working with the Overton team to understand the makeup of the materials indexed in the Overton database and the way in which they cite academic literature. I’m pleased to announce that the results of this analysis have been published on arXiv this week and also featured in Times Higher Education.

The findings are promising, showing that Overton indexes a sufficiently large quantity of data to make citation indicators practical, with especially good coverage in areas of social science such as economics, political science, sociology, and education.

As someone with a keen interest in citation data and metrics, I have been itching to get my hands on the Overton data since I first heard about it from Euan back in 2019. It promises new opportunities to understand how research is utilised outside of academia and ties in nicely with the contemporary research evaluation focus on impact.

Overton has done a great job tackling the onerous task of wrangling out references to research from a variety of publicly available reports, whitepapers, guidelines and other grey-literature content, creating a rich citation network that offers novel analytical opportunities. Overton indexes data from more than 30,000 international sources, collecting more than 5 million documents that contain over 14 million references.

Our paper, “Overton: A Bibliometric database of policy document citations”, is a high-level survey of the bibliometric potential of the policy citation database, answering basic questions relating to the variety of documents and publication sources indexed and the network of citations that is extracted. Key aspects of the analysis include:

  • What is the makeup of the database in terms of sources indexed by geography, language, type and year of publication?
  • How many scholarly references are extracted and over what time period?
  • How long does it take research articles to accumulate policy citations and how does this vary across disciplines?
  • What is the time-lag between the publication of scholarly works and their citation within policy literature and how does this vary between disciplines?
  • What statistical distribution best models policy citation counts to research articles? 
  • How feasible is field-based citation normalization?
  • Do the citations tracked in the policy literature correlate with policy influence outcomes attributed to funded grants?
  • Does the amount of policy citation correlate with peer-review assessment scores as reported in the UK REF2014 impact case study data?
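
To make the field-normalization question a little more concrete, here is a minimal sketch of one common approach: divide each article's policy citation count by the mean count for articles in the same field and publication year. This is an illustration only and not necessarily the method used in the paper. The record layout, field names and numbers below are invented for the example and do not come from Overton.

```python
from collections import defaultdict
from statistics import mean


def field_normalize(records):
    """Return {article id: policy citations / mean for its field and year}.

    `records` is a list of dicts with the hypothetical keys
    'id', 'field', 'year' and 'policy_citations'.
    """
    # Group citation counts by (field, year) to build the baselines.
    groups = defaultdict(list)
    for r in records:
        groups[(r["field"], r["year"])].append(r["policy_citations"])
    baselines = {key: mean(counts) for key, counts in groups.items()}

    scores = {}
    for r in records:
        base = baselines[(r["field"], r["year"])]
        # A baseline of zero means no article in the group was cited,
        # so the ratio is undefined; report None rather than divide by zero.
        scores[r["id"]] = r["policy_citations"] / base if base > 0 else None
    return scores


# Made-up illustrative data, not real Overton figures.
records = [
    {"id": "a", "field": "economics", "year": 2015, "policy_citations": 6},
    {"id": "b", "field": "economics", "year": 2015, "policy_citations": 2},
    {"id": "c", "field": "physics", "year": 2015, "policy_citations": 1},
    {"id": "d", "field": "physics", "year": 2015, "policy_citations": 0},
]
scores = field_normalize(records)
print(scores)
```

In this toy data, a physics article with a single policy citation scores higher than an economics article with six, because policy citations are much rarer in its field. Whether such baselines are stable enough to be useful depends on how many cited articles each field contains, which is exactly the feasibility question above.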

One important aspect of the paper is a comparison with data collected via ResearchFish. Since 2014, all UK funders have asked academics to report the variety of outcomes associated with funded grants, such as spin-out companies, intellectual property, public engagement and, crucially, policy influence. It is therefore possible to test whether this self-reported data broadly agrees with the data collected automatically by Overton. The agreement is good in some research areas, suggesting that Overton may become a valuable resource in future research evaluation systems.

This is a really exciting development in the research impact space that I believe has the potential to enhance the visibility of the academic-policy interface.

We hope you find it interesting and we’d love to know what you think! @overtonio @martinszomsor 

