Behind the numbers: using the Overton data for new rankings

We’ve got complicated feelings about rankings using policy citation data. They can be interesting and sometimes genuinely useful if presented correctly, but have a mixed reputation for good reason. In this blog post we explore that context and share some general rules of thumb to bear in mind when using Overton data in this way.

We get asked a lot about rankings - of people, papers and policy documents - sometimes by people who already have expertise in those areas but often by interested academics or our users at NGOs, think tanks or government agencies.

We’ve got complicated feelings about rankings. They can be genuinely interesting or useful (we’ve posted a few) but can also be easy to misinterpret or misuse.

To explain why, it's helpful to first lay out some context - remembering that Overton originally came out of the academic metrics world.

University rankings in academia

With more and more people attending university, rankings seem like an attractive way for prospective students - and faculty & funders - to differentiate between institutions. The THE, QS and Shanghai rankings are all popular, and if you're an academic you've probably been indirectly praised (or unfairly criticised) because of one.

In these large, complicated ranking systems university performance is typically rated across several different indicators - some relating to teaching and student experience, others to academic outputs. Research is inevitably a key criterion: an institution's scholarly output might be evaluated on its volume, its 'quality' (sometimes questionably), its associated funding, or its impact - judged by citation counts or, more recently, broader social influence.

Many see the rankings as an important tool, arming people with more information about a university so they can make more informed choices. Be that as it may, universities end up unduly fixated upon them, conveniently ignoring their shortcomings when it suits, and this is widely criticised by experts: Lizzie Gadd has a great piece on why institutions that support responsible metrics should think carefully about how they use rankings data (it's also worth checking out Lizzie's follow-up work on the More Than Our Rank initiative).

Critics also worry that the focus on things like league tables and benchmarks distracts from the social mission of universities. Others are concerned that rankings re-entrench hierarchies and make it more difficult for smaller, newer or poorer universities to succeed, as funding tends to concentrate in the 'top' universities.

Using the Overton data in rankings

All this means that we're keen not to go down the same path in the policy world. Of course the stakes are different: there aren't millions of students using policy benchmarks to decide where to spend their $. But we can still learn from where academic rankings go wrong.

We generally try to stay away from drawing overly broad conclusions, directly comparing institutions or (especially) academics, or saying that X person or university is the 'most impactful'. That said, we know that some of our users do use our data to benchmark or analyse performance against others, or to create rankings and lists that help them prioritise analyses - and that there's an increasing need for data to help demonstrate broader impacts, including on policy.

So we want policy data to be used wherever it’s helpful, including in rankings, but for it to be used sensibly… and responsibly.

Without going into too much detail about responsible metrics, a good starting point if you're thinking of producing rankings (or benchmarks) with the Overton data is the Leiden Manifesto.

But we’ve also got three general rules that are worth bearing in mind. Ranked, of course, in order of importance.

  1. Understand what the Overton data is actually saying

Remember that when counting policy citations or mentions we’re measuring proxies, and that “impact” is an elusive thing (check out our blog interview with Prof Doro Baumann-Pauly for some really interesting reflections). 

Treat the data as indicators, not metrics. Generally speaking, the larger the numbers, the safer you are. A single document cited in policy may or may not be meaningful (was it cited as background? Accidentally? Or was it critical to final legislation?), but thousands of citing documents probably are.
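
To make this concrete, here's a rough sketch in Python of what that caution might look like in practice. The data shape and the threshold are our own illustrations, not part of Overton's schema - it just assumes you have a policy citation count per research output.

    # A rough sketch: only compare outputs whose policy citation counts are
    # large enough to mean something. The threshold is arbitrary and the data
    # shape is hypothetical - adapt both to your own export.
    MIN_CITATIONS = 100

    counts = {"output-a": 3, "output-b": 250, "output-c": 1}

    # Keep only outputs with enough citations to compare meaningfully
    comparable = {k: v for k, v in counts.items() if v >= MIN_CITATIONS}
    print(comparable)  # {'output-b': 250} - the low-count outputs tell you very little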

  2. Know your sources

Overton has a broad definition of what a policy document is: something written by or primarily for a policy maker. This covers things like policy briefs from think tanks, technical reports from arm's-length agencies and working papers from central banks, as well as the government documents that you might expect.

That’s because evidence meets policy in all sorts of surprising ways and through all sorts of routes and intermediaries, and we want to try and capture as many of them as possible.

Depending on what you’re trying to achieve, though, you might want to include or remove different portions of the database. In the Overton app we allow you to exclude working papers or clinical guidelines from your results, for example.
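
If you're working with exported data rather than the app, the same filtering is easy to reproduce yourself. Here's a minimal sketch, assuming your results are a JSON array of records with a "type" field - the field name and type values are illustrative, not Overton's actual schema:

    import json

    # Document types to drop from this analysis - hypothetical values
    EXCLUDE_TYPES = {"working_paper", "clinical_guideline"}

    with open("overton_results.json") as f:
        documents = json.load(f)

    filtered = [d for d in documents if d.get("type") not in EXCLUDE_TYPES]
    print(f"Kept {len(filtered)} of {len(documents)} documents")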

  3. Be aware of biases

Though we're the world's largest policy database, we don't pretend to contain every policy document ever written. Overton primarily covers policy from the past ten years: three quarters of the policy documents in the database are from 2012 or later. We also scrape policy documents from online sources, so we can't collect publications that haven't been digitised. Any analysis you do therefore reflects the data as indexed in Overton, rather than the entire policy record.

There's also a geographical bias - or at least an availability one.

Though the database collects documents from 190+ countries, not all locations have equal representation. Sometimes we don't know where best to collect documents from in a given country; often they're simply not available online. This is especially an issue with sub-Saharan Africa, parts of South America and, to a lesser extent, China, though we're working to address all three. As a result, any ranking may be skewed toward countries with more robust digital footprints, so exercise caution when making broad global claims.
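
One practical habit: before making any global claims, inspect the country and year distributions of your result set. A minimal sketch, again assuming exported JSON records with hypothetical "country" and "year" fields:

    import json
    from collections import Counter

    with open("overton_results.json") as f:
        documents = json.load(f)

    # Tally documents by country and year to spot skew before ranking anything
    by_country = Counter(d.get("country", "unknown") for d in documents)
    by_year = Counter(str(d.get("year", "unknown")) for d in documents)

    print(by_country.most_common(10))  # a handful of countries may dominate
    print(sorted(by_year.items()))     # expect most records to be post-2012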

What is Overton?

We help universities, think tanks and publishers understand the reach and influence of their research.

The Overton platform is the world's largest searchable policy database, with almost 11 million documents from 31k organisations.

We track everything from white papers to think tank policy briefs to national clinical guidelines, and automatically find the references to scholarly research, academics and other outputs.