Academics’ reading lists are increasingly directed by algorithms. But are the recommendation services of platforms such as Google Scholar, ResearchGate and Mendeley distorting science? And might AI ultimately lead it to a disastrous echo chamber? David Matthews reports
When, in January, armed rioters stormed the US Capitol building carrying Confederate flags and calling for Vice-President Mike Pence to be hanged, long-time observers of the internet saw it as the grim conclusion of years of algorithm-fuelled online misinformation and hate.
Platforms such as Facebook, Twitter and YouTube, they alleged, have sought to make themselves even more addictive – and even more profitable – by feeding users the most outrageous and compelling content, up to and including risible conspiracy theories such as QAnon (the idea that a satanic cabal of paedophile Hollywood actors and Democratic politicians conspired against the presidency of Donald Trump). In 2018, YouTube was already being dubbed “The Great Radicaliser” for using its autoplay function to smoothly lead viewers down a dark rabbit hole of clips, starting with a Donald Trump rally and culminating in full-blown Holocaust denial.
Radicalised hordes of Trump supporters wearing face paint and horns might at first sight appear a world away from the academic milieu. Yet a small but growing number of scholars are sounding the alarm over the fact that the research academics see is also being determined by algorithms, through search and recommendation tools such as Google Scholar, ResearchGate and Mendeley Suggest. Of course, this may be a good thing – algorithms could make it easier to uncover hidden research gems in neglected journals and allow overwhelmed academics to sift through an ever-growing torrent of new articles. But if machines, not people, decide which articles academics read next, critics fear that this will have profound consequences for scientific consensus and discovery.
“We can actually learn from what happened on Facebook and Twitter and other social media. Science is not immune from all of these things,” says Peter Kraker, founder of literature mapping tool Open Knowledge Maps and a former open science researcher at Graz University of Technology. “Everything that happened on Facebook can also happen with these academic social networks.”
In a pre-digital age, academics discovered new research through conferences, tip-offs from colleagues and printed journals – either those specific to their subfield or those of broader interest, such as Nature and Science. Scholars still use these channels, says Kraker. And not all digital alerts are algorithmic: some are just feeds of new articles from a particular journal. But although detailed usage data are scarce, it’s a “fair assumption” that almost every researcher has at least some kind of automatic alert set up, Kraker thinks.
And just as Google dominates online searches in most countries, it appears to be dominant in academic searches, too. One 2017 global study of usage by early career researchers found that Google Scholar and Google’s main search engine were “universally popular irrespective of country, language, and discipline”. In the US, Google Scholar was particularly well used, with two-thirds of early career researchers saying it was their top source of scholarly information.
Google Scholar does not only return search results, however. It also recommends new papers through its alert system. It has this in common with a number of scholarly platforms: ResearchGate, Mendeley and Semantic Scholar also offer both a way to search and a recommendation tool. And although those two functions are distinct, they are both algorithmically driven ways to find new articles.
When a researcher searches for a keyword in Google Scholar, for instance, it returns a list of papers that, by default, are sorted according to “relevance” – a seemingly simple word that opens up a host of questions about how exactly it is defined. Google offers a brief public explanation of its approach: it weighs “the full text of each document, where it was published, who it was written by, as well as how often and how recently it has been cited in other scholarly literature”. It is this last part of the equation – boosting papers that have lots of recent citations – that has a number of academics worried.
“It’s a sort of rich-get-richer effect,” says Katy Jordan, a specialist in digital scholarship at the University of Cambridge. In prioritising research that is already popular, scholarly platforms risk repeating the feedback mechanisms of social media, where a minuscule fraction of content goes viral but the vast majority is all but ignored, Jordan thinks. “Trying to recreate those dynamics [in science] risks prioritising a small proportion of things,” she says.
Academics have always been more inclined to cite previously popular articles, of course. But algorithms risk exacerbating that tendency, Jordan argues. Not only that, but given that there is already a bias towards citing academics who are male and from high-income countries, the use of citations to help calculate which papers to recommend carries a “risk of compounding the inequalities that are already baked into academic publishing”, she warns.
One study from 2016 found that an increasing share of citations is accruing to older articles. It suggested that this could be because of a feedback loop generated by the appearance of these papers at the top of Google Scholar searches: an effect that the study’s authors dubbed the “first-page results syndrome”.
By contrast, more traditional search tools, such as Scopus, the Web of Science and library catalogues, sort their results by default according to how recently the papers were published, with the most recent at the top.
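The feedback loop behind “first-page results syndrome” can be illustrated with a toy simulation. The sketch below is not Google Scholar’s algorithm, which is not public; it assumes, purely for illustration, that one new paper appears in each round and that every paper on the first page of results picks up one citation. The names `Paper`, `simulate` and `top_share` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    year: int
    citations: int


def simulate(sort_by: str, rounds: int = 50, page_size: int = 3) -> list[Paper]:
    """Toy model: each round, papers on the first results page gain one citation."""
    # Assumption: ten existing papers (2000-2009), older ones slightly ahead on citations.
    papers = [Paper(year=2000 + i, citations=10 - i) for i in range(10)]
    for r in range(rounds):
        # One new, uncited paper is published each round.
        papers.append(Paper(year=2010 + r, citations=0))
        if sort_by == "citations":
            ranked = sorted(papers, key=lambda p: p.citations, reverse=True)
        elif sort_by == "recency":
            ranked = sorted(papers, key=lambda p: p.year, reverse=True)
        else:
            raise ValueError(f"unknown sort order: {sort_by}")
        # Assumption: only first-page papers are seen, and each seen paper is cited once.
        for paper in ranked[:page_size]:
            paper.citations += 1
    return papers


def top_share(papers: list[Paper], n: int = 3) -> float:
    """Fraction of all citations held by the n most-cited papers."""
    counts = sorted((p.citations for p in papers), reverse=True)
    return sum(counts[:n]) / sum(counts)


if __name__ == "__main__":
    for order in ("citations", "recency"):
        print(order, round(top_share(simulate(order)), 2))
```

Under these assumptions, sorting by citation count sends every new citation to the same three oldest papers, whereas sorting by recency spreads citations thinly across each new arrival. Real researchers do not cite everything they see, so the size of the effect here is exaggerated; only its direction is the point.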
“We know that some papers get cited because they are highly cited, either because it is a disciplinary norm or because they are easier to find in search engines,” says Mike Thelwall, professor of data science at the University of Wolverhampton. “I think that academic search engines exacerbate this problem by making highly cited papers easier to find, but they have not created it.”
A spokeswoman for Google insists that its recommendation system “casts a wide net to identify papers that are likely to be of interest to the researcher”, using Google Scholar’s “comprehensive indexing” system (Google Scholar is generally reckoned to have the world’s biggest underlying database of academic papers: close to 400 million, according to one recent estimate).
Another concern about academic search engines is that some of them, Google Scholar included, violate fundamental principles of science: reproducibility and transparency.
After testing 28 academic search systems, two bibliometrics experts, Michael Gusenbauer and Neal R. Haddaway, concluded that half are unsuitable for conducting systematic reviews of the literature (this is in contrast to “lookup” searching, when an academic needs to track down a specific paper). In a paper published last year in the journal Research Synthesis Methods, they single out Google Scholar in particular as “inappropriate” as a “principal search system”. This is partly because it inexplicably returned different results at different times for the same query in the same circumstances – although the Google spokeswoman insists that “different users using the same query at roughly the same time will see the same set of articles”.
The broader problem, say Haddaway and Gusenbauer, is a simple lack of transparency about how and why platforms recommend one paper over the next. “We don’t know how Google Scholar is providing results,” says Haddaway, a senior research fellow at the Stockholm Environment Institute. “We don’t know the ranking.”
Platforms do often provide public descriptions of the factors they take into account when recommending or ranking papers. But this isn’t the same as releasing the full underlying code so that specialists can scrutinise exactly how the sausage is made, critics say.
“There’s the methodological point that if you’re using [search engines] as an information source, you need to be crystal clear about how things have been found,” says Cambridge’s Jordan.
And that is particularly true given that search algorithms will shape scientific discovery in a “very fundamental” way, according to Björn Brembs, professor of neurogenetics at Germany’s University of Regensburg. “At the very minimum, the code needs to be open and verifiable,” says Brembs, who also campaigns for open access. “And it needs to be substitutable, so if you don’t like this one, you can have an interface that allows you to replace one algorithm with another.”
In some cases, the underlying code is kept under wraps because it is a valuable commercial secret. For Connected Papers – a new, freely available but for-profit literature-mapping tool – the algorithm determining how papers are related is its “core value”, says Alex Tarnavsky Eitan, a doctoral student in electrical engineering at Tel Aviv University who is one of the co-founders of a company that grew out of a “weekend” project.
As other aspects of Connected Papers, such as the community around it and the user experience it offers, become more valuable, perhaps “we’ll get to the point where we can release the algorithm”, Tarnavsky Eitan says. But, for now, he and his co-founders don’t want to “shoot ourselves in the foot” by releasing their secret sauce – although they do publish a description of the algorithm on their website.
But the problem of transparency goes even deeper than commercial considerations.
Some academic search engines, such as Semantic Scholar, use a form of artificial intelligence called a neural network. Crudely put, such programs mimic the structure of the human brain. However, even when the code is open source, as in Semantic Scholar’s case, it is fiendishly difficult to discern why a neural network has spat out a particular answer – a problem that has spawned an entire research agenda called “explainable AI”.
The inability of such recommendation systems to explain their workings is “absolutely” a concern, says Dan Weld, Thomas J. Cable/WRF professor of computer science and engineering at the University of Washington, who helped build Semantic Scholar’s recommendation systems. Semantic Scholar’s similarity algorithm has been published and is open source, and the platform is working on ways to explain to users why it has recommended a particular paper, he says. But the neural network computes papers’ similarity across hundreds of dimensions, which makes it hard for the system to spell out why it made a particular connection. “There aren’t good English-language words for those dimensions,” Weld says. “By definition, [any explanation] is going to be incomplete and inaccurate.”
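To see why such dimensions resist explanation, it helps to look at how similarity between papers is commonly computed. The sketch below is not Semantic Scholar’s model; it assumes, for illustration only, that each paper has already been turned into a list of numbers (an “embedding”) and that similarity is measured with the standard cosine formula. The four-number vectors and the names `cosine_similarity` and `recommend` are made up; a real system would use hundreds of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recommend(query: list[float], candidates: dict[str, list[float]]) -> str:
    """Return the name of the candidate paper most similar to the query vector."""
    return max(candidates, key=lambda name: cosine_similarity(query, candidates[name]))


# Hypothetical embeddings: the positions carry no human-readable labels.
last_read = [0.9, 0.1, 0.4, 0.0]
candidates = {
    "paper_a": [0.8, 0.2, 0.5, 0.1],
    "paper_b": [0.1, 0.9, 0.0, 0.6],
}

if __name__ == "__main__":
    print(recommend(last_read, candidates))
```

The code picks `paper_a` because its numbers line up with those of the paper last read, but nothing in the vectors says what the first or third position stands for. That is the gap Weld describes: the arithmetic is fully open to inspection, yet there is no ready vocabulary for what each dimension means.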
Semantic Scholar is developed by the Seattle-based not-for-profit Allen Institute for AI; Weld says one of the reasons he was attracted to work for the platform was “the ability not to be bound by concerns of profit or loss, but to build the best tools possible”. But does the commercial raison d’être of most search and recommendation tools mean that they will ultimately put profit before researchers’ best interests?
Critics argue that big publishers have a particularly acute conflict of interest because they own both search tools and the journals they recommend. It’s as if Facebook owned a large chunk of the world’s media – could we trust the firm, in such circumstances, to not bump up its own newspapers in its news feed?
“If there is a monetary advantage of guiding users to their own content, then they will do so,” says Brembs. “This doesn’t take an expert to guess.”
However, to be fair to publishers, there is no evidence so far that their search tools are prioritising their own content. And a spokesman for the publisher Elsevier insists that “which organisation published an article” is not a parameter that its Mendeley reference manager takes account of in its “suggest” function, which recommends new research. “It doesn’t ‘know’ which publisher or society published a particular article,” the spokesman says.
What’s more, although Google and Microsoft (the latter created the recently defunct Microsoft Academic search tool) are some of the most profitable companies on earth, their academic tools have the character of curiosity-driven side projects, observers say – they aren’t yet being run to make money.
“Microsoft and Google are mostly, I think, acting in a public spirit. I don’t think Google’s making a whole lot of money off Google Scholar,” says Semantic Scholar’s Weld.
This lack of profitability may bring its own problems, though. When Microsoft announced in May that it would shut down Microsoft Academic (and its open access underlying map of papers, on which other services are built) at the end of the year, some campaigners said this proved that academia cannot rely on the benevolence of tech giants for crucial search infrastructure.
The core reason why Facebook, Twitter and Google-owned YouTube have proved such addictive, unregulated fire hoses of distraction is that they are all competing for attention in order to bring in advertising revenue and harvest data.
Advertising undergirds the business model of some academic search engines, too: most notably ResearchGate, the recommendation platform that most closely resembles a social network. The company acknowledges that there is a short-term pressure to maximise attention to quickly boost advertising revenue. However, “optimising for attention would in the long term be quite detrimental for us”, says Holly Corbett, a senior data scientist at ResearchGate. “We don’t want to make people rage-click…we want to make scientists more productive.”
If users reported feeling guilty for having spent too long on the site – a feeling all too familiar to social media users worldwide – this would be a “red alert”, indicating to the company that it needed to change how the site works, she adds.
ResearchGate does sometimes witness the equivalent of a viral, clickbait news article, Corbett admits. “Occasionally we will see floods of traffic if someone has published a Covid study, for example, that says something like ‘vaccination doesn’t work’: something that’s of potentially dubious quality,” she says. But, in general, the site is engineered to avoid an “avalanche of attention on one thing” because it recommends papers that are similar to what the user has clicked on or downloaded before, rather than less relevant but more popular papers.
ResearchGate also explicitly tries to preserve a human element to discovering new research. In addition to algorithmically recommending papers, users see what colleagues in their network have posted. The idea is that these tip-offs keep the “wild card” nature of discovery alive – the equivalent of flicking through a generalist paper journal. This addresses a concern that the engineers designing AI recommendation systems openly voice: that too laser-like a focus on related papers will end the “serendipitous” nature of scientific discovery.
There are utopian and dystopian visions of where AI-driven recommendations will lead science – but they may not be mutually exclusive.
The dream is to end the sense felt by almost all researchers of drowning in a sea of new papers. Working as a biomedical researcher 10 years ago, “my constant experience was just being underwater”, says Corbett. “I used to have a printed-out stack of publications that I would have to read at some point but would never read and would always feel guilty about.”
The promise of AI-assisted recommendations is simply that “you’ll spend less time staring at a big stack of papers and feeling really bad”, she says. “That’s my hope.”
The nightmare, by contrast, is a future in which recommender systems become clever enough to create the equivalent of political filter bubbles in science, feeding researchers only the papers that confirm their beliefs, trapping them in existing paradigms and ultimately causing scientific stagnation.
It is true that science has always had rival “invisible colleges” and “filter bubbles” of scholars who reinforce each other’s beliefs, says Cambridge’s Jordan; this was explored as long ago as 1989 in Tony Becher and Paul R. Trowler’s book Academic Tribes and Territories.
But as technology advances, making it increasingly possible to recommend papers that confirm users’ beliefs rather than merely matching their existing interests, some observers worry that this blinkering tendency will be exacerbated. Scite.ai, for example, founded in 2018, uses artificial intelligence to classify whether a paper is cited in a supportive, neutral or contrasting way – in other words, allowing users to see whether an article is heavily cited because people agree or disagree with it. And it is only “a question of time” until the likes of Google Scholar know academics’ views, predicts Brembs.
So will the filter bubble dystopia come to pass? “I see no reason why it shouldn’t,” says Open Knowledge Maps’ Kraker. Reinforcing an academic’s core beliefs about a research paradigm would be a powerful way of hooking them on a particular site, he thinks.
Above all, academics should “absolutely” worry that the process of research could be transformed by algorithmic recommendations as radically as political discourse has been warped by Facebook, Twitter and YouTube, he says. The phenomenon is less advanced in science than in social media, he concedes, but it is encroaching and the filter-bubble effect requires immediate attention.
“We should be very careful and not assume that there is a class of humans that is immune,” he says. “Sometimes I get the sense that researchers think of themselves like that.”