Digesting scientific papers is hard. Can AI help?
From ScienceMag:

One of the most important skills that scientists need—besides patience with failed experiments and the restraint to not initiate Thanksgiving dinner conversations with That One Cousin—is the ability to read and digest large amounts of information from scientific papers. I have many memories, from college to the present, of knowing that my next several hours will be spent sitting with a fat stack of journal articles, reading, absorbing, thinking, falling asleep for a bit, making a list of household chores in the margin, googling dinner recipes, and reading some more.
Most scientists do this, and we do it often. We want to learn the latest developments in our fields, or lead a journal club, or generally not sound like ignorant ding-dongs when encountering collaborators at conferences. At some point during our training, we come to understand and accept that the paradigm of “a teacher tells you all about it” is unsustainable, and we learn to teach ourselves. In fact, learning science by reading papers can feel extremely intellectually rewarding, especially when the “aha” moments lead to “what if” moments, and you not only understand something new, but you learn to think beyond what’s in the paper to the next series of logical questions.
It can also be very, very boring.
As much as we’d like to pretend that reading scientific papers is always an unmitigated delight, nope. This is why scientists will sometimes say they’ve read a paper when in reality they’ve simply read the abstract and skimmed the figures. Or they’ll cite a paper in their own writing with only a 90% certainty, based on the title, that it’s relevant. And although the abstract and title include useful information, what do you do when you want to read all of the information in dozens or hundreds of papers? Is there a shortcut between the page and your brain?
Now, thanks to artificial intelligence (AI), we can triumphantly announce that the answer is, “Sort of!”
One of AI’s strengths, allegedly, is the ability to instantly distill gigantic amounts of information into a little bitty package of highlights, perfect for perusing while sipping a nice espresso. Any time I search the internet for something now, I get a cute little AI-generated summary telling me that, whereas most customers found this cat litter to be a good value for the price, others found its ability to reliably clump somewhat lacking.
But you know this. Unless you’ve spent the past couple years imprisoned by a vengeful sentient robot, you’ve seen how AI has snuck into many of the places that used to require human cognition. You’ve probably also heard of AI’s “hallucinations,” nonsensical responses delivered with complete confidence and self-assurance—which, if you’ve ever graded oral presentations delivered by undergraduates, may not seem too unusual.
So, where does AI land when it comes to something higher stakes than cat litter commerce? Accurate AI-generated summaries of scientific papers could potentially save researchers hours of poring through papers. But if the summaries omit important bits or reach conclusions unsupported by the original papers, they could waste a lot of time and effort by pointing you in the wrong direction.
The latter is exactly what seems to happen, according to a study published in Royal Society Open Science last month. The researchers prompted 10 different AI engines to summarize the findings in 200 abstracts and 100 papers, then searched the summaries for certain types of potentially misleading generalizations. For example, converting a paper’s finding to the present tense or extrapolating a guiding action—such as changing “patients benefitted from therapy” to “patients benefit from therapy” or “therapy is recommended for patients”—transforms a verifiable trial result to what sounds like a forward-looking prescription or endorsement. In general, the summaries omitted many key details—which, one may think, comes with the territory of any summary. But they also seemed to be tuned to present conclusions as applying more broadly than what the study warranted—a flaw that the machines exhibited five times more frequently than human-generated summaries. Strangely, specifically asking the AI to be more accurate only made it less accurate, the same way that telling my kids to go to bed seems to inspire them to start a bag of microwave popcorn.
Something feels rewarding about a rigorous analysis of AI’s flaws. The conclusion that you can’t beat the good old humans bodes well for the future usefulness of the good old humans.
It reminded me, though, of the last time scientists were up in arms about ready-made summaries. Twenty years ago, when I was in grad school, I remember endless hand-wringing about overreliance on a site that would surely herald the downfall of scientific research: Wikipedia.
Many saw this new site as too easy to be valid, too useful to be authentic. My classmate once showed me how he edited the Wikipedia entry for “awesome” to include his own name. Surely, we thought, a repository of human knowledge prone to this kind of adulteration can’t be trusted with anything important.
We were wrong. It quickly grew into a one-stop shop for easy-to-digest synopses of scientific topics, and many of us use it daily—with the ingrained skepticism that comes with the use of any crowdsourced repository of knowledge. In fact, this year marks the 20th anniversary of the famous Nature paper comparing Wikipedia head-to-head against the Encyclopædia Britannica and concluding—to much controversy—that the two had nearly equivalent accuracy. (This led to a scathing response from Encyclopædia Britannica questioning the study’s methodology, and let me tell you, it was a sizzling time indeed in the world of encyclopedia fights.)
Maybe the same will happen with AI. We should be careful trusting the summaries it generates of scientific literature, and we should probably go read (or at least skim) the original papers before planning our next research steps around them or citing them in our own work. But the same study a few years from now might have a different conclusion. And we all might have a better developed sense, at that time, of how to prompt AI to get the accurate results we want.
AI-generated summary of this article: A human scientist who purchases cat litter reports that AI is amazing and that we should all subjugate ourselves to its mastery.