Enhancing Peer Review

October 2, 2010

Jeremy Berg, director of the National Institute of General Medical Sciences, has been posting some fascinating analyses of the recently modified NIH peer review process at his blog, including the relationship among overall impact scores, percentiles, and funding decisions, as well as the correlations between the various criterion scores (significance, innovation, investigator, environment, approach) and overall impact scores. This is all great stuff, and Jeremy and NIGMS are to be lauded for their openness with these data. Also, I am a big fan of the new application format, scoring system, and critique format.

However, my cynical theory is that the purpose of Enhancing Peer Review was not to make peer review “better”, because it already did its job perfectly fine: identifying roughly the top quartile of applications in any given round of review. Rather, one purpose of the Enhancing Peer Review effort was to placate the extramural community, which was up in arms because peer review outcomes were being used to perform an intrinsically impossible task: identifying the top decile of applications. (This task is impossible, because there simply are no objective differences in “quality” within the top quartile that anyone can agree on.)

Since every investigator whose grant is judged in the top quartile is outraged at the indignity of not being judged in the top decile, something needed to be done to make these investigators feel that their concerns were valued and that the system would be made more “fair”: i.e., would judge all their grants in the top decile. Of course, this is mathematically impossible, since only a tenth of the applications can land in the top decile. The most interesting, but impossible, analysis of the new reviewing system is not a post-hoc analysis of what reviewers are doing, but rather a direct comparison of assigned grant percentiles between the old and new systems. My guess is that the old system and new system would identify the same grants as in the top quartile, and almost all the same grants in the top decile (perhaps with some small differences: there may be some investigators whose grantsmanship styles are better suited to the old or new systems).

The other purpose of Enhancing Peer Review was to dramatically streamline the system, to make the peer review process faster and easier and more efficient on a per-grant basis. I think it actually did this job quite well, and I enjoy writing grants and reviewing them in the new system more than in the old. But the only metric that can tell us if peer review was “enhanced” in the sense of “improving outcomes” is whether there would be differences in the percentiling of particular grants in the old versus the new system.
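
For what it's worth, here is a minimal sketch of what that old-versus-new comparison could look like if the same set of grants could somehow be percentiled under both systems. Everything in it is an assumption for illustration: the grant IDs and percentile values are made up, the function names `top_tier` and `overlap` are hypothetical, and lower percentiles are treated as better, as with NIH percentiles.

```python
def top_tier(percentiles, cutoff):
    """Return the IDs of grants at or better than the percentile cutoff.

    Lower percentile means a better-ranked grant.
    """
    return {grant for grant, pct in percentiles.items() if pct <= cutoff}


def overlap(old, new, cutoff):
    """Fraction of the old system's top tier that the new system also puts in its top tier."""
    old_top = top_tier(old, cutoff)
    new_top = top_tier(new, cutoff)
    if not old_top:
        return 0.0
    return len(old_top & new_top) / len(old_top)


# Hypothetical percentiles for the same eight grants under each system.
old = {"A": 3, "B": 8, "C": 12, "D": 20, "E": 24, "F": 40, "G": 55, "H": 70}
new = {"A": 11, "B": 4, "C": 9, "D": 15, "E": 22, "F": 35, "G": 60, "H": 65}

print("top-quartile overlap:", overlap(old, new, 25))
print("top-decile overlap:", overlap(old, new, 10))
```

In this toy data the two systems agree completely on the top quartile but shuffle which grants land in the top decile. That is the pattern I would guess at, not a result from any real data.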


2 Responses to “Enhancing Peer Review”

  1. Spiny Norman Says:

    I think that’s about fucking right.

    I do think that young investigators (yeah, I’m using that phrasing intentionally) are going to take it in the shorts because the new format doesn’t give them enough room to show preliminary results and to fully explain their rationales. However, I doubt whether most YIs were canny enough to know what they really needed to do in the old format anyway, so it’s probably going to be a wash.

  2. Beaker Says:

    The overarching goal of enhancing peer review is to ensure that the best projects get funded. As you observe, which grants are truly “the best” is subjective. If the same grants are getting funded under both the old and new systems, then perhaps they really are the best. Or perhaps they are not, and nothing changed.

    The difficulties in assessing this are compounded by a relatively poor system for linking scientific output and impact post-hoc to a particular funded grant. This is the topic of another post on Director Berg’s blog. Sure, if you are awarded an R01 and publish nothing, your renewal will fail. But how could we ever know whether a grant missing the funding line by 1% was the one that would have delivered a fundamental new discovery? If one of the objectives of Enhancing Peer Review was to tilt the system towards funding more innovative and risky/high-payoff projects, then the high correlation of overall impact scores with the “approach” score and their low correlation with the “significance” and “innovation” scores imply that this objective was not achieved. One caveat here is that the Berg data set was obtained from grants scored under the new system but written in the old 25-page format. We’ll have to wait for more data to really know.
