A few weeks ago, I came across this. Andrew Gelman has already written many, many, many, many posts on shoddy or fraudulent analyses and the apparent inability of the traditional peer review system to keep such analyses from being published. Those posts link to a lot of the other stuff people have been writing on the subject. But there’s one piece of the discussion that got only passing mention in one of Gelman’s posts (“I don’t quite see how it works but it’s probably a good idea.”), and I would like to focus on that piece here.
Niko Kriegeskorte has proposed a system of “open post-publication peer review.” The idea is to let anyone publish pretty much anything, and then let people openly comment on and assign numerical evaluations to each paper. The original paper, all comments, all responses, and all revisions become part of the permanent record of the publication, and each publication can be searched, sorted, and ranked according to whatever criteria happen to interest a person – number of reviews, average rating, average rating from non-anonymous reviews, or whatever other metadata is appended to the document.
I find this idea compelling for two reasons. For one thing, having published (and even more often, attempted to publish) in a variety of settings, I know how frustrating it can be to submit a manuscript to an academic journal, wait weeks or usually months to get a decision from the editor and reviewers, and then, when that decision (a rejection, more often than not) finally comes, see just how weak a rationale was used to justify their rejection. But more than that, I find it compelling because, at its core, it rests on the simple point that our current system enforces constraints on publication that only make sense if you assume that no one in the world has access to the Internet.
I quickly sorted through my old email and found a rejection letter from about a year ago. The rejection was in response to the submission of a little meta-analysis of dissertation quotes used to illustrate Central Asian views of Islam. We (Paul and I) wrote the paper to explore the ways scholars portray people’s religious views in research settings where large-scale research is rare. We only used dissertations because journal articles are rarely long enough to include lots of quotes, and this was just an exploratory study. We used the old official vs. unofficial (“high” vs. “low”, doctrinal vs. cultural) characterization of Islam as a jumping-off point at the beginning of the paper. Here’s what we got back:
The dichotomy in Central Asian Islam is complicated by the fact that official and unofficial was not merely a separation of doctrines and scriptural religion versus popular beliefs but also shaped by Soviet policies which prevented the unimpeded development of intellectual debates in the ‘official realm’. These issues are not at all mentioned in the article even though they make the comparison with the apparent separation elsewhere in the Islamic world problematic.
Naturally, a meta-analysis does not require the use of primary data, but it seems to me that the selection of the works upon which the analysis is based is questionable. The dissertations published between 2000-2010 were selected from a database…this implies that the selection of book/thesis length works is assumed to be representative although the screening of the works was entirely mechanical, using digital search methods. This selection method means… a number of monographs and doctoral dissertations produced outside the US within the same period have been ignored, although they constitute an important body of recent literature on Central Asian Islam. For example:
[list of a few books/dissertations on Central Asian Islam published in European countries]
None of these authors I have listed would accept the dichotomy of the official-unofficial Islam which the article wishes to question. In fact we cannot be sure to what extent the authors of the doctoral dissertations chosen as sources embrace this dichotomy; this should have been mentioned somewhere, although this would also have required the careful reading of each one of these works rather than just digital screening.
A less mechanical review of the literature could have identified key monographs and dissertations not covered by the database and would have highlighted perhaps that the simple difference in approaches to the dichotomy may have a disciplinary basis which have to do with research methods. One could also have seen that the Soviet legacy (from this list Basilov, Suxareva and Snesarev are sadly missing) of the official/unofficial dichotomy lives on in one scholarly discipline but has been questioned and refuted by scholars who work with the ethnographic method as well as by some historians as well as historians of religion.)
The article takes up an interesting question which is very relevant to scholarly debates over the meanings of Islam among Muslims in Central Asia: the question of whether the distinction between ‘orthodox’ or ‘formal’ Islam and ‘popular’ Islam, common in much scholarly work, is empirically grounded.
To be honest, I am probably not the best person to assess the paper as I am by no means an expert in the kind of analysis conducted by the authors. From my perspective as anthropologist the value of using quotations from doctoral dissertations as empirical test material is rather questionable, given that these quotations have already been processed, selected by the authors with certain aims in mind, in order to validate certain claims and, often, translated from the original languages. However, I will leave it to others, more schooled in quantitative methods, to judge whether the analysis is convincing.
As a scholar on Islam in Central Asia / anthropologist, however, my comments are as follows: The authors present a very short introduction to / review of the scholarly work done on Islam in Central Asia and the tendency here to distinguish between orthodox/doctrinal and popular Islam. However, the argument would be improved if the authors could situate it more thoroughly in the existing literature: Recently this distinction has indeed become much debated in the field of Central Asian studies; works by e.g. Johan Rasanayagam, Maria Louw, Julie McBrien and Mathijs Pelkmans have questioned its analytical value. Situating the argument better within this existing literature and current debates would make it more interesting to scholars of Central Asian Islam
Imaginative, rigorous, useful, and well-written. A fine contribution. Nice to see manuscripts submitted for review that do not require extensive re-formulation and re-writing.
Based on this input, the editor wrote:
We now have three reports on your submission. One referee (nr 3) mainly looked at the methodological aspects, and liked that very much. We agree. It is good to get submissions that show methodological sophistication. Our two other referees, however, are more skeptical with regard to the substance of the paper, both in terms of the relevance or validity of the data, the concepts, and to the engagement of other literature. To our eyes, the methodological virtues do not outweigh the substantial problems with the paper. Moreover, at the end of the day the results achieved by the thorough methodological exercise seem rather meager–do we know something we did not know before? Looking at the data-method-theory triangle, the article may be strong on the methods side, but since it has very little to offer on the other two sides, the result is somewhat shallow.
I could go into a lot of issues here:
- The first reviewer fixated on the official/unofficial talking point we had used as an introductory vignette and ignored the multiple places we mentioned that many people reject the dichotomy, as well as our actual results that suggested the dichotomy exists in the literature but is such a small part of what’s there that it doesn’t deserve the attention it does get.
- The first reviewer basically said “this paper sucks.” The second reviewer basically said “I’m not competent to comment.” The third reviewer basically said “this paper is great!” The editor interpreted that as “No (all aspects). No (all aspects). Yes (methodology only).”
- The reviewers threw out a bunch of people whose work we didn’t mention, even though there is no way to mention everyone who has written on a subject; even though we explicitly included our selection criteria to allow other people to systematically build upon what we did; and even though we billed our analysis as exploratory (and advocated expanding the scope of literature included in future studies).
- In the end, the editor said it didn’t matter if the findings were valid because they weren’t very interesting.
Even in light of the paper’s many shortcomings (and it had a lot of them), I find these practices very disturbing, and not just because they were directed at my own work. But I want to focus on the more basic question here: should we really have expected any different result from a publication system that (1) is drastically limited in its ability to publish by virtue of the fact that printing journals and handling the logistics of coordinating reviews cost a lot of money, and (2) makes decisions about publication based on three people’s comments, which range from a couple of sentences to a few paragraphs?
To address the second part of the question first: we already have little reason to believe that reviewers evaluate a manuscript by a largely similar set of criteria. I’ve written before about the problem of assuming that an individual researcher can be trusted to offer a reasonably unbiased opinion on any given piece of research – which is something that the current system assumes. You get three or so reviewers. They give their verdict. The editor informs you of that verdict. You have no appeal. No response. Not even a chance to correspond with the reviewers. That kind of system only makes sense if we can assume that reviewers’ assessments of the quality of a paper tend to correspond, at least roughly, with the actual quality of the paper.
But I think the first part of the question is the more interesting and pressing one. When I want to buy a product and several different options are offered, I look at reviews to learn about those options. The whole product evaluation system is a mess. Some of those reviews are clearly from someone with an axe to grind. Some are clearly written by the people who sell the products that are being reviewed. Some of them are from people who actually bought the product but obviously used it incorrectly. The valid reviews are strewn in among all the garbage.
In other words, it’s exactly like the current system of peer review, except reviewers can be responded to, everyone can see what everyone else says, it’s cheaper, everything can be sorted or filtered to remove irrelevant information, it happens in real time, and there are dozens, hundreds, or thousands of reviews instead of three.
I think the awesomeness of Kriegeskorte’s idea isn’t so much the timing of the reviews as the marrying of peer review with the type of customer interface you see on sites like Amazon. Just off the top of my head, I imagine the following information could be asked from anyone who wants to review a piece of research:
- Ratings of data, methods, and interpretation, with comment fields to go along with the standardized scales.
- Reviewer’s name (optional)
- Reviewer’s qualifications (done work on same substantive topic, used same methods, worked among same population, etc.)
- Potential conflicts of interest
- Other publications that address the data, methodological, or interpretation issues of the paper.
Just based on those criteria, we could search for a paper on a particular topic, and then look only at those reviews for which reviewers gave their name, or had no conflict of interest, or had particular qualifications, or included comments to substantiate their ratings. Combine that with reviewer profiles (basically combining Amazon with LinkedIn), and each reviewer could develop personal ratings: number of papers reviewed, number of reviews contested (by authors or anyone else).
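To make that a little more concrete, here is a rough sketch of what a review record and that kind of filtering might look like. Nothing here comes from Kriegeskorte’s proposal or from any real site: the field names, the 1-to-5 rating scale, and the `filter_reviews` and `average_rating` helpers are all things I made up for illustration.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Optional


@dataclass
class Review:
    """One review of one paper. All field names are hypothetical."""
    data_rating: int             # assumed scale: 1 (poor) to 5 (excellent)
    methods_rating: int
    interpretation_rating: int
    comment: str = ""            # free text substantiating the ratings
    reviewer_name: Optional[str] = None   # None means anonymous
    qualifications: set = field(default_factory=set)
    conflict_of_interest: bool = False
    related_publications: list = field(default_factory=list)


def filter_reviews(reviews, named_only=False, no_conflict=False,
                   required_qualification=None, with_comment=False):
    """Keep only the reviews that pass every filter the reader switched on."""
    kept = []
    for r in reviews:
        if named_only and r.reviewer_name is None:
            continue
        if no_conflict and r.conflict_of_interest:
            continue
        if (required_qualification is not None
                and required_qualification not in r.qualifications):
            continue
        if with_comment and not r.comment.strip():
            continue
        kept.append(r)
    return kept


def average_rating(reviews, aspect):
    """Mean of one rating field (e.g. 'methods_rating'), or None if no reviews."""
    values = [getattr(r, aspect) for r in reviews]
    return mean(values) if values else None


if __name__ == "__main__":
    # Invented sample reviews, purely for illustration.
    reviews = [
        Review(4, 5, 3, comment="Solid methods.", reviewer_name="A. Reviewer",
               qualifications={"used same methods"}),
        Review(1, 1, 1, conflict_of_interest=True),
        Review(2, 3, 2, comment="Sampling is unclear."),
    ]
    trusted = filter_reviews(reviews, no_conflict=True, with_comment=True)
    print(len(trusted), average_rating(trusted, "methods_rating"))
```

The point of the sketch is that the reader, not an editor, picks the filters: the same pile of reviews gives a different average depending on whether you count anonymous reviewers, reviewers with conflicts, or reviewers who didn’t bother to explain their ratings.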
The biggest problem I see with this is that researchers could basically get all of their friends to get on and give them good reviews, or give competing researchers bad reviews. My response to that is a resounding “meh.” For one thing, professional journals are already doing pretty much exactly that. An open system would at least make it easier to call people out for it. Besides that, nothing attracts bad attention like good attention. If people pad their own reviews they will get targeted by people who get ticked off by people who pad their own reviews, and those critical reviews will tend to balance things out.
But all of that is kind of beside the point. The purpose of an open publication system wouldn’t be to ensure that only quality research gets published or gets noticed. The purpose is to allow people to be educated consumers of research. Will people be able to cherry-pick results that fit their political views or personal biases? Sure. But they already do that, and a system that better facilitates confirmation bias can just as easily facilitate attacks on confirmation bias. Will important work get ignored because it will be thrown in with so many other pieces of work? Sure, but that happens now, and the metadata will help people sort the wheat from the chaff better than editors and a handful of reviewers ever could.
I don’t trust myself to reliably recognize good research when I see it. There are too many things that can get in the way of a fair evaluation. So I certainly don’t trust an editor and three anonymous reviewers to reliably recognize good research. We’ve focused too long on trying to make sure that what does get published is of good quality. I think we need to stop worrying about that. Let it get published, and worry about the quality issues afterwards. That will allow a lot of bad research to get attention that it doesn’t deserve. I much prefer that to our current system, which consistently fails to allow good research to get the attention it does deserve.