Introducing ICML 2026 policy for self-ranking in reviews
By ICML 2026 Integrity Chair Weijie Su and Associate Integrity Chair Buxin Su.
In recent years, AI conferences like ICML have seen an exponentially increasing number of submissions. In contrast, the pool of high-quality reviewers has grown much more slowly, raising widespread concerns about the quality of peer review. When review scores are noisy, they can encourage a cycle in which authors submit more borderline papers in the hope that variance in review scores will work in their favor. To assist the program committee in making better decisions, ICML 2026 will implement self-ranking to identify low-quality reviews. This approach leverages authors’ insights to scrutinize review outcomes with minimal additional effort, rather than increasing the burden on reviewers.
We introduce the following policies. Immediately after the full paper submission deadline, authors with multiple submissions will be asked to rank their papers according to their perceived quality and significance. Once pre-rebuttal review scores are available, we will compute a projected score for each submission by adjusting the pre-rebuttal scores to align with the self-rankings using the Isotonic Mechanism [1,2,3,4]. Notably, a paper ranked highest (top-1) by all of its authors is guaranteed to receive a projected score at least as high as any of those authors’ other submissions (see details in Final policy design). We will then use the absolute value of the discrepancy between projected scores and pre-rebuttal scores as a signal: submissions with large absolute discrepancies are those for which authors strongly disagree with the reviewers. These submissions will be flagged to Area Chairs, who can decide either to recruit emergency reviewers or to re-examine the reviews themselves. This method is intended as a Pareto improvement to review quality, encouraging Area Chairs to devote extra attention to submissions that may have received lower-quality reviews.
Where does this framework come from?
The starting point is a simple observation: authors usually have the best knowledge of their own papers but are often reluctant to truthfully share their relative preferences. This idea was formalized in the Isotonic Mechanism proposed in [1,2], a computationally efficient approach to enhancing the accuracy of noisy review scores by incorporating authors’ self-rankings of their submissions. Under this mechanism, authors with multiple submissions are asked to rank their papers in descending order of perceived quality, and the pre-rebuttal scores are then calibrated to be consistent with these rankings. Papers [1,2] show that authors are incentivized to report their rankings truthfully, because truthful reporting maximizes their expected utility. Building on this game-theoretic foundation, we hypothesize that self-rankings are informative: authors have a unique understanding of their work’s conceptual depth and long-term promise. Over the last three years at ICML, the Integrity Chairs and Associate Integrity Chair conducted pilot studies to empirically examine this idea [3,4], and for ICML 2026, the resulting evidence directly informed the final policy design.
Pilot studies
The policy is based on three pilot studies conducted around ICML over the last three years, beginning with ICML 2023. All pilot studies were anonymous, were GDPR-compliant, and were conducted after the full paper submission deadline.
In the ICML 2023 pilot study, authors with multiple submissions were asked to rank their papers according to perceived quality and significance. This produced a rich dataset of self-rankings from 1,342 authors, covering 2,592 submissions to ICML 2023, along with their official review scores and acceptance decisions.
We first compared the outputs of the Isotonic Mechanism (the projected scores) with the pre-rebuttal review scores in terms of how accurately they reflect submission quality [3]. Doing so requires a proxy for ground-truth submission quality. Since submissions typically receive multiple reviews, we evaluated each estimator on a single randomly selected review score and used the average of the remaining scores as a proxy for the submission’s ground-truth “expected review score.” We showed that the Isotonic Mechanism substantially reduces proxy estimation error, in terms of both mean squared error and mean absolute error. Moreover, the results suggest that the improvement grows as an author’s number of submissions increases.
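The leave-one-out protocol above can be sketched as follows. This is only a schematic with made-up data: the identity estimator stands in for the score-adjustment procedure being evaluated, since the actual Isotonic Mechanism operates across an author's full set of submissions rather than on one score at a time.

```python
import random
from statistics import mean

def holdout_errors(review_scores, estimator, seed=0):
    """Estimate MSE/MAE of an estimator against a leave-one-out proxy.

    review_scores: list of per-submission lists of review scores.
    estimator: maps one held-out review score to a quality estimate.
    """
    rng = random.Random(seed)
    sq_errors, abs_errors = [], []
    for scores in review_scores:
        held = rng.randrange(len(scores))
        # Proxy for ground-truth quality: average of the remaining reviews.
        proxy = mean(s for i, s in enumerate(scores) if i != held)
        estimate = estimator(scores[held])
        sq_errors.append((estimate - proxy) ** 2)
        abs_errors.append(abs(estimate - proxy))
    return mean(sq_errors), mean(abs_errors)

# Raw-score baseline: use the held-out score as-is (illustrative data).
mse, mae = holdout_errors([[6, 7, 5], [3, 4, 4], [8, 6, 7]], lambda s: s)
```

Comparing this baseline against the same protocol run on projected scores is what produces the error reductions reported in [3].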

This dependence suggests that more author-provided rankings can lead to greater error reduction through the Isotonic Mechanism. Because the change in proxy estimation error is an unbiased estimator of the change in ground-truth estimation error, we could produce confidence intervals for the decrease in ground-truth squared error. We observed substantial decreases, statistically significant at the 99% confidence level.

We compared several strategies for aggregating the rankings provided by multiple authors of the same submission. The simple-averaging strategy runs the Isotonic Mechanism separately for each author who provides a ranking, and the submission’s projected score is computed as the simple average of the resulting modified scores. The greedy strategy starts by running the mechanism for the author who provides the longest ranking, then removes that author and all of their submissions from further consideration; this process is repeated for the remaining authors and their submissions until each submission has exactly one modified score. The multi-owner strategy partitions all submissions into disjoint blocks such that, within each block, every submission shares a common set of authors, and then applies the simple-averaging strategy within each block.
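As a minimal illustration of the simple-averaging step, assume each author's modified scores have already been computed by running the Isotonic Mechanism on that author's own submissions; the paper IDs and scores below are made up:

```python
from collections import defaultdict

def simple_average(per_author_scores):
    """Average per-author modified scores into one projected score per paper.

    per_author_scores: list of dicts, one per ranking author, mapping
    paper_id -> that author's modified score from the Isotonic Mechanism.
    """
    totals, counts = defaultdict(float), defaultdict(int)
    for scores in per_author_scores:
        for paper, score in scores.items():
            totals[paper] += score
            counts[paper] += 1
    return {paper: totals[paper] / counts[paper] for paper in totals}

# Two authors share paper "B"; each ran the mechanism on their own set.
print(simple_average([{"A": 7.0, "B": 5.5}, {"B": 6.0, "C": 4.0}]))
# → {'A': 7.0, 'B': 5.75, 'C': 4.0}
```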

We also validated our hypothesis that self-rankings capture authors’ unique understanding of conceptual depth and long-term promise [4]. Our analysis of the ICML 2023 data reveals that authors’ self-rankings are a powerful predictor of a paper’s future impact, measured by citations accumulated over 16 months. Submissions ranked highest by their authors received, on average, twice as many citations as those they ranked lowest, a trend that held for both accepted and rejected submissions, suggesting that self-rankings can not only denoise review scores but also capture a distinct dimension of impact.

The predictive power was particularly striking for identifying high-impact work: of the 22 papers in our dataset that garnered over 150 citations, 17 were ranked highest by their authors. For comparison, we found that these self-rankings are a more accurate predictor of future citations than the review scores: when stratifying by review score, higher-ranked papers consistently received more citations than lower-ranked papers with similar scores. These findings were highly statistically significant and remained robust after controlling for potential confounding factors and under other metrics of impact, such as GitHub stars.

Final policy design
In ICML 2026, immediately after the full paper submission deadline, authors will receive a request to rank their submissions according to quality and significance. We will compute projected scores using the Isotonic Mechanism (as detailed below) and, once pre-rebuttal scores are available, calculate their discrepancies.
The Isotonic Mechanism operates as follows. Consider an author who submits n papers to a conference. The mechanism asks the author to rank these submissions in descending order of perceived quality, denoted by π, where π(i) is the rank the author assigns to paper i (1 = best). Given the (average) raw review scores y for the n submissions, the Isotonic Mechanism outputs projected scores r by minimizing the Euclidean distance between y and r, subject to the constraint that r is consistent with the author’s ranking. Formally, this convex optimization problem is equivalent to isotonic regression. For example, let y = (8, 7, 4, 3) and (π(1), π(2), π(3), π(4)) = (1, 3, 2, 4), so the author ranks paper 1 first, paper 3 second, paper 2 third, and paper 4 fourth. Reordered by rank, the scores (8, 4, 7, 3) violate monotonicity, so isotonic regression pools the violating pair, yielding projected scores (8, 5.5, 5.5, 3). To synthesize rankings provided by multiple authors of the same submission, we use the simple-averaging strategy, which resulted in the greatest decrease in error in our previous studies: the Isotonic Mechanism is run separately for each author who provides a ranking, and the submission’s projected score is the simple average of the resulting modified scores. For additional details, please see [3].
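The worked example above can be reproduced with a short pool-adjacent-violators routine. This is a sketch for intuition, not the official implementation:

```python
def isotonic_projection(scores, ranking):
    """Project raw scores onto the author's ranking via isotonic regression.

    scores: raw (average) review scores, one per paper.
    ranking: ranking[i] is the author's rank of paper i (1 = best).
    """
    # Reorder papers best-to-worst according to the author's ranking.
    order = sorted(range(len(scores)), key=lambda i: ranking[i])
    y = [scores[i] for i in order]
    # Pool adjacent violators: enforce a nonincreasing fitted sequence
    # by merging any block whose mean exceeds its predecessor's mean.
    blocks = []  # each block is [sum, count]
    for v in y:
        blocks.append([v, 1])
        while (len(blocks) > 1
               and blocks[-1][0] / blocks[-1][1] > blocks[-2][0] / blocks[-2][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    # Map projected scores back to the original paper order.
    r = [0.0] * len(scores)
    for pos, i in enumerate(order):
        r[i] = fitted[pos]
    return r

print(isotonic_projection([8, 7, 4, 3], [1, 3, 2, 4]))  # → [8.0, 5.5, 5.5, 3.0]
```

Papers 2 and 3 are pooled because the author's ranking (paper 3 above paper 2) conflicts with their raw scores (7 vs. 4), so both receive the pooled average 5.5.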
The absolute value of the discrepancy will be used to categorize submissions into a small number of buckets. We will notify Area Chairs and Senior Area Chairs which bucket the submission falls into, without revealing the sign of the discrepancy. Based on the bucket, Area Chairs can then choose to recruit emergency reviewers or to re-examine the reviews themselves.
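Schematically, the bucketing step might look as follows; the thresholds and bucket names here are illustrative assumptions, not the official ICML cutoffs:

```python
def discrepancy_bucket(projected, raw, thresholds=(0.5, 1.5)):
    """Map |projected - raw| to a coarse bucket shown to ACs/SACs.

    The sign of the discrepancy is deliberately discarded, so the bucket
    does not reveal whether the author's ranking pushed the score up or down.
    Thresholds are hypothetical placeholders.
    """
    d = abs(projected - raw)
    if d < thresholds[0]:
        return "low"
    elif d < thresholds[1]:
        return "medium"
    return "high"

print(discrepancy_bucket(5.5, 7.0))  # → "high"
```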
To help Area Chairs and Senior Area Chairs visualize these signals, OpenReview will display the discrepancy category as an additional field in the meta-review in the SAC and AC consoles. In addition, the CSV exports containing submission data for SACs and ACs will include the discrepancy category as an additional column.
Participation in self-ranking is optional. For authors who do not respond, and for those with a single submission, the process for recruiting emergency reviewers or for AC re-examination will proceed as in previous years. Self-ranking is intended only to provide an additional signal and to serve as side information that can help ACs target their efforts. Moreover, by design of the Isotonic Mechanism, trivial or uninformative self-rankings, for example declaring all of one’s submissions as equally good, leave the projected scores identical to the original review scores, resulting in zero discrepancy.
The ICML 2026 policy is grounded in a clear commitment to respecting authors’ expectations of a fair process and strengthening the community’s trust in review outcomes. Self-ranking enables authors to provide truthful, adaptive feedback, and the Isotonic Mechanism turns that input into a more informative signal for identifying submissions that require additional attention. We therefore expect self-ranking, used as targeted side information, to make the review process more transparent, more reliable, and ultimately more collaborative.
[1] Su, W. (2021). You Are the Best Reviewer of Your Own Papers: An Owner-Assisted Scoring Mechanism. Advances in Neural Information Processing Systems, 34, 27929-27939.
[2] Su, W. (2025). You Are the Best Reviewer of Your Own Papers: The Isotonic Mechanism. Operations Research.
[3] Su, B., Zhang, J., Collina, N., Yan, Y., Li, D., Cho, K., Fan, J., Roth, A., & Su, W. (2025). The ICML 2023 Ranking Experiment: Examining Author Self-Assessment in ML/AI Peer Review. Journal of the American Statistical Association.
[4] Su, B., Collina, N., Wen, G., Li, D., Cho, K., Fan, J., Zhao, B., & Su, W. (2025). How to Find Fantastic AI Papers: Self-Rankings as a Powerful Predictor of Scientific Impact Beyond Peer Review.