Certified Fresh?: A Call to Review Our Review Aggregators

Some actual rotten tomatoes, a precursor to our more sophisticated grading systems. (Photo: Flickr)

There are plenty of places to find reviews for cultural products online, but two of the biggest are Rotten Tomatoes and Metacritic. For those not familiar with these enormously popular and influential sites, Rotten Tomatoes and Metacritic are review aggregators, meaning they compile a wide range of critics’ reviews across a given industry and assign scores that represent an overall critical consensus. Each has its own method. Rotten Tomatoes, owned by Fandango and the leading aggregator for film and TV scores, grades work as a percentage of positive reviews. Metacritic, owned by CBS Corporation and covering movies, TV, music and video games, uses a weighted average of review scores on a scale of 0 to 100. The sites are extremely useful and have streamlined the way many of us choose the best things to watch, play or listen to. At the same time, I wonder how their tremendous influence is now affecting our relationship with art and the industries that create it. While aggregators are an invaluable tool for arts criticism, we need to rethink their implementation to avoid suffocating or misrepresenting the work they review.


One of my biggest concerns with how we use aggregators is not the act of aggregating itself but how all-consuming we’ve let their scores become. You don’t have to go to Rotten Tomatoes or Metacritic anymore to see how a work was scored, as these scores now regularly appear in many of the other places you might go for information on the product. Simply searching for the title of a movie on Google will immediately bring up its aggregator scores, for example, and IMDb, Moviefone and numerous others post Metascores as well.

Aggregator scores shouldn’t be a fundamental part of a work’s identity.

While Metacritic is actually one of my favorite sites, and I use it regularly, its ubiquity elsewhere makes it much more difficult to opt out of seeing reviews if I instead want to go into something cold. As a result, I rarely watch or play something without already being prebiased with quantified assessments of how “good” the work is supposed to be. And reviews, as has been proven extensively, do influence our perceptions. This doesn’t mean I can’t form my own opinions, and I don’t always agree with the critical consensus, but it does mean that I can find myself questioning whether my response to a given work was really authentic or whether I’ve been subtly led to align myself with the opinions I already knew were out there.

This power and ubiquity is, of course, great for critics, and in a time when their role is being widely questioned, it’s nice to see them being given so much clout. But the degree to which their opinions have become inextricably tied to a work’s identity is now excessive. A work should be allowed to be evaluated on its own terms, rather than automatically having to contend with the curve of pre-established expectations. And the numbers shouldn’t be posted anywhere that people can’t also read the words of the critics themselves, which put those numbers in context.


Also, while these sites are well maintained and thoughtful in their design, their scores are ultimately subjective and sometimes misleading. There is only so much information you can get from a single number, and both aggregators’ scoring systems have their limitations. A Tomatometer score may tell you what share of critics had a more-or-less positive experience with a film, but it doesn’t convey the degree of their opinion. Metascores are more precise and better convey the nuance of how critics actually felt, but they can’t represent the range of those reviews, and Metacritic also doesn’t include the helpful overall summaries that Rotten Tomatoes does. As a result, it’s usually better to play the two off one another rather than rely on either one individually, as scores can vary sharply between the two systems (even though they’re both aggregating similar information). Consider Christopher Nolan’s 2014 cerebral space adventure Interstellar compared with last year’s animated family musical Sing. According to Metacritic, Interstellar was the most discussed and most shared film of 2014; it received only one review the site deemed “negative,” and seven of its reviews were so positive that they counted as rare perfect 100s. But some critics found it lacking in characterization, and its overall score is a (good, if somewhat unforgiving) 74. Sing, on the other hand, was considered cute but lacking in substance, and has a 59. On Rotten Tomatoes, however, instead of beating Sing by 15 points, Interstellar scores lower, with a 71% to Sing’s 73%.

Tom Hanks in The Circle (Photo: EuropaCorp / STXfilms)

Alternatively, consider two movies that have the same Metascore, this year’s The Circle and 2015’s Maze Runner: The Scorch Trials, both at 43. While a 43 is clearly underwhelming, on Rotten Tomatoes Maze Runner performs slightly better, breaking about even at 47%. The Circle, on the other hand, bombs at 17%, even though most critics individually didn’t find it that bad. I should note that I haven’t seen Sing, Maze Runner, or The Circle myself, and am just using them to make a point. But one doesn’t need to have seen these films to realize how misleading an individual Metascore or (especially) a Tomatometer score can be if not cross-checked against other reviewers.
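To see how two films can share an average and still land far apart on a percent-positive scale, here is a rough Python sketch. Everything in it is an assumption made for illustration: the review scores are invented, the cutoff of 60 for a "positive" review is arbitrary (Rotten Tomatoes relies on each critic's own fresh-or-rotten verdict, not a numeric cutoff), and a plain unweighted mean stands in for Metacritic's weighted average.

```python
def percent_positive(scores, threshold=60):
    """Share of reviews at or above the cutoff, as a whole-number percentage.

    The cutoff of 60 is an assumption for this sketch only.
    """
    positive = sum(1 for s in scores if s >= threshold)
    return round(100 * positive / len(scores))


def plain_average(scores):
    """Unweighted mean, standing in for a weighted Metascore."""
    return round(sum(scores) / len(scores))


# Invented review scores on a 0-100 scale (not real data for any film).
lukewarm = [55, 55, 50, 58, 55, 50, 65, 55, 50, 57]  # nearly everyone mildly unimpressed
divided = [90, 85, 80, 20, 15, 75, 10, 70, 25, 80]   # critics split into two camps

for name, scores in [("lukewarm", lukewarm), ("divided", divided)]:
    print(name, plain_average(scores), f"{percent_positive(scores)}%")
```

Both invented films average 55, yet the lukewarm one comes out at 10% positive and the divided one at 60%. Neither number alone tells you which kind of film you are looking at.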

The fact that these aggregators can mislead is another reason why it seems unfair to tie movies too closely to their scores, especially because these scores don’t merely serve as references for consumers but also have a genuine impact on the industries themselves. Metacritic, for example, is extremely influential in the video game industry, where the products tend to be more expensive and consumers are more likely to consult reviews before purchasing them. Leaders in the industry have noted that best-selling games tend to receive Metascores of 80 or higher, and that sales roughly double for every five points past 80. And for movies, Shawn Robbins, chief analyst at BoxOffice.com, has identified the “Rotten Tomatoes Effect,” by which the aggregator really does impact the overall earnings of a film. These scores matter, and they deserve to. Lazy work deserves to be labeled as such, and good work should be applauded. But their influence also gives us more reason to question how these aggregators’ grading systems could be improved.
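Taken literally, the video game rule of thumb above (sales roughly doubling for every five points past 80) is an exponential. The short Python sketch below only spells out that arithmetic. The function name is hypothetical, the rule is an industry observation and not a law, and since the claim says nothing about scores below 80, the sketch simply returns 1.0 there as an assumption.

```python
def sales_multiplier(metascore, baseline=80, step=5):
    """Sales relative to a game scoring `baseline`, under a literal reading
    of the 'doubles every five points past 80' rule of thumb.

    Scores at or below the baseline return 1.0 (an assumption; the rule
    says nothing about them).
    """
    points_past = max(0, metascore - baseline)
    return 2 ** (points_past / step)


for score in (80, 85, 90, 95):
    print(score, sales_multiplier(score))
```

On this reading, a handful of points near the top of the scale would separate a solid seller from a hit several times its size, which is why small shifts in a Metascore can matter so much to a studio.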

So how can we fix them? One way would simply be to give work more breathing room from its scores. Right now we treat Rotten Tomatoes or Metacritic scores as badges that a work then has to wear across the Internet. But if we were more subtle with them, and didn’t display them automatically in a large side window after a Google search, consumers who don’t want to be prebiased could at least have more opportunities to assess a work on their own terms. Sites like Moviefone or IMDb could also have links to Rotten Tomatoes or Metacritic without displaying the scores themselves. We could, in a sense, treat reviews like very mild spoilers, while still giving people ample opportunity and encouragement to consult reviews if they’re interested.


We could also revamp the scoring systems on these sites to better reflect the range of opinion on a given work. What if, for example, Metacritic also displayed the average of a work’s three highest scores and the average of its three lowest? It’s not a perfect solution, and it could lead to more numbers than consumers are willing to sift through, but it’d be easier to see the full scope of the industry’s response. With this system Interstellar, for example, would have a score of “74 (100–38)” (not taking into account how critics are weighted), and we’d see how diverse the opinions were.
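As a rough illustration of that notation, here is a minimal Python sketch that averages the three highest and three lowest scores and prints them next to the overall number. The function name and the review scores are invented for the example, and the overall figure is a plain unweighted mean, not Metacritic's weighted one.

```python
def range_score(scores, k=3):
    """Format a score as 'overall (top-k average–bottom-k average)'.

    Uses an unweighted mean for the overall number (an assumption).
    With fewer than 2*k reviews the two groups overlap.
    """
    ordered = sorted(scores)
    overall = round(sum(ordered) / len(ordered))
    top = round(sum(ordered[-k:]) / k)
    bottom = round(sum(ordered[:k]) / k)
    return f"{overall} ({top}–{bottom})"


# Invented scores for a divisive film (not real review data).
divisive = [100, 100, 100, 90, 80, 75, 70, 60, 50, 40, 38, 36]
print(range_score(divisive))  # 70 (100–38)
```

A film that everyone found merely fine would show a much narrower bracket around a similar overall number, which is exactly the difference a single score hides.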

What would a score that includes more than one number look like?

This system doesn’t punish work for taking risks the same way a single number does, and at least allows art that isn’t as accessible to demonstrate its ability to resonate with certain demographics. I’m not sure which ranges would be best to show, or what the most intuitive notation would be, but it would be a step in the right direction.

Aggregators are an intrinsic part of our current arts climate, and their method of condensing complex work into single, elegant numbers is certainly appealing. But the way we’ve begun to suffocate art under its own reviews is a problem. We need to allow art and consumers to retain some freedom from reviews, treating them as an optional resource rather than a part of the work’s core identity, and we could use scoring methods that better reflect the spectrum of responses a single work can receive. Our current systems reward a consensus of critical favor, which is often a good sign of quality, but they can be unduly punitive toward riskier or less mainstream work. So let’s reward art that achieves high reviews, and hold lesser work accountable if it’s lazy or poorly made, but let’s also be willing to take risks on work that’s less known to us or leaves critics divided. Chances are, when you think back on your favorite cultural products, they aren’t all things that received critical acclaim. Having hundreds of critics’ thoughts at our fingertips is a significant innovation, but let’s not forget to keep forming opinions of our own.