11 Oct 2011 @ 4:16 PM 

One of the most-hit columns I’ve ever written on Top Fermented was a “Beer Advocate vs. Rate Beer” column. It raises ire. Some people like the fact that I attempted to (poorly) apply statistics to compare the ratings on each site. Other people have bitched and moaned about how it’s a steaming pile of turd, which I won’t necessarily argue with – it has flaws. I’m pretty sure I even say that in the article itself.

Anyway, I see all of these links pointing here, and I read them all and contemplate them and never really say anything because, hey, it’s a two-and-a-half-year-old column at this point, and it’s kind of meaningless now. Ratings systems are continually updated, and people continue to use the sites in new and different ways. However, the rating sites still over-appreciate dark, high-alcohol, and hard-to-find ales, while well-made low-alcohol lagers consistently under-perform.

I’ve thought about it a lot, and I am now of the opinion that wholesale beer rating has reached a point where it is no longer useful and, in fact, might even be detrimental to the market as a whole – and I don’t just mean Beer Advocate or Rate Beer, but beer review blogs and the like. The noise-to-signal ratio is just out of whack, and the results are being given weight that they don’t deserve.

Before the flames and trolls show up, let me state my case:

People rate beer by measures that are too subjective

Plain and simple. By and large, people rate beer based on whether or not they liked it, not whether or not it’s a good beer. Believe it or not, those are two different things. I can’t stand Bud Light, but I won’t tell you that it’s a poorly made beer. It’s an excellently made beer if you want a lite (yeah, I spelled it) lager. But it gets a 0 on Rate Beer (a 1 within the style) and a D- at Beer Advocate even though it is essentially the definition of the “light lager” style. Why? Because it lacks technical brewing skill and is rife with off-flavors? No. Because the bulk of the people who are rating it, like me, hate it. Rather than setting their own preferences aside to actually answer whether or not it’s technically well-made and matches the style, they rate their own taste in graduated values of suck.

I do think that there is value in being able to have a list of ratings of beers that you have enjoyed for your own reference. It’s one of the reasons that I like Untappd – because it gives me a list of my own ratings for me to reference later. I don’t always remember a beer three months later. Have I tried this? Did I like it? 5 stars says, “Yes!” But just because I like it doesn’t mean that somebody else will. Taste is subjective. I like a really wide range of beers, but give me something with a load of Nugget hops and I will always, always hate it. 1 star-only and man did that suck. But! That doesn’t mean you won’t like it, so why should my personal rating mean anything to you?

If it’s a useless measure, we shouldn’t be using it to judge beers.

There’s no way to tell that people are tasting good beer

And by good I mean “like the brewer intended”. Not old or oxidized or through infected taplines or in dirty, frosted glassware or drunk by a smoker or someone with an asshole for a mouth. Certainly, some people will note in the comments of their review about how it was served or what it looked like, etc., etc., but one look at the top comment under Bud Light really says it all:

…serving type was shot gunning at the football game.

Indeed, byteme94. I will now take your D+ more seriously because I know you put a lot of thought into it for those 4 seconds while it was passing through your esophagus. Was the fact that you didn’t immediately throw up what saved it from a D- or an F?

If someone is tasting a beer out of a dirty tapline and (and this is important) they don’t know what a dirty tapline tastes like, they just think they have a shitty beer. There’s no way for me, as a reader, to tell if a beer was tasted in perfect serving conditions or if someone was drinking it out of their cat’s old food dish before giving it a score (“drunk from a straight-sided shallow goblet”, indeed). I’m not going to look through 3,000 reviews. I’m going to look at the aggregate score. If the aggregate score is a composite of unreliable measures, then the aggregate score is unreliable.

There’s no way to tell if the people are good at tasting

Let’s take, for instance, Geary’s IPA, where the first review – which gives it a B (which is decent, if you consider C to be average) – mentions the word “buttery” twice: once in the aroma and once in the flavor. He didn’t really care for the butteriness of the malt. Of course, he mentioned that he wouldn’t really expect bitterness or alcohol in an IPA, either. Now, I happen to know that Geary’s is brewed at Shipyard, and that Shipyard’s house yeast is Ringwood, which has a VERY high flocculation rate. It tends to drop out of solution really early and doesn’t remove diacetyl (which tastes like butter) from the beer like it should unless you do some awesome tricks to keep that yeast in suspension – which Shipyard is generally pretty good at.

IPA shouldn’t be buttery. Malt does not taste buttery. This is an off-flavor. But the reviewer doesn’t know this (or that an IPA should be bitter, sadly). He just thinks (correctly) that it tastes like butter, and while he doesn’t really like it he also doesn’t know that it’s not supposed to be there at all so he doesn’t judge it as harshly as he could and maybe should. Or to look at it backwards, he is judging it as though the butteriness is supposed to be there, because he doesn’t know that it isn’t.

Is this a good, honest review of this beer? It certainly reflects whether or not the drinker likes it, but does it reflect the quality of the beer? That is, why should this B carry the same weight as a C from someone who does know that their beer is diacetyl-heavy? How do I know if the person who is reviewing the beer knows enough about the beer to give a good review? Just because you drink a lot doesn’t make you an expert. It just makes you drunk.

(I am positive that at this point in the article, at least one thread will start on a forum somewhere to discuss whether or not it matters if a beer is well-made if you like drinking it, anyway. Related: Who cares who makes your beer if you like drinking it? Answer: I do.)

The internet is untrustworthy in general

Sorry kids, but I just don’t have any reason to trust you. Just because a lot of people rate something doesn’t mean that there’s any sort of reasonable quality involved. You know that saying that’s something like, “50,000 people can’t all be wrong”? Well – actually, they can. It happens all the time.

A significant portion of this country believes that science and math are just these things that the educated elite make up to try to perpetuate grant funding because paying yourself off of grants is sooooo awesome. They believe things like vaccines are bad but polio is kinda okay. They believe that man and dinosaurs used to co-exist. Why on earth should I trust you, the internet, to know enough about beer to give me a decent recommendation if you can’t get broad “society has moved on” issues correct?

In Summary

Fact: You can’t measure something with an unreliable tool. If I’m allowed to make my own ruler that just has however many inches I want on it at whatever random intervals, I can use it to build the same thing every time. But as soon as I give you my plans you are up a creek without.. well.. a ruler. Have fun defining that cubit, bucko, because I measured it using MY forearm, not yours.

There’s no good way to cut through the noise of beer reviews to find out which ones are worth paying attention to and which ones aren’t. Since there’s no way to calibrate the tasters to make sure that they’re all tasting with the same objectivity, then there’s no way to say that any given set of ratings is even reasonably reliable and I won’t waste my time with them. Until we have some sort of Cicerone-weighted rating system or something like that, I’m calling shenanigans on beer rating, especially wholesale ratings sites like BA and RB. Their data is no longer worthy of consideration, by my estimation.

Make your own ratings and decide what you like for yourself. It’s far more valuable in the long run.

These ratings are being put forth as guides for consumers

Let me quote something to you from the comments of a blog that I ran across that I’m pretty sure sums up common sentiment. I know that I should cite who it’s from, but I don’t know them personally and I don’t want to get into any sort of pissing contest. This quote is in reference to a post recommending shelf tags from Rate Beer and Beer Advocate in retail establishments, much like you would see shelf tags from, say, Wine Spectator.

I do appreciate that the rankings are from a consortium of dedicated drinkers compared to wine, which historically was dominated by one individual or several publications.

Indeed. You know what I hate? Being able to make informed decisions based on reliable, consistent data. What I prefer is to make random guesses based on completely unreliable anonymous data. I mean – who needs Consumer Reports and a trained panel of experts when I can get a product rating from BoobLvr67?

That is the equivalent of trusting anonymous online ratings for beer (or anything, really, but let’s stick on topic).

What I’d Like To See…

…is some sort of rating system from people who are actually known trained tasters – Cicerones and/or BJCP judges – with ratings ranked in importance based on how skilled they’ve shown themselves to be. That would be better information. There are still individual taster differences, but at least those tasters have been moderately calibrated. At least there’s a starting point beyond, “I signed up for the website.”
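To make the idea concrete, here is a rough Python sketch of a skill-weighted score. Everything in it is an assumption made up for illustration: the tier names, the weights, and the function name `weighted_score` are hypothetical, and no real certification body or rating site publishes weights like these.

```python
# Hypothetical skill tiers and weights, invented for this example only.
SKILL_WEIGHTS = {"untrained": 0.25, "trained": 1.0, "expert": 2.0}

def weighted_score(reviews, weights=SKILL_WEIGHTS):
    """Average the scores, counting each review in proportion to its tier weight.

    reviews: list of (score, tier) pairs, where tier is a key in `weights`.
    """
    total_weight = sum(weights[tier] for _, tier in reviews)
    if total_weight == 0:
        raise ValueError("no weighted reviews to average")
    return sum(score * weights[tier] for score, tier in reviews) / total_weight

# Three made-up reviews of the same beer on a 5-point scale.
reviews = [(4.5, "untrained"), (2.0, "trained"), (2.5, "expert")]

plain = sum(score for score, _ in reviews) / len(reviews)
weighted = weighted_score(reviews)
print(f"plain mean: {plain:.2f}, skill-weighted mean: {weighted:.2f}")
```

In this toy case the untrained reviewer's high score pulls the plain mean up, while the weighted mean stays closer to what the calibrated tasters said. How the weights would actually be earned is the hard part, and this sketch doesn't try to answer that.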

That’s a rating site I’ll trust, and those are shelf tags I want to see in retail establishments. Until we can get there, I’m dispensing with wholesale beer ratings in general.

Categories: appreciation, blog, industry, op-ed
Posted By: erik
Last Edit: 11 Oct 2011 @ 05 00 PM

 24 Jul 2009 @ 3:29 PM 

Earlier this week, I had my first try of Westvleteren 12, the so-called best beer in the world. No doubt, it was awesome; indescribably wonderful. When I checked later on, though, I noticed that while it was listed #1 at Beer Advocate, it was listed #2 at Rate Beer. Interesting.

It got me thinking about the differences between the two sites and how much they agreed with one another. I started to take a closer look at what was listed at both sites.

As Andy Crouch noted earlier this week, there is a distinct lack of lagers on each of these lists, and an abundance of barrel-aged and/or hop heavy and/or alcohol heavy offerings. They’re also both heavy in rare, small-run, and hard-to-find beers. I suppose it’s all very American. Bigger is better and if it’s hard to get it must be awesome. Sounds like a recipe for eBay, if you ask me. But that’s not my focus today. That’s for my “please make more session beer” column later.

What I found fascinating was the level of agreement between the two lists. First of all, I found it interesting that more than half of the beers appearing on one list (52 of 100) do not appear on the other. In Rate Beer’s case, 6 of the Top 10 beers they have listed do not appear in Beer Advocate’s Top 100 whatsoever. Only 1 of Beer Advocate’s Top 10 does not appear in Rate Beer’s Top 100.

So I cut myself down to looking at only the 48 beers that appear in both lists. Of those 48 beers, there is very little close agreement. Only one matches exactly: Pizza Port Cuvee de Tomme ranks at #95 on both lists. The next closest agreement is the aforementioned Westvleteren 12. Only 27% of the list (13 out of 48) were in what I would consider close agreement (within 5 places, plus or minus, of the other list), whereas 38% of the list (18 out of 48) were 20 or more places apart.
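For anyone who wants to check the agreement figures, here is a minimal Python sketch. It assumes the (BA rank, RB rank) pairs from the table at the end of this post; the names `PAIRS` and `agreement_summary` are made up for illustration and are not anything either site provides. One beer sits exactly 20 places apart, so the sketch reports the far-apart count both with and without that boundary case. It also counts how often Beer Advocate placed a beer higher than Rate Beer did.

```python
# (Beer Advocate rank, Rate Beer rank) for the 48 beers on both Top 100 lists,
# copied from the table at the end of this post.
PAIRS = [
    (1, 2), (5, 3), (2, 16), (7, 13), (12, 8), (15, 7), (11, 15), (4, 24),
    (10, 22), (16, 19), (3, 34), (24, 14), (18, 25), (25, 20), (8, 40), (21, 28),
    (26, 23), (23, 27), (33, 18), (17, 39), (20, 37), (36, 31), (19, 50), (6, 65),
    (43, 30), (22, 52), (30, 49), (69, 12), (27, 69), (37, 61), (39, 60), (49, 54),
    (78, 29), (53, 64), (44, 88), (54, 83), (46, 94), (62, 78), (52, 98), (75, 77),
    (60, 93), (83, 71), (58, 97), (81, 89), (80, 92), (86, 90), (97, 91), (95, 95),
]

def agreement_summary(pairs, close=5, far=20):
    """Bucket the beers by how many places apart the two sites rank them."""
    gaps = [abs(ba - rb) for ba, rb in pairs]
    return {
        "beers": len(gaps),
        "exact": sum(g == 0 for g in gaps),
        "close": sum(g <= close for g in gaps),        # within `close` places
        "far_inclusive": sum(g >= far for g in gaps),  # `far` or more places apart
        "far_strict": sum(g > far for g in gaps),      # strictly more than `far`
        "ba_ranked_higher": sum(ba < rb for ba, rb in pairs),  # lower number = better
    }

summary = agreement_summary(PAIRS)
for key, value in summary.items():
    print(f"{key}: {value} ({value / summary['beers']:.0%})")
```

The thresholds of 5 and 20 places are just the cutoffs used in the paragraph above; nothing about them is special, and moving them changes the percentages.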

I also threw a couple of scatter plots together.

Beer Advocate vs. Rate Beer

They’re both the same scatter plot, sorted two different ways. Scatter 1 is sorted by BA rank (thus the stripe of blue up the middle) and Scatter 2 is sorted by RB rank (thus the stripe of red up the middle). You can see from these that, of the beers that both sites ranked in the Top 100, Beer Advocate tended to rank the beers higher (lower in number: Rank 1 = The Best).

The number of times that the following words appear in both lists combined (if a beer appears on both lists, the word was counted once):

Bourbon: 10
Barrel: 16
Aged: 15
Imperial: 19
Stout: 31
Ale: 10
IPA/India Pale Ale: 7
Black: 7
Hop/Hoppy/Hoppiness, etc: 6
The suffix “-ation”: 8
Lager: 0
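The counting rule above (a beer on both lists is only counted once) can be sketched in Python roughly like this. The two sample lists are a handful of names borrowed from this post for illustration, not the real Top 100 lists, so the counts will not match the numbers above. The function name `word_counts` is hypothetical, and matching on whole words is an assumption about how the counting was done.

```python
import re

# Illustrative samples only: NOT the full Top 100 lists from either site.
ba_sample = ["Westvleteren Abt 12", "Deschutes The Abyss",
             "AleSmith Speedway Stout", "Stone Ruination IPA"]
rb_sample = ["Westvleteren Abt 12", "AleSmith Speedway Stout",
             "Founders Breakfast Stout", "Oskar Blues Ten FIDY"]

def word_counts(list_a, list_b, keywords, suffixes=()):
    """Count beers whose name contains each keyword (or a word with each suffix)."""
    names = set(list_a) | set(list_b)  # a beer on both lists is counted once
    counts = {kw: 0 for kw in keywords}
    counts.update({"-" + suf: 0 for suf in suffixes})
    for name in names:
        # Whole-word matching, so "ale" does not match "AleSmith".
        words = set(re.findall(r"[a-z]+", name.lower()))
        for kw in keywords:
            if kw in words:
                counts[kw] += 1
        for suf in suffixes:
            if any(word.endswith(suf) for word in words):
                counts["-" + suf] += 1
    return counts

counts = word_counts(ba_sample, rb_sample,
                     keywords=["stout", "ipa", "ale", "lager"],
                     suffixes=["ation"])
print(counts)
```

Matching on whole words is a choice: a plain substring search would count "Ale" inside "AleSmith", which is probably not what anyone wants.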

You’d almost think that stouts, and especially bourbon barrel-aged ones, were the most popular craft beers on the market, and not IPAs.

What final conclusion can we draw from all of this? It’s hard to say. Since they have two different ranking systems (5 point scale vs. 100 point scale) it’s difficult to draw any specific comparisons. Mostly, it’s an interesting look at the tastes of the user base at both sites. I wonder how many people rate at both sites and how their ratings compare given the different point systems.

I also put both lists together (where the beers match) and took the mean of each beer’s two ranks to give the overall Top 48 beers. Here’s the list:

BA Rank  RB Rank  Mean Rank  Beer
1        2        2          Westvleteren Abt 12
5        3        4          Three Floyds Dark Lord Russian Imperial Stout
2        16       9          Russian River Pliny the Younger
7        13       10         Russian River Pliny the Elder
12       8        10         AleSmith Speedway Stout
15       7        11         Three Floyds Oak Aged Dark Lord Russian Imperial Stout
11       15       13         Rochefort Trappistes 10
4        24       14         Three Floyds Vanilla Bean Barrel Aged Dark Lord Russian Imperial Stout
10       22       16         Westvleteren Extra 8
16       19       18         Lost Abbey The Angels Share (Bourbon Barrel)
3        34       19         Deschutes The Abyss
24       14       19         Three Floyds Dreadnaught Imperial IPA
18       25       22         Surly Darkness
25       20       23         Bells Hopslam
8        40       24         Founders Kentucky Breakfast Bourbon Aged Stout
21       28       25         Stone Imperial Russian Stout
26       23       25         Port Brewing Older Viscosity
23       27       25         Russian River Consecration
33       18       26         AleSmith Barrel Aged Speedway Stout
17       39       28         Dieu du Ciel Péché Mortel
20       37       29         Russian River Supplication
36       31       34         New Glarus Belgian Red
19       50       35         Founders Breakfast Stout
6        65       36         Portsmouth Kate The Great Russian Imperial Stout
43       30       37         Struise Pannepot
22       52       37         St. Bernardus Abt 12
30       49       40         Russian River Temptation
69       12       41         Lost Abbey Isabelle Proximus
27       69       48         Firestone Walker 12
37       61       49         AleSmith IPA
39       60       50         Kuhnhenn Raspberry Eisbock
49       54       52         Lost Abbey Cable Car
78       29       54         Great Divide Oak Aged Yeti Imperial Stout
53       64       59         Stone Brandy Barrel Double Bastard
44       88       66         Cantillon Blåbær Lambik
54       83       69         New Glarus Raspberry Tart
46       94       70         Ayinger Celebrator Doppelbock
62       78       70         New Belgium La Folie
52       98       75         Surly 16 Grit
75       77       76         Stone Ruination IPA
60       93       77         Tyranena Devil Over A Barrel
83       71       77         Southern Tier Choklat
58       97       78         Russian River Beatification
81       89       85         Oskar Blues Ten FIDY
80       92       86         Ølfabrikken Porter
86       90       88         North Coast Anniversary Barrel-Aged Old Rasputin
97       91       94         Struise Black Albert
95       95       95         Pizza Port Cuvee de Tomme
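
The Mean Rank column can be reproduced with a few lines of Python. This sketch assumes halves are rounded up, which is what the table appears to do (ranks 1 and 2 average to 1.5 and are listed as 2). Python's built-in `round()` rounds halves to the nearest even number, so it is not used here. The function name `mean_rank` is made up for illustration.

```python
def mean_rank(ba_rank, rb_rank):
    """Mean of two integer ranks, with .5 rounded up (assumed from the table)."""
    return (ba_rank + rb_rank + 1) // 2

# A few rows from the table above: (BA rank, RB rank, beer).
rows = [
    (95, 95, "Pizza Port Cuvee de Tomme"),
    (6, 65, "Portsmouth Kate The Great Russian Imperial Stout"),
    (21, 28, "Stone Imperial Russian Stout"),
    (1, 2, "Westvleteren Abt 12"),
]

# Sort by mean rank, best (lowest) first, as the table does.
for ba, rb, beer in sorted(rows, key=lambda row: mean_rank(row[0], row[1])):
    print(f"{ba:<9}{rb:<9}{mean_rank(ba, rb):<11}{beer}")
```

Averaging ranks this way treats a gap of ten places the same at the top of a list as at the bottom, which is a simplification worth keeping in mind when reading the combined list.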
Categories: industry, marketing, media
Posted By: erik
Last Edit: 24 Jul 2009 @ 03 35 PM
