Wednesday, May 18, 2011

Scoring standards: Is more intense always better?

Is "intensity" in wine always a good thing?

Robert Parker certainly thinks so. He won't go above 95 points without using "intense," "powerful" or some synonym.

One might expect that at the Concours Mondial, the European wine competition that is perhaps the best-run in the world, the standards would be different. One would be wrong.

The Concours has the best rating system of any competition I've seen. They use a complex statistical system to track individual tasters, adjusting each taster's scores relative to the group and also correcting for ratings that rise or drop relative to other tasters at different times of day (i.e., you get tired and cranky, or you got up on the right side of the bed).

(Here's the list of this year's Best in Class winners, including a shocker: Best Red Wine is from Catalonia.)

While international in its choice of judges, the Concours is Europe-dominated, which makes sense as the consumers who pay attention to its gold medals are in European markets. (Europeans are always astounded when I explain the open hostility many Americans have for the continent; that European support for Obama in 2008 was a negative for him, for example. If I write a post like this and don't get at least one comment to the effect of, "Who gives a damn what a bunch of European wine experts think," it means I just don't have enough readers.)

So one would think that the categories would reflect the so-called European palate: Balance, minerality, acidity, that sort of thing.

Nope.



The standards are based on some developed by the O.I.V. (International Organisation of Vine and Wine), and are as flawed as you'd expect from anything that emerges from a huge committee with five official languages.

Here are the 10 categories and the maximum points available for each.

Visual categories
Limpidity: 5. I don't even know what "limpidity" means, but it doesn't matter because the director encouraged us to give the maximum score to every wine in this category and that's what I did.
Sight: 5. How that's different from "limpidity" I don't know, but, ditto.
Aroma categories
Intensity: 8
Genuineness: 6. This is impossible because we don't know what the wines are until we're finished for the day. More on this below.
Quality: 16. Appropriate for it to be most important.
Taste categories
Intensity: 8. Much more on this below.
Genuineness: 6. Ditto.
Quality: 22.
Persistence: 8. I first thought that meant "length," but the French head of my jury encouraged me to use it for all aspects of the finish, which makes sense; a bitter finish has to be docked somewhere even if it goes on a long time.
Overall Judgment: 11. Kind of a catchall, but it does reflect the experience of wine; often a wine is more or less than the sum of its aspects.

When rating a wine, the first thing I did was quickly give the highest visual ratings; I was only going to dock a wine here if it was flawed, and we didn't get any.

The second thing I did was smell it and give it a score for aromatic intensity.

Then I tasted it, and started filling in the other boxes, beginning with taste intensity. Initially I completely rated the aroma before doing anything else, but eventually I learned that the taste could affect my opinion of the aroma. A funky-smelling wine that tastes bright and fruity has an interesting aroma; a funky-smelling wine that tastes like ass has a bad aroma.

Anyway, taste intensity, like aromatic intensity, is the easiest decision. A wine can be intensely bad, and in fact I gave one flawed wine the highest scores in both "intensity" categories.

But I remembered a time last year when I went to Total Wine in Tampa looking for a white wine to bring to dinner. My friend was making grouper in parchment, a delicate dish. The pourer was pushing a New Zealand Sauvignon Blanc that I acknowledged was potent, but it was more than I wanted. I wanted a wine that wouldn't overpower the dish, and ended up with a white from the Loire. It was well-balanced and subtle, and it went great with the fish.

Had that Loire white been in competition, I would have given it, at best, an adequate score in both aromatic and taste intensity, and that loss of 4 points would probably have prevented a Grand Gold Medal. And yet, its intensity was actually perfect for its style.

Using "intensity" as a flavor category would seem to favor exactly the sort of heavily oaked, super ripe New World reds that Concours judges generally deplore. I don't get it.

The Concours scale and its O.I.V. precursor are based on a classical system of wine evaluation, developed at a time when many wines were underripe. And maybe, in the backs of their minds, these are the standards that the Wine Advocate and Wine Spectator are using.

But those days are mostly over, and I suggest the time has come for new standards. No outsider can tell Parker or the Spectator what to do (don't blame me, I tried). But the Concours Mondial could take the lead in telling the world that balance matters more than intensity.

One problem is "genuineness," which should mean varietal or regional typicity. For psychological reasons, the Concours chooses not to tell you what you're tasting; if you know you're tasting Turkish whites you might not be as generous as if you think they might be Italian. But it hasn't eliminated "genuineness" as a category, which often had me wildly guessing relative to other wines in the flight. It's uncomfortable, but I wouldn't want to eliminate typicity as a category, because in the real world, you want a Sangiovese that tastes like Sangiovese, not Merlot.

Here's my proposal for new judging categories:

Visual

Sight. 4 points. So few wines are visually flawed these days -- we got none out of 150 -- that there's no point in differentiating them here anymore. Moreover, I don't care much if a wine is visually flawed. If that's indicative of greater problems, they will show up in other categories; if not, I'll drink it and say, "it might be cloudy (poor limpidity), but it smells and tastes great."

Aroma

Complexity 5. Sometimes a one-note wine is perfectly adequate, but complex ones are better.
Genuineness 5. Would count more if it were easier to judge.
Quality 20. I'm giving aroma a little less than a third of the total, the same as the Concours uses now. I can drink a tasty wine with a below-average aroma; not so the other way.

Taste

Genuineness 5. Same as above.
Balance 10. Maybe this was supposed to be part of "quality," but it's worth breaking out on its own.
Acidity 5. One could argue that acidity is part of balance. From an American perspective, acidity is the single most important component, because it's California's greatest challenge. Yes, you can have too much, but usually we have too little. Wines should be rewarded for having the acidity right, and specifically penalized when they don't.
Complexity 8. I don't want to over-penalize a one-note wine's flavor for "quality." Removing complexity from "quality" allows a zero in this category without throwing the wine out of the medals.
Quality 20.
Finish 8. That's about right. Finish is important but isn't everything.

Overall impression 10. For most critics this is all that matters -- one overall impression -- and it's something I blogged about earlier this week.
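
For readers who like to check the arithmetic, here is a minimal Python sketch of the proposed scale above. The category names and maximum points come straight from the list; the `score` helper, and the idea of awarding each category a fraction of its maximum, are assumptions made for illustration and not anything the Concours uses. The example wine is hypothetical.

```python
# Proposed point scale from this post, grouped by section.
PROPOSED = {
    "Visual": {"Sight": 4},
    "Aroma": {"Complexity": 5, "Genuineness": 5, "Quality": 20},
    "Taste": {
        "Genuineness": 5,
        "Balance": 10,
        "Acidity": 5,
        "Complexity": 8,
        "Quality": 20,
        "Finish": 8,
    },
    "Overall": {"Overall impression": 10},
}


def section_totals(scale):
    """Return the maximum points available in each section."""
    return {section: sum(cats.values()) for section, cats in scale.items()}


def max_total(scale):
    """Return the maximum points available across the whole scale."""
    return sum(section_totals(scale).values())


def score(scale, fractions, default=1.0):
    """Score a wine (hypothetical helper, not a Concours method).

    `fractions` maps (section, category) to the share of that category's
    maximum the judge awards, from 0.0 to 1.0. Categories that are not
    listed receive `default`.
    """
    total = 0.0
    for section, cats in scale.items():
        for category, max_points in cats.items():
            frac = fractions.get((section, category), default)
            if not 0.0 <= frac <= 1.0:
                raise ValueError(f"fraction out of range for {category}: {frac}")
            total += max_points * frac
    return total


if __name__ == "__main__":
    print(section_totals(PROPOSED))
    print(max_total(PROPOSED))
    # Hypothetical overripe, overoaked wine: full marks everywhere
    # except zero for balance and acidity.
    overripe = {("Taste", "Balance"): 0.0, ("Taste", "Acidity"): 0.0}
    print(score(PROPOSED, overripe))
```

Under these assumptions the scale adds up to 100, with 30 for aroma and 56 for taste, and a wine that is flawless everywhere else but gets zero for balance and acidity tops out at 85. Whether that keeps it out of the medals depends on the medal cutoffs, which aren't given here.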

Of course, there's no way anybody at the Concours will pay any attention to this. Their scale was adapted from some official French thing; mine, I made up while sitting in a cafe.

But mine is better. Check it out -- an overripe, overoaked wine would be out of the medals for failing on "balance" and "acidity." Run the same wine through the Concours scale and you might have to give it a gold.

Don't believe me? I had just such a red wine -- turned out to be a Greek Wine of Drama (love that name). It was ripe, oaky and much sweeter than I wanted it to be. But there was nowhere in the official categories for me to penalize it, other than in the "overall impression." I may have given it enough for a gold, though I wouldn't drink it myself.

I asked Thomas Costenoble, the director of the competition, about this and he said, "You cannot penalize wine because it doesn't fit your personal taste."

Well, I didn't. I followed the rules. But as the climate changes and wines continue to change, the rules need to change with them.

6 comments:

BikrDave said...

Thinking it would be interesting to know how scores for "intensity" correlate with scores overall, and whether wines finishing in the medals scored disproportionately high in the intensity department. Does the Concours publish any sort of overall breakdown of scores that might allow for this sort of analysis? (Not that I would have any idea how to do it, other than to eyeball winners, but the fact that it's at least possible to view score components is already a step towards transparency beyond what WA, WS and many others provide)

W. Blake Gray said...

That's a very good question! I'm going to ask it; will post an answer if I get one.

SUAMW said...

Rating comparatively based on intensity subjects the wines at hand to contrast error.
Each wine should be evaluated for what aromatic, flavor and structural components it offers and their respective intensities relative to the composition of a particular wine.

As a winemaker shouted out during the sweepstakes judging at the Riverside International Wine Competition: "Delicacy is NOT a crime!"....

W. Blake Gray said...

I asked the question, and here's the answer:

"Dear Blake Gray,

The answer to your question is:
1) yes, we do keep all the individual scores and we can extract them for analysis
2) no, we don't analyze them routinely

We (Vinopres+I as an external consultant) will meet during the next few weeks and make a plan for use and further development of statistics at the CMB. If there is sufficient interest and Vinopres supports it, we could either provide some data for study or we could perform some analysis ourselves (but this would take a little while since it would have to be done in the context of a Masters Thesis and we would first have to find a suitable student).

In the past, we looked a bit into issues of individual scores (this was in the context of a student project) and I have to look at the related documents to see whether we addressed the issue you are speaking about.

If you haven't heard from me by then, you may remind me towards the end of June about your question.

Have a nice day,"

W. Blake Gray said...

After I posted the previous email, I got this as a followup. It's a little dense but the final point is interesting:

"I had a quick peek at the data from the Concours Mondial of 2010 and 2011 (checking if I could extract some data as easily as I hoped). I limited myself to the still wines (red, white and rose). As one would expect, all scoring categories are positively correlated with the overall score. Some categories are less correlated with the final result than others. The least correlated categories concern the aspect of the wine. These scores are at their maximum value unless there is a serious reason (this is what was clearly indicated at the beginning of the Concours). So, there are not many values which are different from the maximum and they are quite dispersed. As a consequence, the observed correlation of the aspect category and the total score is low.

All the other correlations of the individual categories and the final score are around or above 0.6. If we look at the category intensity (smell and taste), we see a slightly lower correlation with the total score for the intensity of smell. When we go a little more into the detail, we find that this comes from some wines for which the smell intensity was rated low whereas the other categories were rated high. Now, I don't know whether this gives you any useful information for your blog, but what may be useful to know is that we can look into the details of the data whenever we want.

Have a nice weekend,"

Nelson Abreu said...

Mr. Gray,

I understand your frustration with inconsistencies in scoring methods. Wouldn't it be nice if we all agreed on a single global standard "perfect" rating system? Your critique of the score card is valid but I find the real problem is that you are reading translated text that doesn't appear to be translated by a native English-speaking person, nor were you instructed by one when you asked for clarification. No single scoring system has been agreed upon as ideal and that one is their own, however, without clarity and instruction how does one make best use of it?

Nelson Abreu
Wine Judges of Canada