Session 3: Analyzing Confusion
Introduction: Laura Heymann
What is confusion? Leads to dissatisfaction, leads to sales diversion, leads consumers to think differently about the TM owner, or what? Then we have the question of why the law should care about those kinds of confusion and not others.
Flexibility: 9th Circuit’s Network Automation case says the factors are flexible and not hoops to jump through or a rote checklist (though Network Automation says district court erred by failing to consider the right ones). Grey market goods = courts often abandon the factors. Is confusion what we’re testing for in those cases anyway? Third Circuit has a 2010 case saying a district court can decide that certain factors are unhelpful, but it has to say explicitly why it’s not using those factors and the court of appeals can reverse if it hasn’t or if it’s weighed the factors inappropriately. Focus on particular examples might help.
Is considering proxies the right thing to do at all? If we’re supposed to focus on likely purchasers, should we do surveys in every case rather than guessing?
Primary Discussants: Barton Beebe
Bone’s work on the historical development of the multifactor test is very helpful. How does it work now? We should drop the quality factor, which has no present function, and intent. An opinion in the 2d Circuit by Judge Leval would be required to reform the factors. Instead, we have a case like the Virgin Wireless case, which says the intent factor isn’t meaningful but we apply it anyway.
Reforming the multifactor test: should be empirical in nature. All sorts of policy goals have wound their way into the test—e.g., encouraging parties to adopt inherently distinctive/fanciful marks, which we promote by theoretically giving more protection to inherently distinctive marks. Ridiculous, since the test is about factfinding. You can adjust the level of confusion required to trigger an injunction in light of these policy goals, but the test itself shouldn’t be distorted this way.
Consumer sophistication is hugely underrated. Should be one of the most important factors, and it’s built into all the others. Case about nuclear reactors: engineers are unlikely to buy the wrong reactor, if they’re not Homer Simpson. Sophistication will be more important online—Network Automation is a case about sophistication, driving the entire opinion—consumers using the internet are familiar with the medium and we can’t assume they’re easily misled.
How does the test work in counterfeiting? Post-sale? Parodic speech? Sometimes IIC (initial interest confusion) comes as a ninth factor; sometimes the court starts with IIC and then goes through the factors; tends to wander all over the place. Reverse confusion: various factors flip around, which is sensible, but the test hasn’t been adapted explicitly enough to fit with these other situations like post-sale confusion or parody.
Big question: do the factors drive the outcome, or is the court’s description of the factors symptomatic of a decision it has already made? Posner’s approach to multifactor tests is to ignore them and write a common law opinion—very impressive; an appellate court judge can get away with this, but not a district court judge, so what should we do for them?
Are certain factors necessary? A multistep test, not just a multifactor test? Empirically, the multifactor test does have some multistep elements, and maybe those should be encouraged—a gatekeeping or fast track approach.
Keep in mind some judges come to TM for the first time and have no idea what infringement is about. For us it may be tacit knowledge with codification unnecessary, but for error costs a multifactor test might be helpful even if specialists find it unappealing.
On the other hand, tinkering is limited—do we still have Ralph Sharp Brown’s courage to indict the entire system?
Bob Bone: Why would you have a multifactor test? Courts have largely reified it. You might focus on a test if you really didn’t know what you were trying to do, or you thought you had a broad tort and couldn’t predict all the covered situations. Could have a set of broad policy objectives, but it’s not clear judges would be able to get it right, so we provide factors.
We should take a closer look at the test in light of the normative goals in particular situations. He doesn’t agree with Beebe that the test is empirical. There are normative elements, which got packed in there because we were focused on the test and not on the objectives sought by the test. We are interested in erroneous information about the quality or characteristics of the product or its seller that’s likely enough to cause serious harm for us to intervene. That’s the big picture, but not the only thing we might want to do with TM law. Intent to deceive has a moral force in itself. Without a likelihood of confusion test, the inquiry becomes different. Assemble your goals, then have tests that fit those goals. If intentional deception is a moral wrong, then have a set of rules for dealing with that; need not be likely confusion. May want a materiality requirement to ensure the deception is about something important. Treat those separately from other types of conduct. Brand as personality approach will merit a set of different rules. (In my opinion those should be called “dilution” and kept separate from confusion.) If you’re interested in consumer autonomy you’d have different rules.
We see a judge trying to do right in a particular case, but may be creating defenses that are so difficult to win that they chill legitimate activity.
IIC: can understand why legal intervention might be desired in some cases. But the game isn’t worth the candle; if it is, it ought to be broken out and done separately from likely confusion. Don’t change the factors. Approach it separately. If we come up with other factors, ok.
Post-sale confusion: There may be some reasons for post-sale confusion, and there may be cases in which the goal of the law will be triggered. Again: is the game worth the candle? He’s not sure of the answer, but doubts it will be that serious. Leaves the prestige goods/Veblen goods argument. Again, treat it separately and break it off. (I would say again: dilution if you want it.)
Reverse confusion: he’s never understood it. Of course you change the factors—this is not rocket science.
Gay Toys--licensing/merchandising of toys based on TV shows: District court sees toys as a subsidy for the TV industry, whereas the Second Circuit says that confusion is the key. Second Circuit of course wins, mandating that we assume consumers are thinking about something they might not be thinking about. He’s sympathetic with the district court’s point that if this is about subsidizing the TV industry then that’s what we should focus on. (A copyright type interest.)
He’d keep the intent factor, unlike Beebe, but focus on whether that could lead to reputational harm.
Lemley: Sees a set of moves around uses or nonuses of test that are oddly outcome-determinative. This may be descriptively accurate but normatively troubling. We have bridging the gap factors that don’t seem to reflect how cases are decided, so get rid of it. We have causes of action we want in the law that don’t seem to map to the factors, so get rid of them and have a new test. If this were descriptive that would be fair: courts are not really applying the multifactor test. Should we give up on a test that could have normative and not descriptive content? If the p’s theory of confusion makes nonsense out of factors like bridging the gap/consumer confusion, or if IIC can’t logically fit into the multifactor test because it doesn’t survive 5 or 6 factors, maybe that’s an indication that the cause of action is the problem because there were reasons the factors got in the test in the first place. We care about consumer sophistication for good reasons, and ignoring them suggests the results are suspect.
Brown’s 1948 article on TMs and advertising was something he wrote in his first years and may be the work of a junior scholar trying to take down the whole edifice.
Litman: talking about how to read the factors and picking off low-hanging fruit may actually prove influential. One of the advantages of the test in the 1980s was that courts felt they didn’t understand the markets in which these disputes arose, and the test made results seem tailored to the markets. Judges may not need the same help today. But genuine empiricism is expensive. She tells students that if they see a case in which every Polaroid factor points in the same direction, the judge is cheating, and if we could convince courts that was true we’d pick off some of the low-hanging fruit. Conviction in result may decrease care in analysis.
RT: One thing I can think of is that we could, as in false advertising, divide infringement into two types: double identity (explicit falsity) and different mark/different goods, which would ordinarily require extrinsic evidence of confusion. I’d rather see greater flexibility in false advertising, but defetishizing the multifactor test at the same time would be a good idea.
With respect to Beebe’s point about how well the system works for an unfamiliar judge: University of Alabama Board of Trustees v. New Life Art Inc., now on appeal, is an interesting case in this regard. The problem there wasn’t the test, it was the defenses. Good faith flailing resulted.
McKenna: We haven’t yet agreed what kind of confusion should be actionable, which is a prerequisite for a test. The multifactor test is designed for source confusion of noncompeting goods, and now we have 50 years of courts saying “confusion of any kind is actionable.” Those two things don’t fit but courts have been applying the test anyway. Maybe shunting different types of activity off into separate areas has the virtue of shunting each test into what it’s supposed to be doing.
Quality and intent: each will either make the defendant lose or be irrelevant. Quality: if poor, favors plaintiff; if good, irrelevant because the plaintiff has the right to control the quality. We should just take it out of the test if it can’t genuinely tip in either direction.
Double identity: post-sale confusion is a double identity situation. But we might not think there’s confusion there—might not be a good proxy. Maybe that’s the way to slice it up presumptively, but not always. [I think the benefits might be worth the costs; I do think some kinds of post-sale confusion are actionable, and honestly I might be willing to throw copiers of Veblen goods under the bus to help other people using TMs in different ways.]
Goldman: the vast majority of the cases he reads get to the right result in his opinion, though he’s no fan of the multifactor test. Goes to the question of errors. Maybe the errors are acceptable. It’s the cases we aren’t seeing that are the problem. Can we expedite/improve cases in which the factors/uncertainty about them are leading to problems? TM bullying/threats cause of action, discussed last year.
Burrell: TM owners often claim multiple kinds of confusion: sale, post-sale, IIC—most plaintiffs would respond by trying to bring themselves under every head at once, so splitting might not be that helpful.
Grynberg: what’s the role of the multifactor test in catching errors? Would it be harder to catch errors without the structured test? The role of the test in creating transparency for a reviewing court. Rogers test: throws multifactor test out the window and courts look at whether anything was done to mislead. Sense that courts generally get those right; once courts see the defendant’s interests they are better at weighing both sides’ interests, and also perhaps courts are okay at assessing confusion without guidance from the factors. Potential tradeoff then between accuracy and transparency, incommensurables that are hard to weigh.
Dinwoodie: Bone thinks normative work can be done in the factors, but Dinwoodie would like to do the empirical work and then isolate those normative values, because courts face up to the tradeoff more honestly when the values are isolated. When submerged within confusion, there’s a pro-plaintiff bias—courts don’t think about competition or expression. The Rogers context hits them in the face.
Bone: doesn’t disagree that it’s useful to separate those things, but thinks that whatever the test we have will end up with normative components.
McGeveran: last year we talked defenses, and kept saying that the problem was confusion. Now we’re talking about how the multifactor test is getting freighted with other considerations and how we want those separated—defenses. That would suggest that more robust defenses could help if confusion wasn’t the end of the story for liability. Or it could mean we need better structuring of liability to deal with these multiple considerations.
Dinwoodie: normative work can be unrelated to defenses: whether you want to incentivize the creation of inherently distinctive marks, for example.
Dogan: Worries about errors—both normative and factor-driven. Also worry about predictability. When advising clients, the fact that we have different outcomes with respect to very same/similar marks ought to trouble us. One of the problems is lack of predictability even absent systematic errors. That can skew the law on its own.
McKenna: attracted by the idea that likely confusion should be separated from “how much confusion is required?” That would allow you to trade off expected harm against the benefits of the activity. But we’ve never had much sense of the quantum of confusion required more than zero. Steinway involved very low level of confusion and IIC, which taken together produces virtually zero harm—that’s too far. Could we agree on quantitative levels of confusion that mattered?
Janis: dissatisfaction with confusion: is that really dissatisfaction with confusion as a concept or is it that courts set the bar too low?
McKenna: where we have greater doubts about harm from confusion, we’d want more evidence of the quantum of confusion.
Leaffer: Noncompeting goods cases—level of confusion is too low. Something about the multifactor test isn’t working.
Lemley: quantum of confusion point is important. There’s no obvious reason it centers on 10-15%. One story: if I assume everyone confused is making a mistaken purchasing decision, then a 10-15% loss could be quite significant. Leads the way to ask: how many of those confused consumers convert to purchasing decisions? This will differ between straight-up confusion and IIC or post-sale confusion. Maybe you need 80-90% confusion before post-sale confusion leads to significant change in sales. Injury and confusion on the other side should also be considered. Genericide: we worry about harm on both sides; we should consider more generally how many people are benefited by a use.
Dinwoodie: On distinctiveness/protectability, Grupo Gigante: our prescriptive commitment to territoriality leads us to require more in the way of secondary meaning before protecting a foreign mark. Could do the same thing more generally. Same in Wal-Mart where we ask more in certain contexts than in others before protecting a mark—that dynamic could be replicated on the confusion side of the ledger.
Beebe: on quantum: how does it relate to the idea of materiality? Are we asking essentially the same question, about impact on sales? More generally, does this shift to cost-benefit analysis and multipliers (likelihood x amount of harm) leave behind the nonconsequentialist reasons for granting injunctions and would economists dominate these inquiries? Note that people encountering the field for the first time think that 50% confusion sounds about right, perhaps because of the people not confused—have we defined deviancy down?
Strength is a normative consideration—students always point out that Coca-Cola is so strong that they notice deviations. The orthodoxy weighs strength in plaintiffs’ favor as a normative matter.
Heymann: the test doesn’t address quantum except insofar as we submit a survey. The factors aren’t sensitive to survey evidence. If we care about quantum, surveys should be required. How would you use other factors to assess the quantum, other than by the court eyeballing it? Specialized market might make a difference. (We do see courts saying “this factor weakly favors the plaintiff” and similar things.)
Negligence in tort: reasonability is the general question, and we use certain tools to address that question. We can do cost-benefit analysis, and negligence per se where we presume the act is wrongful—possible analogies/models for a holistic question that is not a multifactor test but does group classes of cases.
RT: quantum: like Heymann, I ask, how often are there surveys? I thought Beebe’s work showed they were relatively uncommon. Not as if there was one in Network Automation, for example. Would first have to strengthen the requirement of a survey in the average case or in some set of cases for this intervention to be helpful: could do that by, again, presuming against plaintiff in a noncompeting goods case where the plaintiff doesn’t submit a survey. We do that with delay or with a really rich plaintiff and should do it more.
McKenna: could require “overwhelming” evidence of confusion in IIC. Standards could make clearer that a lot of evidence is required in some cases.
Another explanation for courts responding to low levels is that courts are concerned that low levels of confusion will get worse over time. Good empirical evidence that consumers adapt and that confusion will lessen over time, except perhaps for counterfeits, so we should be talking about that too.
McGeveran: quantum shifts more from measuring confusion on its own to measuring confusion with harm. Note that defendant will need an expert to fight the survey, so that is likely to increase the cost and complexity of litigation generally, which makes him nervous.
Bone: quantum is important but maybe not quite right. He wants to focus on the probability of inaccurate information. We should be willing to say there’s a lot of confusion but it’s not confusion that matters.
Bently: in Germany, surveys are seen to be cheap and often required by the court. UK experience: don’t like the idea of making the whole thing more empirical; don’t worry so much about getting it wrong and think more about doing it quick and cheap. Confusion might lead to satisfaction: a schnapps drink called Vodkat—people liked it even if they’d initially thought it was a bottle of vodka because it was cheaper and had half the alcohol. Post-sale satisfaction should be weighed against post-sale confusion. (He’s kind of joking.) Take more account of attempts to reduce confusion such as disclaimers and alternate packaging—tendency in Europe to look only at the registered mark and ignore the other signals consumers are getting. Moving away from the empirical, and accepting pro-plaintiff norms like intent, then should also accept pro-defendant norms such as weighing favorably a bona fide attempt to distinguish.
Dogan: systemic costs to TM system that we shouldn’t lose sight of if people can’t rely on TMs. Flexibility on quantum is appealing to the extent we think of it as a spectrum with an eye towards harm rather than a specific percentage—goes to cost benefit analysis. (I’m really surprised nobody’s mentioned KP Permanent, which outright says you need more confusion to find against a defendant in a descriptive fair use case.)
More protection for stronger marks: goes to the question of whether the defendant needs access to plaintiff’s mark. Less about TM’s affirmative goals but more about the court being suspicious of defendant’s justification.
Mid-point summary: Robert Burrell
Are we prepared to make the tough decisions we were talking about yesterday? If not, maybe confusion is malleable enough to get results even without agreement on the basics, by, for example, requiring 90% confusion for IIC.
Skepticism about surveys: common-law courts in UK, Australia, and even here generally hate surveys, don’t they? If we’re trying to do something to get courts’ attention, telling them they need more surveys will be difficult. Also, courts hate surveys for good reasons: they ask leading questions, use problematic methodologies. Can we find a survey we’re comfortable with?
How, if at all, should the confusion analysis be modified for registered marks? Coming from a registration system, that’s always his first question. Confusion is often used as a limiting doctrine to rein in registered TM rights. In a registration system, you have an abstraction on one side—not concerned with plaintiff’s actual use, but what’s on the register. A market-based test like likely confusion fits poorly when p’s rights are based on an abstraction. Historically, Anglo-Australian TM law features courts using confusion-based analysis either through deceptive similarity or TM use, in order to set defendants free who’d otherwise be caught. Modifies the harshness of rules of strict liability. ECJ is at least contemplating the idea of modifying double identity to aim more at the things TM is really concerned about. That leaves him something of a fan of confusion.
Grynberg: Unwillingness to jettison doctrines—IIC makes sense in some circumstances; among the problems with Brookfield was that the billboard metaphor was completely inappropriate. So we should bite the bullet and say sometimes we don’t redress harms because the costs of stopping them outweigh the benefits. Point of sale confusion can deal with most of the situations we care about.
How do we make it harder for plaintiffs in nontraditional cases? Requiring more surveys just increases the costs of bringing the claim and thus deters the bringing of it. But dilution developments show that this doesn’t work—in Visa in the 9th Circuit and “The Other White Meat” TTAB case, there was survey evidence. And the court said “never mind,” because courts like factors and won’t let them go.
Burrell talks about mitigating the harshness of strict liability, but the problem with more adventuresome TM claims is that we lack an initial reason for the harshness. Focusing on confusion is not a limit without focusing courts more on the thing that precedes the confusion (the harm).
Lemley: playing with quantum of confusion and with factors is great, but keep in mind Beebe’s conclusion: two explanatory factors are similarity of marks and defendant’s intent. Can’t prove causation here, but if those really matter, other factors and surveys won’t matter a lot. The manipulable factors: judge’s assessment of defendant’s intent, particularly pernicious; stands in the way of other reforms if judges have carte blanche to figure out whether the d’s a bad guy. Not sure how to minimize that factor, though would like to eliminate it. If we affirmatively said it wasn’t a factor, we’d at least make it more difficult for judges to use it sub rosa.
Litman: With literal falsity and implicit falsity, because surveys are expensive, we find cases where statements are found literally false because that lets the judge enjoin them without surveys. DMCA cases: all controls are access controls rather than rights controls because those cases are easier to win. So we may create categories but we can’t control their use. Better: persuade courts that knowledge of a mark is not probative of intent to deceive. Should be looking for affirmative evidence of intent to deceive. Presumptions of intent from knowledge are awful: if you knew Gallo was a wine, then importing your wine under the name you’ve always used for your wine is deceptive intent—courts will find against bad guys, but it’s useful to fight for an understanding that will persuade courts that not all these people are bad guys and/or that the bad guys here are standing in for good guys in the same circumstances.
McKenna: it feels like a concession to give up and say courts don’t care about doctrine, though it has a lot of explanatory force. Cling to the hope we can affect their behavior? A lot of our discussion is about our confidence that courts just do what they want. Common law does grow up around standards. For example, in practice, 15% confusion is a green light to file your case, and if you don’t have that, you get another survey. That has an effect on the cases that are filed, and a common law of 75% would have a different effect.
RT: Dastar and Wal-Mart have made big differences in outcomes and any reasonable lawyer would advise clients accordingly. We need a Supreme Court opinion that tells lower courts to think differently about competition more broadly. I think materiality could help, actually, by changing the guideposts courts use in the same way as these cases did: a clearly intelligible concept that can focus courts on cost/benefits.
Bone: breaking out moral reasons can help. Steinway and bait and switch cases strike us as particularly morally reprehensible—that may or may not be right, but bootstrapping it into confusion is the source of the mistake. It has consequences for elasticity of confusion in general, but it’s moral judgment that really drives us. Same with intent. We’re talking past one another until we deal with bad intent.
Bently: the tort of deceit was the genesis of passing off; then equity decided that intent was not required where there was likely confusion. Those two instincts have been there from the start, and history provides perhaps a reason to keep intent around.
McKenna: suggests that injunctions could be more regularly limited to making any falsity true (I’ll be my broken record: courts are really concerned about limiting their injunctions in that way in false advertising cases!) Tabari says that in nominative fair use cases the district court can only enjoin in a limited way unless it’s impossible to do less.
McGeveran: a really specific injunction is time-intensive at the front end. This is a problem for the overburdened nonexpert judge.
Thanks so much to our host, Mark Janis/Indiana University. As always, a great time!