Saturday, February 25, 2023

Fourteenth Trademark Scholars' Roundtable, part 3 (Evidence)

Changes in Trademark Law and Evidentiary Rules

Introduction: Jake Linford

Before courts admitted surveys routinely, they were concerned about hearsay. Sometimes rejecting surveys seems like judicial notice—“cola” as generic; the court doesn’t want to hear contrary survey evidence. Some objections go to the weight of the evidence.

Instead of surveys, can we look at things like Google results or large text databases reflective of use in a particular community? Comicon case: corpus linguists filed a brief trying to establish how comicon was used prior to the first San Diego Comicon; court didn’t find that persuasive but shows possibilities of use. You can get a sense for what the use of the putative mark was like at first use, if your data are rich enough. If survey evidence isn’t at the right time (Coach’s claim of fame in the Coach case), corpus evidence might be able to fix that problem. Skeptical that it could help w/confusion though.
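A rough sketch of the kind of date-limited corpus query described above. It assumes a corpus already reduced to (year, text) pairs and counts how often a term appears per 1,000 tokens in each year before a cutoff. The term "widgetcon", the documents, and the function name are all hypothetical placeholders, not data from the case. Frequency alone says nothing about whether a use is generic or source-identifying; a real analysis would still need someone to code each use in context.

```python
import re
from collections import Counter


def yearly_rate(docs, term, cutoff_year):
    """Occurrences of `term` per 1,000 tokens, by year, before `cutoff_year`.

    `docs` is an iterable of (year, text) pairs (a hypothetical corpus format).
    """
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    hits, tokens = Counter(), Counter()
    for year, text in docs:
        if year >= cutoff_year:
            continue  # ignore anything at or after the claimed first use
        hits[year] += len(pattern.findall(text))
        tokens[year] += len(text.split())
    return {y: 1000 * hits[y] / tokens[y] for y in sorted(tokens) if tokens[y]}


# Toy corpus, invented purely for illustration.
docs = [
    (1965, "the local widgetcon drew fans"),
    (1968, "another widgetcon and a third widgetcon"),
    (1975, "widgetcon widgetcon"),
]
rates = yearly_rate(docs, "widgetcon", cutoff_year=1970)
```

Only the pre-cutoff years come back, which is the point: the query can be pinned to the date that matters legally, something a survey run today cannot do.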

Brain scans for consumers? New article proposes judging similarity—he’s very skeptical both of cost and whether it tells us anything [we don’t already know]. What about ChatGPT? Skeptical of that too.

If courts rely too much on dictionaries, the corpus evidence might create a wedge b/t judge and dictionary.

Barton Beebe: How to integrate this into a larger rethinking of competition? At the origin (in technical TM times), confusion wasn’t the harm sought to be addressed; confusion was the evidence of the harm sought to be addressed, which was trespass to property rights. What would it mean if we said TM law is primarily concerned with promoting competition: what would a survey look like for infringement? [Unsurprisingly, I say it would have to consider materiality.] Would secondary meaning surveys change? Maybe not. But would consumer uncertainty play a role in our understanding of how preventing confusion promotes competition?

Earliest surveys (1921 Coca-Cola v. Chero-Cola) used as a matter of course a Q that included what we now call a Likert scale, measuring level of certainty. So did 1923 survey. Then we lost that, especially w/ face to face interviews. Surveys became trinary: yes/no/unsure—pushing people to a yes/no answer. Most people are not really sure! These surveys are now designed to suppress that uncertainty and not let it appear in the courtroom b/c we just want numbers.

Imagine 20% confusion in a survey, but every confused respondent was only somewhat sure, while the other 80% are not confused at all and are absolutely certain. If you present that nuanced distribution, maybe 20% isn’t enough b/c of the large number of correct consumers. We urge introducing that next level of detail.
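One way to make the 20%/80% hypothetical concrete is sketched below. The weighting scheme is an assumption made for illustration only: the notes do not say how certainty should be scored, so the weights (0.5 for "somewhat sure", 1.0 for "absolutely certain") and the names `Group`, `raw_confusion`, and `certainty_weighted_confusion` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Group:
    share: float      # fraction of all respondents in this group
    confused: bool    # did this group give the "confused" answer?
    certainty: float  # assumed weight in [0, 1] derived from the Likert answer


def raw_confusion(groups):
    """The usual headline number: share of respondents who were confused."""
    return sum(g.share for g in groups if g.confused)


def certainty_weighted_confusion(groups):
    """Share of total certainty-weighted responses that were confused."""
    total = sum(g.share * g.certainty for g in groups)
    confused = sum(g.share * g.certainty for g in groups if g.confused)
    return confused / total if total else 0.0


groups = [
    Group(share=0.20, confused=True, certainty=0.5),   # "somewhat sure"
    Group(share=0.80, confused=False, certainty=1.0),  # "absolutely certain"
]
raw = raw_confusion(groups)
weighted = certainty_weighted_confusion(groups)
```

Under these assumed weights the headline 20% drops to about 11% (0.10 / 0.90). A different weighting would give a different number, which is part of why any standard would have to specify the scale.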

Also, materiality: this is a way to consider materiality in the rubric of uncertainty.

How do we survey for arbitrariness, fancifulness, suggestiveness?

Discussant: Mark Lemley

Maybe corpus linguistics can help w/things like descriptiveness and nominative fair use, though skeptical about confusion or fame (b/c you need a standard). Corpora are attractive b/c the alternatives are so bad: Surveys are infinitely manipulable; dictionaries have an opposite goal to that of a statutory interpreter (to cover possible meanings instead of the correct one), and so they are abused when used in legal settings.

Rogers is a rule of evidence; its goal is to replace the cumbersome, problematic LOC multifactor analysis with a rule that irrebuttably presumes no confusion except in limited circumstances. [At least the 9th Cir. version.]

One possibility: what if we required or at least said a presumptively admissible survey was X—could be drafted into court rules. If we did that right, we could get rid of a lot of fighting back and forth and manipulation and problematic Qs asking about permission. We could standardize minimum percentages, affected by Likert scale. Could also give us an idea about what heightened confusion is—having a standard would allow us to actually heighten the standard in Rogers cases, which generally don’t actually do so.

Statutory presumption of secondary meaning after five years of exclusive use: it says literally that the Director “may” presume this. There’s a nice naïve literalism that would say the PTO can do this but courts have no power to do so: if the PTO hasn’t so presumed, courts are not in a position to do so.

Incontestability: maybe we ought to have reverse incontestability: if I’ve been around for a certain period of time, we presume that I’m not interfering with another TM owner’s rights. Goes beyond laches; would require killing Dawn Donut, which is an additional plus in his view.

We might think about a TM anti-SLAPP provision.

Fromer: Deciding what we’re going to measure + how do we use a new tool—new questions arise with each new type. Corpus linguistics: consider fame. Is fame a sufficient amount of use of a word in a particular context, or is it that the word is only used in that context? We’ve never asked that before! How do we know whether the use is shorthand for a class and what that means [what the Google court called synecdoche use]? Sometimes survey formats get standardized by default, so it has to be very thoughtful.

Grynberg: TM right now only has limited concern with mental availability for confusion; courts normally talk about confusion even though TM owners want to control availability of mark and surveys may be measuring that. On Likert scale, a person w/high recognition and low certainty may be reflecting a belief in what “belongs” to the TM owner.

Heymann: efficiency/cost trades off with accuracy in surveys. People are notoriously bad at reporting on their own mental processes and that’s what we’re asking them to do. [especially with “why do you say that?” where they just guess.] In UX, we don’t ask people how they’d navigate a website; we watch them do that. So we could construct a mock situation where we ask people to make choices and derive from that whether they are confused or not. Choices instead of thoughts.

Silbey: companies invest in those already!

Sprigman: could be a Squirt test with context.

Heymann: but actually ask them to perform the selection task, not just report on what they think they’d do. We’d have to think about whether hesitation counts.

RT: [Want to strongly endorse Fromer’s point: Measuring almost always changes what the construct is—you are answering a different Q than the non-surveyed or non-corpus analyzed question; the survey and the corpus themselves answer different questions (what do people say in response to direct Qs, sometimes after training, versus how do they talk when not thinking about it?). Maybe it’s a better Q, but that needs to be identified.

UX point: 1-800-Contacts: 10th Circuit says look at clickthrough rate to find maximum possible confusion—possibly there’s opportunity to focus more on this.
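The clickthrough logic is simple arithmetic: if only people who clicked the ad could have been confused by it, then clicks divided by impressions caps the confusion rate. A minimal sketch, with hypothetical numbers and a hypothetical function name (these are not figures from the opinion):

```python
def max_confusion_rate(clicks: int, impressions: int) -> float:
    """Upper bound on confusion, assuming only users who clicked could be confused."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    if not 0 <= clicks <= impressions:
        raise ValueError("clicks must be between 0 and impressions")
    return clicks / impressions


# Hypothetical ad campaign: 1,000 impressions, 15 clicks.
bound = max_confusion_rate(clicks=15, impressions=1000)
```

Here the bound is 1.5%. Actual confusion could be far lower, since many people click knowing exactly whose ad it is.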

Past work on how people walk around in a fog of confusion: Jacob Jacoby did this work in the 1980s—20% of people misunderstand any given factual claim in advertising! They walk around confused about sponsorship b/c of their priors!

How to survey arbitrariness/TM function: Washingmachine.com in booking.com—maybe along with not sure we also need the option “neither a common name nor a brand name.”

Lemley’s proposed standardization: (1) Burrell’s caution yesterday that they standardized badly, including about permission; (2) Teflon is a pretty standard format, but questions always arise: (a) original Teflon survey used STP as a well known brand name; wouldn’t do that now; (b) take a look at FOOTLONG surveys, both of which were Teflon surveys w/essentially the same Qs but different training and testing examples. 14 point swing in FOOTLONG results—which might even seem reassuring about the general tenor of the answers since even manipulation didn’t change a ton, but it was a lot.

McKenna: we have substantial standardization in Eveready, Teflon, Squirt (garbage, but much less common), but there’s enormous variability introduced by stimulus, control, universe, and those are extremely hard if not impossible to standardize. What has happened in TM is fetishization of confusion for its own sake: confusion is itself the harm. So one goal is to push against that construct: things in the context of survey evidence need to connect to marketplace behavior, and certainty/Likert scales can help with that.

There’s a strain of marketing literature that uses scanner data: tries to measure effect of brand extension on parent brand. Mostly there is no effect unless the products are used together; consumers just segment. Marketing studies: TM’s idea of related products is much more expansive than consumer behavior supports—things have to be really close together for consumers to care. Mental state is relevant, but it is relevant in conjunction with behavior.

Linford: if we’re right in our empirical work that supposedly tarnishing uses often have a burnishing effect, what are firms doing when they challenge those uses?

Sprigman: don’t assume firms are rational: they think it’s theirs!

Linford: firms do try to figure out what marketing decisions will work [but there is a difference between the expertise and methodology of the person who does that at P&G and that of the person who does that at Gucci—this is an example of high fashion possibly distorting our analysis of other things].

Ivorine tooth-whitening gum & Ivory soap—the court didn’t think that, as Ivory claimed, you could brush teeth with soap; court looked at survey responses and heavily weighted the “I don’t know” responses against Ivory.

If we’re looking for marketplace harm, that is looking for confusion—but in theory under current law we should be looking for the likelihood of marketplace harm—evidence of likely changed behavior. How would we do that?

Sheff: You could have rigorous standards about substitutes and cross-elasticity of demand that could lead you to the proper universe for a given survey. But one reason you think competition is good might be a deep suspicion of private economic power, so anything that adds friction to that is good. In that circumstance, neither universe nor stimuli selection might matter very much; even confusion might not matter very much. We could do things like market share analysis—but TM might not try to prevent copying others’ TMs in general.

Ramsey: standardize by requiring Qs to separate out different kinds of confusion?

McKenna: courts settled on a low percentage b/c this was about an injury to the TM owner, and the Q was what we should care about. Thinking about injury to TM owner as not the only Q could let us consider, as Sprigman argues, the benefit to the other consumers (85% even) who have benefited from the new option and would have to face loss of the information they got from D’s use.

Fromer: things change over time: marketing channels mattered more when the multifactor test was initially propounded; this makes standardization a challenge (what do we do with increasing brand self-parody?).

McGeveran: as antitrust swings back towards power containment and thinks holistically about efficiency in the whole market rather than whether a particular merger is efficient, that could change how you think about TM’s competition goal. The narrative about market concentration and power and the ways in which tech creates bottlenecks instead of long tails suggests productive TM inquiries: when we say “competition” in TM, it’s no longer clear what we’re talking about. New market entry opportunities? Competition on store shelves like with private labels? How does TM contribute to the shape of the whole market?

Bone: TM has always been about protecting sellers and avoiding fraud on the public; fraud reflects a moral wrong. Our reconfiguration has to take those into account.

Mid-Point Discussants: Eric Goldman

Antitrust is not a model of empirical evaluation at law, but empirical evidence does matter across consumer law—formation of TOS. Does a link to a privacy policy provide effective notice? One court says a reasonably prudent smartphone consumer knows that an underlined blue link is a hyperlink—that’s just made up; no survey evidence or consumer testimony, no literature. Chances are the judges are not themselves reasonable smartphone users. So much of what we talked about today are riffs on broader consumer law cases: what do we know and how do we know it? Emoji evidence: courts are just, again, making it up. A “good” emoji evidence case includes a statement from law enforcement saying “I’ve worked this beat and I know how they talk.” That’s the best evidence they use (and it’s bad!).

Chris Sprigman: need a way to account for consumer heterogeneity, which is almost always what we see under the hood. Different from saying that consumer belief is sovereign/a trump. Vintage Brand case: court asks whether consumer belief about sponsorship/licensing should be taken seriously or should be considered a legal conclusion about licensing. This isn’t wholly empirical.

Stabs at a theory of competition: start w/ preferences. We’ve been agnostic about them for a while, removing moralism from competition. One form of private differentiation is as good as another; we don’t question why consumers want something. But: Producers enjoy more surplus in differentiated markets; consumers enjoy less. The response is that consumer surplus lost in the form of price is gained in the form of differentiated preferences being satisfied more closely. But is that true of a manufactured-by-marketing preference? To say that in a room of economists is to risk being called an idiot, but that doesn’t mean it’s wrong. People’s valuation of attribution depends on the frame—they value it more if it’s an entitlement and less if they have to pay for it. This isn’t super hard, but the response from economists is “who cares?” We set a legal rule, create preferences, then satisfy them. We forget at the end of the day that the preferences only arose from the entitlements created by the legal rule. This is unsatisfying. The neo-Brandeisians say: the market has an organic existence, which is not about people being induced to have preferences, but manifesting them. That’s too innocent; people are convinced into preferences all the time—but a highly engineered preference might not be as important to satisfy. If we were to use that model of competition in TM and designing surveys, we’d look for what would cause consumers to substitute—focus on source primarily, and we’d need some sort of materiality requirement either as an element of claim or a belief about source strong enough to undergird a choice in the market.

Need more coordination about surveys, outside the context of particular cases. Judges and survey experts need to be able to interact in ways that make accepted surveys more credible.

TM surveys should be more incentive-compatible—in © he has done incentive-compatible experiments where people buy & sell things—these are more reliable b/c people don’t tell you what they think but show you what they do—their incentives and their behavior are compatible. Correct answers should benefit participants more than incorrect ones, which simulates the real world better. Even if it’s getting the right chewing gum, something small is at stake in the real world.

Robert Burrell

Looking behind curtain of preference satisfaction: it’s very difficult to convince courts to do that. One possibility: makers of Nurofen pain range were fined AU$10 million for having misled consumers by selling the same product as different products for, e.g., period pain with huge price differences. But Nurofen still has a massive price premium over generic ibuprofen: is that ok? We need other examples to move the dial.

Double identity as a solution to evidentiary challenges: it is in Art. 16 of TRIPS! He’s a fan of it for goods cases—seems to cause relatively little harm and acts as an escape valve for the property intuition: it defines what the “property” is. It doesn’t work so well for advertising! Advertising can convey all sorts of nonconfusing messages, so we need robust defensive doctrines to prevent undue harm in advertising, but there are ways to design that.

Evidentiary problems exist everywhere, though they vary across jurisdictions. When we look at what’s actually happening, we see courts getting really uncomfortable w/certain evidence but we don’t use that to ask more general questions about the quality of evidence. Particularly in the UK and Singapore, acquired distinctiveness is a reliance standard for at least some forms of TMs (trade dress)—don’t provide evidence of use, show me that consumers rely on this to make decisions. The move to failure to function and away from acquired distinctiveness might be a similar rule of evidence—we don’t trust the usual proxies. But if we don’t trust them in trade dress, why do we ever trust them?

In Australia, occasional examples in which proxies for secondary meaning are regarded as unreliable—e.g., consumers had to buy a thing, as when everyone had to buy a set-top box to transition from analog to digital. Massive sales aren’t enough to prove consumers care about manufacturer—similarly with hand sanitizers and masks. But we don’t ask similar questions about shoes, soap, rice, even though those are staples in the modern world.

Evidence of actual confusion: sometimes ignored b/c wrong sort of person is giving answer or answer is so manifestly wrong. Anyone who thought Nike would make toilet cleaner had an extreme and fanciful reaction, according to SCt of Australia. Anyone who thought McDonald’s would make wine was operating under an erroneous assumption.

Maybe consumers aren’t good at explaining their responses not b/c of problems in articulation but b/c that’s not how they make decisions! Might be worth looking comparatively—he previously thought German TM law loved surveys, and that’s true for distinctiveness in general, but not in likely confusion cases where they are verboten. Lack of trust plus commitment to normative view of likely confusion. But the effect (answering Lemley Q) was to broaden rights.

What about marketing experts? Do they have specialized knowledge, or are they charlatans? Australia has debated that as well. Ongoing question of admissibility of Wayback Machine evidence. Every now & then, more in the UK than Australia, courts just assert things about how people shop. And damages are often just asserted. At quotidian level, there’s just a plague of fake evidence—e.g., invoices. In light of that, a narrow property rule makes more sense; otherwise you should have to show real harm.

Silbey: to demand TM have a theory of competition is to displace the consumer as sovereign in terms of what evidence matters. We’re not measuring systems; these are private rights of action.

McKenna: but we structure private rights of action towards systematic goals all the time—what counts as secondary meaning, w/in TM.

Silbey: if it’s not a market for lemons story, it’s a consumer sovereignty story. All these tests arise out of and are reified at a time when the story about what’s optimal in a market is a view of advertising as beneficial/creating the best kind of consumer market.

Sheff: evidentiary rules depend on the theoretical structure. Is a deceptive used car dealer competing fairly? Why does it matter that he be honest? The problem is the seller has information the buyer doesn’t—that’s a form of power that we think ought not to exist in consumer markets. Why do we think that? It’s not b/c it’s inefficient to allow asymmetric info to be used, but b/c we have a thicker theory of autonomy underlying market structure. [McGeveran: or both!] Theories that are quite different can converge on certain aspects of regulatory structure.

But the interesting thing is how we choose when they diverge. If the problem of used cars is that falsities will cause markets to fail, then some critics said to Akerlof that if he was right there should be no market so he must not be right; the response is “this market is worse than it could be.” That’s a comparative claim. We could take care of the market for lemons if we forbade all ads and required plain packaging. [I don’t think that’s true.] Why would that be worse than the market we have now? We have some substantive notions of freedom and autonomy, including in commercial behavior, that shape our idea of the appropriate comparator market and what type of regulatory regimes are preferable which then informs what types of evidence count.

Lemley: There’s a perfectly good information story: bad information makes markets less efficient and decreases purchases. You don’t need a new theory of TM.

Sheff: by the same token, there’s an efficiency story about branding; if we only care about efficiency we’d have a lot more regulation of persuasive advertising than we do.

RT: (1) Anti-ESG movement/anti-DEI movement—think not just about progressivism but about challenges to preference formation from the right. What interventions into the market are justified because preferences are currently distorted? Who is distorting preferences or at least affecting formation of preferences in a way they have a vested right to do? If that’s the wrong framing, why is it the wrong framing?

(2) In response to FTC actions against “results not typical,” industry mounted defense of “results not typical” for weight loss products: consumers are buying hope—even small chance might be material depending on circumstances, which pushes back on the Likert scale?

(3) double identity and advertising: note that in the US comparisons for house brands are often made on the bottle or package!

(4) False advertising has similar theories to those discussed by Burrell in the US: misleading v. misunderstood; puffery and no reasonable consumer would believe that e.g. Froot Loops have fruit in them.  Also there are now two cases accepting the Nurofen-style deception theory in the US.

(5) periodic reminder that registration is not actually consumer focused. If we thought of the US as at least half registration based we would have to say that it’s at least half about competition. [Lemley says that registration might not be about competition either, but it’s at least closer.]

McKenna: the choice is not between satisfying preferences and not, but which preferences we will have and who will satisfy them. There is no neutral position. Economist move: it doesn’t matter where the preference comes from—that is actually a sneaky way of prioritizing certain preferences, since different preferences would have emerged under different rules. Pretending there aren’t rules around who got to shape those preferences is itself deceptive.

Dogan: economists are also rethinking things. What do we do about the fact that medical professionals buy generic drugs but uninformed consumers, often poor/resource-constrained, buy the brand names? So their behavior probably doesn’t match their preferences in some other world—but with what do we replace revealed preferences?

We also need to understand what’s happening within the 85% of unconfused consumers—it may not be a loss to them; it may mean nothing to them.

Gangjee: Registration system is focused not really on consumers or on competition—by and large it’s a different project [I describe it as businesses ordering their own relations]. There is a morality for that. Historically, there was a moral account. If our drive is toward a more normative account, it’s a hard thing to get right. First UK TM act: no registration for word-only marks at all—totally normative belief that words had multiple uses. Retreat from normative system over time. It was not competition, but availability to producers of words to use to speak—very interesting different vision, replaced by distinctiveness over time.

