Moderator: Pamela Samuelson, Berkeley Law School
From Notice-and-Takedown to Content Licensing and Filtering:
How the Absence of
UGC Monetization Rules Impacts Fundamental Rights
João Quintais, University of Amsterdam with Martin
Senftleben, University of Amsterdam
Human rights impact of the new rules. When we say notice and
takedown is no longer extant in EU, that’s not entirely true—DSA has it. We
look at safe harbors not from an industry perspective—allowing them to develop
services regardless of what users upload—but from a user human rights
perspective—we allow users to upload whatever they want and correct it later.
DSA has a © interface that says insofar as there’s specific © legislation it is
lex specialis and prevails over DSA. Thus, the safe harbor comes to an end
w/r/t © content. Art. 17 instead controls for OCSSPs. Licensing and filtering replace the safe harbor.
That has a human rights impact. Risk of outsourcing human rights obligations to private entities, and concealing the excesses of these mechanisms by putting responsibility for correcting them in the hands of users. The regulator is hiding behind other parties.
W/r/t platforms like YouTube, requirement of not allowing upload
w/o licensing is a major intrusion on user freedom, leading to public demonstrations
against upload filters that can’t distinguish b/t piracy and parody. DSA says
you have to take care of human rights in proportionate manner in
moderation/filtering. But regulator doesn’t do the job itself. Guardians of UGC
galaxy are the platforms themselves. Regulator hides behind industry and says
we’ve solved the issue.
Reliance on industry is core of Art. 17—cooperation b/t
creative industry and platform industry. One industry sends info on works that
should be filtered, and the platforms have to include that info in their
filtering systems. But regulator says that this cooperation shouldn’t result in
preventing noninfringing content—but how realistic is this? They aren’t at the
table maximizing freedom of communication; they are at the table maximizing
profit. That’s ok as a goal but these actors are not intrinsically motivated to
do the job assigned to them.
Concealment strategy: Empirical studies show that users are
not very likely to bring complaints; system excesses will stay under the radar.
Legislator suggests a best-practice roundtable among big players. ECJ has spoken in the Poland decision; the court confirmed that this outsourcing/concealment strategy was ok, rubberstamping the approach. The Court
says the filtering systems have to distinguish b/t lawful and unlawful content,
but that’s an assumption and not a very realistic one. The incentive to filter
more than necessary is higher than the incentive to maximize freedom of
expression. The court says the filtering is only set in motion if rightsholders
send a notification, but again the incentives are to send lots of notices.
Audit system could be a solution, involvement of member
states could be a solution, but we have to discuss the issues openly.
Monetization is an area where the issues are particularly bad. YouTube, Instagram, and TikTok are the subjects; monetization here means the act of obtaining monetary benefit/revenue from UGC provided by the user. There are restrictions
in DSA on demonetization. Most discussion has been about upload filters, but YT’s
© transparency reports show that actually most of the action is not blocking
via filtering—it’s mostly monetization through Content ID. 98.9% of claims are
through Content ID, and 90% are monetized.
Art. 17 doesn’t say anything specific about monetization; it’s
all about filtering and staydown. Provisions for fair remuneration for authors
are very difficult to apply. Lots of actions are not covered by the regulation,
especially visibility measures and monetization. DSA has two clear provisions covering statements of reasons and in-platform complaint resolution/ADR, but most of what users can do is ex post.
Big platform practices change regularly. Content ID and
Rights Manager (Meta) are the big hitters; you can get other third party
solutions that intermediate b/t rightsholders and platforms like Audible Magic
and Pex. Eligibility for access to monetization in Content ID and Rights Manager: available only to large firms. If you take the Poland decision seriously, filtering should be confined to manifestly infringing content, but these systems are designed to allow parameters below the legal threshold.
On the monetization side, it’s billed as ex post licensing;
as a rule this is not available to small UGC creator, though there are
exceptions. There is a lack of transparency; left to private ordering.
Human rights deficits: Misappropriation of freedom of
expression spaces & encroachment on smaller creators’ fundamental right to ©.
If a use is permissible and does not require permission, larger rightsholders can nonetheless de facto appropriate, and be given rights over, content for which they have no legal claim. Creates the illusion that there's no expressive
harm b/c the content stays up, but there’s unjustified commercial exploitation.
UGC creator is often a © owner, with protection for that as a work. Only they
should logically be allowed to monetize. Interferes w/ UGC creator’s
fundamental right, but this is not a default option on the platform for
historical reasons, leaving them only to ex post remedies, which are weak
sauce.
Recommendations: bring these problems to light. Audit
reports should focus on these questions. Human rights safeguard clause: must
take care to protect parody, pastiche—if they pass the filter they should not
lead to monetization. Confining filtering to manifestly infringing UGC also
helps. German solution: collective licensing schemes w/ a nonwaivable remuneration right for UGC creators, not just industry players. Inclusion of creative
users in a redesign of a more balanced monetization system.
An Economic Model of Intermediary Liability
James Grimmelmann, Cornell Law School; Cornell Tech
Economic claims about effects of liability regimes are
common. Chilling effects; whether platforms do or don’t have sufficient
incentives to police content; claims that 230/DMCA don't adequately balance free
expression with safety concerns; etc. These are statements about incentives,
but primarily policy arguments, not empirically tested. There are some
empirical studies, but our project is to create an economic model to provide a
common framework to compare arguments. This can help create predictions and visualize
the effects of different rules. Mathematical model forces us to make explicit
the assumptions on which our views rest.
Start w/question of strict liability v. blanket immunity;
look at possible regimes; map out core elements of 512, DSA, and 230. Not going
to talk about policy responses to overmoderation/must-carry obligations or
content payment obligations.
Basic model: users submit discrete items of content. Each
item is either harmful or harmless. Platform chooses to host or take down. If
hosted, platform makes some money; society gets some benefits; if the content
is harmful, third party victims suffer harm. Key: platform does not know
w/certainty whether content is harmful, only the probability that it is. These are
immensely complicated functions but we will see what simplification can do.
[What happens if harmful content is profitable/more profitable than nonharmful
content? Thinking about FB’s claim that content that is close to the line gets
the most engagement.]
A rational moderator will set a threshold. Incorporates judgments
about acceptable risk of harm in light of likelihood of being bad v benefits of
content to platform and society.
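[A minimal sketch of the threshold logic, in my own notation and with toy numbers rather than anything from the paper:]

```python
# Toy sketch (my notation, not the authors' model): the platform only observes p,
# the probability that an item is harmful, and hosts iff its expected payoff is positive.

def hosts(p: float, benefit: float, expected_liability_if_harmful: float) -> bool:
    # Host iff benefit - p * expected_liability_if_harmful > 0,
    # i.e. iff p falls below the threshold benefit / expected_liability_if_harmful.
    return benefit - p * expected_liability_if_harmful > 0

print(hosts(p=0.1, benefit=1.0, expected_liability_if_harmful=5.0))  # True: below the 0.2 threshold
print(hosts(p=0.4, benefit=1.0, expected_liability_if_harmful=5.0))  # False: above it
```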
The optimal level of harmful content is not zero: false
positives and false negatives trade off. We tolerate bad content b/c it is not
sufficiently distinguishable from the good, valuable content. Users who post
and victims harmed may have lots of info about specific pieces—they know which
allegation of bribery is true and which is not—but platforms and regulators are
in positions of much greater uncertainty. Platform can’t pay investigators to
figure out who really took bribes.
Under immunity, platform will host content until it’s individually
unprofitable to do so (spam, junk, other waste of resources). This might result
in undermoderation—the platform's individual benefit is costly for society. But it's also possible that the platform might overmoderate, taking down content that's not profitable for it but was net beneficial for society. There is no way to know the answer in the abstract; depends on specifics of content at issue.
Focusing on the undermoderation case: one common law-and-econ response is strict liability. But the platform's private benefit is always less than the benefit to society, so strict liability will always overmoderate: it will remove content that would be unprofitable to the platform under strict liability but would be beneficial to society. This is Felix Wu's theory of collateral censorship: good content has external benefits and is not distinguishable from bad from the outside. If those two things are true, strict liability won't work b/c the platform internalizes all of the harm but not all of the benefit.
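[To make the collateral-censorship point concrete, a hedged toy comparison of my own, not the paper's actual parametrization: b is the platform's private benefit from hosting, s the external social benefit, h the harm to victims if the item is harmful, and p the probability that it is harmful.]

```python
# Illustrative only: hosting decisions under immunity, strict liability, and the social optimum.

def hosts_immunity(p, b, s, h):
    return b > 0                 # platform ignores harm and external benefit alike

def hosts_strict(p, b, s, h):
    return b - p * h > 0         # platform internalizes all expected harm, none of s

def socially_optimal(p, b, s, h):
    return b + s - p * h > 0     # what total welfare would prefer

# Collateral censorship: whenever b - p*h < 0 < b + s - p*h, strict liability
# takes down content that society would want hosted.
item = dict(p=0.3, b=1.0, s=5.0, h=10.0)
print(hosts_strict(**item), socially_optimal(**item))  # False True
```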
Other possible liability regimes: actual knowledge, when no
investigation is required—this allows costless distinctions between good and
bad. But does actual knowledge really mean actual knowledge or is it a
shorthand for probabilistic knowledge of some kind? Intuition is that notices
lower cost of investigation. Fly in the ointment: notices are signals conveying
information. But the signal need not be true. When investigations cost money,
many victims will send notice w/o full investigation. Turns out notices
collapse into strict liability—victims send notices for everything. Must be
costly to send false notices; 512(f) could have done this but courts gutted it.
DSA does a better job, with some teeth behind "don't send too many false notices." The trusted flagger system is another way to deal with it.
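[A small illustration of the signaling point, on assumptions of my own: how much a notice tells the platform depends on how selective notifiers are, and costless notices carry no information.]

```python
# Toy Bayesian sketch (my assumptions, not the paper's): posterior probability that
# noticed content is harmful, given how often harmful vs. harmless content gets noticed.

def posterior_harmful(prior, p_notice_if_harmful, p_notice_if_harmless):
    num = p_notice_if_harmful * prior
    return num / (num + p_notice_if_harmless * (1 - prior))

print(posterior_harmful(0.10, 1.0, 1.0))    # 0.10: free notices mean everything gets noticed, so no update
print(posterior_harmful(0.10, 0.9, 0.05))   # ~0.67: costly or penalized notices are informative signals
```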
Negligence is another regime—more likely than not to be
harmful. Red flag notice under DMCA. Gives us a threshold for platform action.
Conditional immunity is different: based on total harm caused by the platform—if
too much, platform is liable for everything. This is how the repeat infringer
provisions of 512 work: if you fail to comply, you lose your safe harbor
entirely. These can be subtly different. Regulator has to set threshold
correctly: a total harm requirement requires more knowledge of the shape of the
curve b/c that affects the total harm; the discontinuity at the edge is also
different.
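[A toy way to see the difference, in my own simplification: negligence is a per-item threshold, while conditional immunity is an all-or-nothing cliff keyed to aggregate harm.]

```python
# Illustrative only; each item is (p, h) = (probability it is harmful, harm if it is).

def negligence_liability(items, threshold=0.5):
    # Liable item by item, only for hosted items "more likely than not" harmful.
    return sum(p * h for p, h in items if p > threshold)

def conditional_immunity_liability(items, harm_cap):
    # If total expected harm crosses the regulator's cap, the safe harbor is lost
    # for everything (cf. the 512 repeat-infringer analogy in the talk).
    total = sum(p * h for p, h in items)
    return total if total > harm_cap else 0.0

items = [(0.2, 10.0), (0.6, 5.0), (0.9, 8.0)]
print(negligence_liability(items))                 # 10.2: only the two items above the 0.5 threshold
print(conditional_immunity_liability(items, 8.0))  # 12.2: the cap is crossed, so liability attaches to everything
```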
512 is a mix: it has a negligence provision; a financial
benefit provision—if it makes high profits from highly likely to be bad
content; repeat infringers. DSA has both actual knowledge and negligence
regimes. Art. 23 requires suspension of users who provide manifestly illegal
content, but only as a freestanding obligation—they don’t lose the safe harbor
for doing it insufficiently; they simply pay a fine. 230 is immunity across the
board, but every possible regime has been proposed. It is not always clear that
authors know they propose different things than each other—negligence and
conditional immunity look very similar if you aren’t paying attention to
details.
Although this is simplified, it makes effects of liability
rules very obvious. Content moderation is all about threshold setting.
Interventions
Rebecca Tushnet, Harvard Law School
Three Sizes Fit Some: Why Content Regulation Needs Test Suites
Despite the tiers of regulation in the DSA, and very much in
Art. 17, it’s evident that the broad contours of the new rules were written
with insufficient attention to variation, using YouTube and Facebook as
shorthand for “the internet” in full. I will discuss three examples of how that
is likely to be bad for a thriving online ecosystem and offer a
suggestion.
The first issue is the smallest but reveals the underlying
complexity of the problems of regulation. As Martin Husovec has written in The
DSA’s Scope Briefly Explained,
https://ssrn.com/sol3/papers.cfm?abstract_id=4365029,
Placement in some of the tiers is defined by reference to
monthly active users of the service, which explicitly extends beyond registered
users to recipients who have “engaged” with an online platform “by either
requesting the online platform to host information or being exposed to
information hosted by the online platform and disseminated through its online
interface.” Art. 3(p). While Recital 77 clarifies that multi-device use by the
same person should not count as multiple users, that leaves many other
measurement questions unsettled, and Husovec concludes that “The use of proxies
(e.g., the average number of devices per person) to calculate the final number
of unique users is thus unavoidable. Whatever the final number, it always
remains to be only a better or worse approximation of the real user base.” And
yet, as he writes, “Article 24(2) demands a number.” This obligation applies to
every service because it determines which bucket, including the small and micro
enterprise bucket, a service falls into.
This demand is itself based on assumptions about how online
services monitor their users that are simply not uniformly true, especially in
the nonprofit or public interest sector. It seems evident—though not specified
by the law—that a polity that passed the GDPR would not want services to engage
in tracking just to comply with the requirement to generate a number. As
DuckDuckGo pointed out, by design, it doesn’t “track users, create unique
cookies, or have the ability to create a search or browsing history for any
individual.” So, to approximate compliance, it used survey data to generate the
average number of searches conducted by users—despite basic underlying
uncertainties about whether surveys could ever be representative of a service
of this type—and applied it to an estimate of the total number of searches
conducted from the EU. This doesn’t seem like a bad guess, but it’s a pretty
significant amount of guessing.
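[The arithmetic DuckDuckGo describes is essentially one division; a sketch with invented numbers, since the real inputs aren't public at this level of detail:]

```python
# Hypothetical proxy arithmetic for the DSA Art. 24(2) number; every figure here is made up.
eu_searches_per_month = 1_000_000_000      # assumed total searches originating in the EU
avg_searches_per_user = 30                 # assumed survey-derived monthly average per person

estimated_mau = eu_searches_per_month / avg_searches_per_user
print(f"{estimated_mau:,.0f} monthly active users")   # roughly 33,333,333, a guess built on guesses
```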
Likewise, Wikipedia assumed that the average EU visitor used
more than one device, but estimated devices per person based on global values
for 2018, rather than for 2023 or for Europe specifically. Perhaps one reason
Wikipedia overestimated was that it was obviously going to be regulated no
matter what, so the benefits of reporting big numbers outweighed the costs of
doing so, as well as the stated reason that there was “uncertainty regarding
the impact of Internet-connected devices that cannot be used with our projects
(e.g. some IoT devices), or device sharing (e.g. within households or
libraries).” But it reserved the right to use different, less conservative
assumptions in the future. In addition, Wikipedia noted uncertainty about
what qualified as a “service” or “platform” with respect to what it did—is
English Wikipedia a different service or platform for DSA purposes than Spanish
Wikipedia? That question obviously has profound implications for some services.
And Wikipedia likewise reserved the right to argue that the services should be
treated separately, though it’s still not clear whether that would make a
difference if none of Wikipedia’s projects qualify as micro or small
enterprises.
The nonprofit I work with, the Organization for
Transformative Works (“OTW”) was established in 2007 to protect and defend fans
and fanworks from commercial exploitation and legal challenge. Our members make
and share works commenting on and transforming existing works, adding new
meaning and insights—from reworking a film from the perspective of the
“villain,” to using storytelling to explore racial dynamics in media, to
retelling the story as if a woman, instead of a man, were the hero. The OTW’s
nonprofit, volunteer-operated website hosting transformative, noncommercial
works, the Archive of Our Own, as of late 2022 had over 4.7 million registered
users, hosted over 9.3 million unique works, and received approximately two
billion page views per month—on a budget of well under a million dollars. Like
DuckDuckGo, we don’t collect anything like the kind of information that the DSA
assumes we have at hand, even for registered users (which, again, are not the
appropriate group for counting users for DSA purposes). The DSA is written with
the assumption that platforms will be extensively tracking users; if that isn’t
true, because a service isn’t trying to monetize them or incentivize them to
stay on the site, it’s not clear what regulatory purpose is served by imposing
many DSA obligations on that site. The dynamics that led to the bad behavior
targeted by the DSA can generally be traced to the profit motive and to
particular choices about how to monetize engagement. Although DuckDuckGo does
try to make money, it doesn’t do so in the kinds of ways that make platforms
seem different from ordinary publishers. Likewise, as a nonprofit, the Archive
of Our Own doesn’t try to make itself sticky for users or advertisers even
though it has registered accounts.
Our tracking can tell us how many page views or requests
we're getting a minute and how many of our page views come from which browsers,
since those things can affect site performance. We can also get information on
which sorts of pages or areas of the code see the most use, which we can use to
figure out where to put our energy when optimizing the code/fixing bugs. But we
can’t match that up to internal information about user behavior. We don’t even
track when a logged in account is using the site—we just record the date of
every initial login, and even if we could track average logins per month, a
login can cover many, many visits across months. The users who talk to us
regularly say they use the site multiple times a day; we could divide the
number of visits from the EU by some number in order to gesture at a number of
monthly average users, but that number is only a rough estimate of the proper
order of magnitude. Our struggles are perhaps extreme but they are clearly not
unique in platform metrics, even though counting average users must have
sounded simple to policymakers. Perhaps the drafters didn’t worry too much
because they wanted to impose heavy obligations on almost everyone, but it
seems odd to have important regulatory classes without a reliable way to tell
who’s in which one.
These challenges in even initially sorting platforms into
DSA categories illustrate why regulation often generates more
regulation—Husovec suggests that, “[g]oing forward, the companies should
publish actual numbers, not just statements of being above or below the 45
million user threshold, and also their actual methodology.” But even that, as
Wikipedia and DuckDuckGo’s experiences show, would not necessarily be very
illuminating. And the key question would remain: why is this important? What
are we afraid of DuckDuckGo doing and is it even capable of doing those things
if it doesn’t collect this information? Imaginary metrics lead to imaginary
results—Husovec objects to porn sites saying they have low MAUs, but if you
choose a metric that doesn’t have an actual definition it’s unsurprising that
the results are manipulable.
My second example of one size fits some design draws on the
work of LLM student Philip Schreurs in his paper, Differentiating Due Process
In Content Moderation: Along with requiring hosting services to accompany each
content moderation action affecting individual recipients of the service with
statements of reasons (Art. 17), platforms that aren’t micro or small
enterprises have due process obligations, not just for account suspension or
removal, but for acts that demonetize or downgrade any specific piece of
content.
Article 20 DSA requires online platform service providers to
provide recipients of their services with access to an effective internal
complaint-handling system; although there's no notification requirement before acting against high-volume commercial spam, platforms still have to provide redress systems even for such spam. Platforms' decisions on
complaints can’t be based solely on automated means.
Article 21 DSA allows users affected by a platform decision
to select any certified out-of-court dispute settlement body to resolve
disputes relating to those decisions. Platforms must bear all the fees charged
by the out-of-court dispute settlement body if the latter decides the dispute
in favor of the user, while the user does not have to reimburse any of the
platforms’ fees or expenses if they lose, unless the user manifestly acted in
bad faith. Nor are there other constraints on bad-faith notification, since
Article 23 prescribes a specific method to address the problem of repeat
offenders who submit manifestly unfounded notices: a temporary suspension after
a prior warning explaining the reasons for the suspension. The platform must provide the notifier with
the possibilities for redress identified in the DSA. Although platforms may
“establish stricter measures in case of manifestly illegal content related to
serious crimes,” they still have to provide these procedural rights.
This means that due process requirements are the same for removing a one-word comment as for removing a one-hour video, and for removing a politician's entire account as for downranking a single post by a private figure that uses a slur. Schreurs suggests that the process due should instead
be more flexible, depending on the user, violation, remedy, and type of
platform.
The existing inflexibility is a problem because every anti-abuse measure is also a mechanism of abuse. There seem already to be
significant demographic differences in who appeals a moderation decision, and
this opens up the possibility of use of the system to harass other users and
burden platforms, discouraging them from moderating lawful but awful content,
by filing notices and appealing the denial of notices despite the supposed
limits on bad faith. Even with legitimate complaints about removals, there will
be variances in who feels entitled to contest the decision and who can afford
to pay the initial fee and wait to be reimbursed. That will not be universally
or equitably available. The system can easily be weaponized by online
misogynists who already coordinate attempts to get content from sex-positive
feminists removed or demonetized. We’ve already seen someone willing to spend
$44 billion to get the moderation he wants, and although that’s an outlier there
is a spectrum of willingness to use procedural mechanisms including to harass.
One result is that providers’ incentives may well be to cut
back on moderation of lawful but awful content, the expenses of which can be
avoided by not prohibiting it in the terms of service or not identifying
violations, in favor of putatively illegal content. But forcing providers to
focus on decisions about, for example, what claims about politicians are false
and which are merely rhetorical political speech may prove unsatisfactory; the
difficulty of those decisions suggests that increased focus may not help
without a full-on judicial apparatus.
Relatedly, the expansiveness of DSA remedies may water down
their realistic availability in practice—reviewers or dispute resolution
providers may sit in front of computers all day, technically giving human
review to automated violation detection but in practice just agreeing that the
computer found what it found, thus allowing the human to complete thousands of
reviews per day, as ProPublica has found with respect to human doctor review of
insurance denials at certain US insurance companies.
And, of course, the usual anticompetitive problems of
mandating one size fits all due process are present: full due process for every
moderation decision benefits larger companies and hinders new market entrants.
Such a system may also encourage designs that steer users away from
complaining, like BeReal’s intense focus on selfies or Tiktok’s continuous flow
system that emphasizes showing users more like what they’ve already seen and
liked—if someone is reporting large amounts of content, perhaps they should
just not be shown that kind of content any more. The existing provisions for
excluding services that are only ancillary to some other kind of product—like
comments sections on newspaper sites, for example—are partial at best, since it
will often be unclear what regulators will consider to be merely ancillary. And
the exclusion for ancillary services enhances, rather than limits, the problem
of design incentives: it will be much easier to launch a new Netflix competitor
than a new Facebook competitor as a result.
©-specific rules are not unique: subject to the same problem of legislating for YouTube as if YouTube were the internet. Assumes that all OCSSPs are subject to the same risks. But Ravelry—a site focused on the fiber
arts—is not YouTube. Cost benefit analysis is very different for a site that is
for uploading patterns and pictures of knitting projects than for a site that
is not subject-specific. Negotiating with photographers for licensing is very
different than negotiating with the music labels, but the framework assumes
that the licensing bodies will be functioning pretty much the same no matter
what type of work is involved. Sites like the Archive of Our Own receive very
few valid © claims per work uploaded, per time period, per any metric you want
to consider, and so the relative burden of requiring YouTube-like licensing is
both higher and less justified. My understanding is that the framework may be
flexible enough to allow a service to decide that it doesn’t have enough of a
problem with a particular kind of content to require licensing negotiations,
but only if the authorities agree that the service is a “good guy.” And it’s
worth noting, since both Ravelry and the Archive of Our Own are heavily used by
women and nonbinary people, that the concept of a “good guy” is likely both
gendered and racially coded, which makes me worry about its application.
Suggestion: Proportionality is much harder to achieve than
just saying “we are regulating more than Google, and we will make special
provisions for startups.” To an American like me, the claim that the DSA has
lots of checks and balances seems in tension with the claim yesterday that the
DSA looks for good guys and bad guys—a system that works only if you have very
high trust that the definitions of same will be shared.
Regulators who are concerned with targeting specific
behaviors, rather than just decreasing the number of online services, should
make extensive use of test suites. Daphne Keller of Stanford and Mike Masnick of Techdirt proposed this two years ago. Because regulators write
with the giant names they know in mind, they tend to assume that all services
have those same features and problems—they just add TikTok to their
consideration set along with Google and Facebook. But Ravelry has very
different problems than Facebook or even Reddit. Wikipedia was big enough to make it into the DSA discussions, but the platforms burdened most are those that haven't built the automated systems the DSA essentially requires; they are now required to do things that Facebook and Google weren't able to do until they were much, much bigger.
A few examples of services that many people use but not in
the same way they use Facebook or Google, whose design wasn’t obviously
considered: Zoom, Shopify, Patreon, Reddit, Yelp, Substack, Stack Overflow,
Bumble, Ravelry, Bandcamp, LibraryThing, Archive of Our Own, Etsy.
The more complex the regulation, the more regulatory
interactions need to be managed. Thinking about fifty or so different models,
and considering how and indeed whether they should be part of this regulatory
system, could have substantially improved the DSA. Not all process should be
the same just like not all websites should be the same, unless we want our only
options to be Meta and YouTube.
Q: Another factor: how do we define harm and who defines it—that’s
a key that’s being left out. Someone stealing formula from Walgreens is harmful
but wage theft isn’t perceived as the same harm.
Grimmelmann: Agree entirely. Model takes as a given that regulator
has a definition of harm, and that’s actually hugely significant and contested.
Distribution of harms is also very important—who realizes harm and under what
conditions.
Q: Monetization on YT: for 6-7 years, there’s been
monetization during dispute. If rightsholder claim seems wrong to user, content
stays monetized until dispute is resolved. We might still be concerned over claims
that should never have been made in the first place. YT has a policy about
manual claims made in Content ID. Automatic matching doesn't make de minimis claims; YT changed its policy for manual claims so that claimants had to designate the start and stop of the claimed content, and that designation had to be at least 10 seconds long. A rightsholder who believes that 9 seconds is too much can submit a
takedown, but not monetize. Uploaders that sing cover songs have long been able
to share revenue w/© owners.
Quintais: the paper goes into more detail, but it’s not
clear that these policies are ok under new EU law. [Pastiche/parody is the
obvious problem since it tends to last more than 10 seconds.] Skeptical about
the monetization claims from YT; YT says there are almost no counterclaims. If
the system can’t recognize contextual uses, which are the ones that are
required by law to be carried/monetized by the uploader? A lot of monetization
claims are allegedly incorrect and not contested. Main incentive of YT is
pressure from rightsholders w/access to the system further facilitated by Art.
17.
Q: platforms do have incentives to care about fundamental
rights of users. We wouldn’t need a team at YT to evaluate claims if we just
took stuff down every time there was a claim. [You also wouldn’t have a service—your
incentives are to keep some stuff up, to be sure, but user demand creates
a gap as Grimmelmann’s paper suggests.]
Quintais: don’t fundamentally disagree, but Art. 17 leaves
you in a difficult position.
Hughes to Grimmelmann: Why assume that when harm goes up,
societal benefit goes down? Maybe as harm to individual goes up so does
societal benefit (e.g. nude pictures of celebrities).
A: disagrees w/example, but could profitably put a
consideration of that in model.