American Association of Law Libraries Annual Meeting, Keynote

Jonathan Zittrain, Harvard

The future of the library (and how to stop it?). His image of the library: a fortress, protecting books against people who might mess them up. Today: a fallout shelter, a place you go to protect stuff against disaster, like emergency rations. Collections become archives. What percentage of a collection sees action over the course of a year? 5%, says an audience member. The other 95% is just in case; you could have a Long Tail room.

But books may not remain the primary medium, though we still have to care for them. More and more reading rooms look like computer clusters, and in many libraries (not necessarily law), they’re redefining themselves as a place to go to get on the internet, which resonates with access to information. People are more online naturally anyway—students have their own laptops. He worries that the library is becoming a piggy bank: a source of income for publishers who previously had to tolerate the first sale doctrine. The AALL site has a wonderful little monograph about first sale.

A PC is a generative device: you can put programs of your choice on it. That’s not as true in the tech environment we’re moving towards. First, move to cloud computing: this practice actually has an early roost in the library, as more and more publishers make their content available in a client/server relationship and the library does not have its own copies. Google Book Search: doesn’t allow you to copy the text of a snippet—capacity defined by the vendor, not by you.

Second, a configuration as good as the cloud: the Kindle. In wireless contact with the Mothership Amazon at nearly all times. (I just discovered, incidentally, that if you only have Kindle for the iPhone, you can’t download the files to your computer for backup as you can when you have a physical Kindle. Curses!) Thus, the recent Orwell incident where Amazon withdrew paid-for books from customers’ Kindles (best post title on this: Amazon deletes your books, has always been at war with Eastasia). You haven’t infringed copyright by possessing this copy—it’s not independently infringing to have an infringing copy, though the vendor has infringed the distribution right. Assume the vendor asked you to destroy this infringing book—very few people, especially librarians, would do so. Because the Kindle is tethered, though, it can get rid of the book.

This is only the beginning of the story. Defamation, obscenity/porn: there are many ways that a text can become contraband. What if government data is retroactively decreed sensitive? What if the government wants records of who’s looked at it? The protections for libraries are enshrined in law only in teeny tiny ways. DMCA §1201’s anticircumvention provisions say you can be sued for lockbreaking, except for an incredibly narrow exception applicable to libraries. Nobody in the audience, of course, had ever used this exception; this is an exception matched only in its ridiculousness by the performance rights exception in §110 for playing music at an annual horticultural fair.

So how do we value these items which are rarely used, kept for preservation, but are now being slipped into digital containers where library control is slipping? Turn to “library” as a verb. Library and librarians as sources of knowledge for other people: ask a librarian. But why ask a human when you can Ask Jeeves?

Huge progress in the past ten years: natural-language searching from Westlaw. But a little knowledge is a dangerous thing. People skip the Westlaw training and just put in their question. This creates a weird tension with patrons who walk in and start searching without knowing anything about search strategy. This tension is only going to get more acute. So much work can be done by disembodied distributed human minds, whether highly compensated, like InnoCentive, a company that looks for scientific solutions, or not. Zittrain discussed a distributed “call center” company that has people answer calls from home, and Amazon’s Mechanical Turk: tiny rewards, but seductive. Reference can be disaggregated into little atoms. You’re solving a larger problem, but will you know the context or be just a cog in the machine? SpinVox: takes voicemail and turns it into text. Turns out that they simply had humans in an overseas call center transcribe the message.

What about library as adjective? Something that makes a thing more or less a library. “Core purpose”: the essence of an institution. What’s the core purpose of .edu? Protecting IP? Some universities think so. U Texas general counsel in 2001 suggested language for a professor to use at the outset of a class to control students’ use of notes and information learned in the class: “My lectures are protected by state common law and federal copyright law …” There are efforts to influence kids in middle school to be favorable to IP protection; one recommendation is to have kids use © on their own coursework. Harvard says: students who sell lecture notes may be required to withdraw from the school. Is this what we want our educational institutions to be?

So, what is the core purpose of libraries? AALL has a mission/vision statement. Key terms: “central to society, fair and equitable, authentic, educate.” How well can these values stand in a new technological zone? What can we farm out to the world at large, to Yahoo! Answers etc.? Maybe not so much! When these are commercialized, there are obvious problems: people on Amazon’s Mechanical Turk are paid to write highly positive reviews for products on Amazon—not authentic and not fair. Spammers reward people for solving CAPTCHAs with porn. If there were a Nobel prize for evil, this would be a strong runner-up in the genius category.

The internet was built on noncommercial principles—sharing at its most basic. Every packet is its own adventure! It finds its way by sharing information. But that makes it vulnerable: one Pakistani ISP shut down YouTube by trying to censor it in Pakistan but doing it badly. This hijacking was corrected by voluntary organizations who gave instructions about reconfiguring the necessary routers. Bad news: your house is on fire and there’s no fire department. Good news: people will appear from nowhere, put out the fire, and disappear without payment or praise. This is a weird configuration that keeps the internet running. The Batsignal goes up and a nonofficial source comes and fixes your problem.

Wikipedia: an idea so profoundly inconceivable that even Jimmy Wales never had it; he wanted to write checks to smart people who’d write reference articles. Wiki: designed to be a place for editing, suggestions, preparation of articles for later expert review. But the wiki was the thing that worked. Now, wikis have their problems, but there are more people who visit the problem page and deal with reported problems than there are reported problems. Wikipedia is 30 minutes away from disaster, but there are people always on duty against that. But what if there were a particularly compelling Star Trek convention one weekend? Do they leave someone behind to revert vandalism?

What this means: Wikipedia editors are responsible for things like protecting the real name of the Star Wars Kid, based on the consensus that his real name shouldn’t be in the article. Here’s a question: assume you received a link to the Star Wars Kid video, and thought it was funny enough to forward it to your friends. But then you get an authenticated message from the Star Wars Kid asking you to avoid further distribution because the video humiliates him. If your anonymity is protected, will you honor this request? Zittrain thinks that most people will limit distribution. What if Dick Cheney asked you not to disseminate a document because it threatens national security? You make an ethical decision on the merits, not based simply on the fact of the request.

Distributed monitoring of things like censorship: there is an untapped desire to be helpful, to be part of something. Hitchhiking may be dead, but the Craigslist rideshare board is thriving. Is the theory “killers don’t plan ahead”? Unlikely. “Hitchhiker” has unpleasant connotations, but now we have a new context. We’ll try to develop systems to make it keep working even if people try to manipulate it. CouchSurfing: one guy’s idea of pairing up people going far away who’d like to sleep on a stranger’s couch for free with people far away who’d like strangers to sleep on their couches for free. Over a million happy couchsurfers so far. A system of volunteer ambassadors.

The Wayback Machine: who hired Brewster Kahle to do this? Nobody! He hired himself, based on an idea: the internet is changing all the time, and someone ought to be taking snapshots. Suppose a library had wanted to do this. It would have gone to the General Counsel, who most likely would have flipped out over the copyright implications. This is a directly accessible corpus, opt-out instead of opt-in. It survives by virtue of how compelling a resource it is. Plus, opt-out is instant; Zittrain believes that the archive doesn’t delete data but does make it inaccessible at request. Project Gutenberg: a guy just starts organizing volunteers to type in public-domain books.

The PACER petition: reasonable request for increased accessibility, produced by online organization. People should go sign this petition! If we don’t do it, dot-com will. Definitive information about a book shouldn’t be run by one single company. There is power of pooling research ability, and even in monopsony: the power of consumers to say that they are organizing and have agreed not to buy products with too much price discrimination or too many limits on what patrons can do.

Google Books: $100 million to scan. That’s not that much, compared to the bailout, right? What uses can be made of the “gold copies” that go back to contributing libraries? The settlement might shake out to allow libraries to make a public resource out of the library copies to counterbalance Google’s power.

Social component: most important help he’s gotten from a library has been face-to-face, someone who knows him and can call him out when he’s gotten something wrong. That relationship is most at risk when we turn our libraries into pneumatic tubes—queries go in, answers come out.

Problems: resource constraints. Risk aversion: worsened by the sense of stewardship—don’t want to risk preservation goals. The perfect should not be the enemy of the good. People are carefully working on perfecting metadata while some teenager invents a thing called and everyone else starts using it.

Q: In China, other values are being pushed in providing services. How do we think about that?

A: Wolfram Alpha: a positivist theory of knowledge; only wants data from curated sources. But what happens when someone asks “what’s the name of that island?” and the answer China wants is X Province while the answer everyone else gives is Taiwan? Are there two answers the search engine should provide, one in China and one elsewhere?

Q: Zittrain has said that the best stuff happens without profit or praise. Are those bad things?

A: No. Profit finds its own energy; we don’t have to worry about that other than to keep it in balance so it doesn’t crowd out or get rid of fairness and other values. He’s also interested in ways of giving praise/attribution wars. There may be plenty who don’t care about recognition, but he doesn’t consider recognition toxic. (Hmm, stated like that I’m now worried about recognition!)


