Sunday, December 09, 2007

Reputation economies, Panel III: Reputational Quality and Information Quality

Moderator: Laura Forlano

Urs Gasser: Quality is very hard to define, especially as we move from harder measures to softer ones (like Cyworld’s kindness, karma, and sexiness). Context makes this even more difficult – Amazon asks whether a review was helpful or not, but we might disagree on helpfulness. Calls for a taxonomy of informational quality, including syntactic (data), semantic (meaning), and pragmatic (effects). Then we can consider the full range of tools for dealing with quality. Possibilities: market-based approaches (pricing, incentives); social norms; platform design (using insights of social signaling research); substantive and procedural roles of law (in guaranteeing privacy or due process).

Information quality conflicts can’t be avoided. They can only be managed. Each regulatory approach has costs. A general limit is the contextual and subjective nature of human information processing and decisionmaking. Reputation is used by humans, which means the output is uncertain! Teens have known extra cognitive biases, but even adults have limits on what they can process, which makes reputation systems limited in what they can achieve.

Ashish Goel: He focuses on protecting social networks against anti-social behavior like spam, badmouthing, quid pro quo reviews, ballot stuffing by sockpuppets, and spurious comments to increase ranking on social sites like Yahoo! Answers. Anonymity and automation increase the scope of the problem. Given the large number of content generators, small merchants, and first-time transactions, reputation is a critical problem. It is computationally intractable to detect collusion via back-scratching, link farms, etc.

Then why do search engines work so well? Because their heuristics are not in the public domain. He found that ranking reversals (where a better result appears after a worse result) create an arbitrage opportunity for users, if users can profit from identifying the reversal. For Amazon, better recommendations mean more purchases, providing a way to attribute revenue to users; for Google, it’s a little harder to translate better rankings into revenue that could be shared with the users who improve them, but it could be done – and must be done, he argues, because startup search engines are promising to share the wealth.

Ranking mechanism: users would place positive and negative tokens on result entries, with limits on the net weight of tokens placed (so you couldn’t just keep giving positives). He has a detailed model that I won’t go into. It seems like an interesting idea, though I wonder if the switching costs for most people will overwhelm the potential rewards. People give up money potentially available to them through coupons, mail-in rebates, and other reward systems all the time; if we’re just talking about pennies – with the potential of negative rewards, even – how do you convince them to use it?
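
To make the token idea concrete, here is a toy sketch of a per-user ledger with a cap on net token weight. This is not Goel's model, which I didn't record; the class name, the cap value, and the rule that the absolute net weight must stay under the cap are all my own assumptions for illustration.

```python
class TokenLedger:
    """Tracks one user's tokens. The net-weight cap is an assumed rule, not Goel's."""

    def __init__(self, max_net_weight):
        self.max_net_weight = max_net_weight
        self.tokens = {}  # result id -> net token weight this user has placed

    def net_weight(self):
        return sum(self.tokens.values())

    def place(self, result_id, weight):
        """Place a positive or negative token; refuse it if the cap would be exceeded."""
        if abs(self.net_weight() + weight) > self.max_net_weight:
            return False
        self.tokens[result_id] = self.tokens.get(result_id, 0) + weight
        return True


ledger = TokenLedger(max_net_weight=2)   # hypothetical cap
results = [
    ledger.place("a", +1),   # accepted, net weight is now 1
    ledger.place("b", +1),   # accepted, net weight is now 2
    ledger.place("c", +1),   # refused, would push net weight to 3
    ledger.place("d", -1),   # accepted, net weight drops to 1
    ledger.place("c", +1),   # accepted now that a negative was placed
]
print(results)
```

Under this assumed rule, a user who wants to keep promoting entries eventually has to demote something, which is the "can't just keep giving positives" property.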

Darko Kirovski: We need to define reputation before we can have it! Example of new system enabled by reputation: person-to-person lending online. Reputation can be quantified: the system has 9.28% average return, with 1-2% closing fee, 0.5-1% annual fee. 2.7% or 5+% default rate (disagreement because the business is early and default rate is likely to go up). At 3% default rate and 1.5% fees, the return is 4.78% in the first year and 5.78% thereafter; the risk can be compared to bank CDs and bonds. There is money to be made in figuring out how to reduce the default rate, using reputation.
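
The return figures can be checked with a few lines of arithmetic. The sketch below assumes the 1.5% first-year fee is a one-time 1% closing fee plus a 0.5% annual fee, both within the quoted ranges; that split is my reading, since it is the one that reproduces both 4.78% and 5.78%.

```python
def net_return(gross_pct, default_pct, fee_pct):
    """Net annual return in percent: gross return minus defaults minus fees."""
    return round(gross_pct - default_pct - fee_pct, 2)


GROSS = 9.28        # average return quoted in the talk
DEFAULT = 3.0       # default rate used in the talk's example
CLOSING_FEE = 1.0   # assumption: one-time fee, low end of the quoted 1-2%
ANNUAL_FEE = 0.5    # assumption: recurring fee, low end of the quoted 0.5-1%

first_year = net_return(GROSS, DEFAULT, CLOSING_FEE + ANNUAL_FEE)
later_years = net_return(GROSS, DEFAULT, ANNUAL_FEE)
print(first_year, later_years)  # 4.78 5.78
```

Every point shaved off the default rate goes straight to the lender's return, which is why reputation is worth money here.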

How does fraud happen? It’s simple and it’s usually the seller. Generate a new identity, build reputation by committing sound transactions and fabricating others. Offer attractive merchandise at great low prices, then disappear.

Forlano: Can critical reviews on Amazon generate money for Amazon?

Goel: If they improve the recommendation engine. If you rate four books highly and one very low, you are still selling books for Amazon. (This reminds me of the debate in copyright theory over whether the ability to write critical reviews without the copyright owner’s consent actually increases the market for reviewed works. I think this is a just-so story. It wasn’t clear to me what Goel’s model shows about critical reviews on their own, or whether it compares to a model in which only positive reviews are shown.)

Question: How do you deal with the problem of subcommunities in a reward-for-rating system? I shouldn’t get penalized for hating Harry Potter.

Goel: It’s not a commentary on you as a person. The fact is, you might just not be a very good reviewer in the sense that your reviews don’t help Amazon make money.

Question: Will computers ever beat human judgment given the limits on the latter?

Kirovski: A computer can never predict whether fraud is going to happen. But the computer might be able to tell you that, given the fees this guy has paid on eBay so far, if you bid $500 he won’t be able to walk away with a profit (if he’s a fraudster), but if you bid $3000 he could. So a computer could at least tell you that, and then you could decide whether to take a greater risk. The computer could even tell you what the likelihood is, based on past transactions, that this particular transaction is fraudulent.
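
The check Kirovski describes can be sketched very simply. The fee total below is a made-up number, and comparing the bid directly against fees paid is my simplification; a real system would presumably account for other costs of building the reputation.

```python
def fraud_would_profit(bid, fees_paid_so_far):
    """True if a seller who took the money and vanished would come out ahead.

    Simplification: the only sunk cost considered is the fees already paid.
    """
    return bid > fees_paid_so_far


FEES_PAID = 1200.0  # hypothetical total fees this seller has paid so far

low_bid_risky = fraud_would_profit(500.0, FEES_PAID)
high_bid_risky = fraud_would_profit(3000.0, FEES_PAID)
print(low_bid_risky, high_bid_risky)  # False True
```

The output is not a prediction of fraud, only a flag that walking away would pay, which leaves the risk decision to the bidder.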

Mari Kuraishi: Runs GlobalGiving, which collects donations for various projects around the world. There are upstream reputation issues (picking projects), which they currently have under control with a closed system for picking and verifying projects. But once picked, how do you evaluate them within the system? She needs ratings to avoid the paradox of choice for donors; she needs the project leaders to build reputations that will be useful to them in their work. But how qualified are people to evaluate project leaders? Does a donor have useful information about the social circumstances of a project?

Current rankings depend about 40% on how well the project leaders report back, and about 60% on success in fundraising. It’s crude, but it’s what they can measure. Donors can also submit reviews of the project leaders’ reports.
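
A 40/60 weighting like this amounts to a simple weighted sum. The sketch below assumes both inputs have already been normalized to a 0-to-1 scale; the function name, the normalization, and the sample projects are hypothetical, not GlobalGiving's actual formula.

```python
def project_score(reporting, fundraising, w_reporting=0.4, w_fundraising=0.6):
    """Weighted ranking score from two inputs, each assumed to lie in [0, 1]."""
    for value in (reporting, fundraising):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must be normalized to the range 0..1")
    return w_reporting * reporting + w_fundraising * fundraising


# Hypothetical projects: (name, reporting quality, fundraising success)
projects = [("clinic", 1.0, 0.5), ("school", 0.5, 1.0), ("well", 0.2, 0.3)]
ranked = sorted(projects, key=lambda p: project_score(p[1], p[2]), reverse=True)
print([name for name, _, _ in ranked])  # ['school', 'clinic', 'well']
```

With these weights a strong fundraiser with mediocre reporting outranks a diligent reporter with mediocre fundraising, which shows how crude the measure is.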

There is a low response rate to the reports, in spite of above-industry opens and clickthroughs: 98% of reports were delivered to 1391 donors, 35% were opened, and 18% clicked through – 250 donors saw reports. So there may be implementation difficulties, or it may simply be unclear to donors what their relationship to the project is, which deters them from commenting.
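
As a sanity check on the funnel, the sketch below assumes the open and clickthrough percentages are both measured against the 1391 donors. That reading is my assumption, but it does reproduce the 250 figure.

```python
DONORS = 1391        # donors the reports were delivered to
OPEN_RATE = 0.35     # share who opened the report
CLICK_RATE = 0.18    # share who clicked through

opened = round(DONORS * OPEN_RATE)
clicked = round(DONORS * CLICK_RATE)
print(opened, clicked)  # 487 250
```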

Comments on the site range a lot from insightful to … not. What is feedback for? (1) Communal sharing: other people want for you what you want for yourself, as with Amazon ratings where people want to help you pick the right books. (2) Authority ranking: don’t mess with me. (3) Exchange: if you scratch my back, I’ll scratch yours. GlobalGiving doesn’t fit neatly into these. A donation is a public good, and donors are often separated by vast physical, cultural, even language differences from project leaders. People don’t know what to say because they don’t know why they’re saying it.

Experiments in ranking: use of upstream networking by having project leaders recommend other project leaders; pledging support required before having full rights on the site; downstream use of project leaders’ connections with each other, as with micro-elites.

Vipul Ved Prakash: Discussed difficulties of filtering spam: filters have to move fast, they have to have some self-correction, and they have to collaborate; spammers are collaborating against them, and spammers are geeks who’ve gone over to the dark side, so they are hard to fight. Prakash describes a trust-based reputation system, one built on email recipients who are known to make good decisions about what’s spam.
