The Wisdom of Crowds Redux

Posted by David King on Wednesday, November 2nd, 2011

Through the years, I’ve occasionally been confronted by someone who was skeptical about the ability of statistical analysis to estimate the likelihood of human actions: for example, the probability that a customer will respond to a promotion or purchase within a category. My stock answer has usually been that, while statistical models are imperfect, they can manage the complex interactions among many variables with greater precision than the human mind.

But technological advances are demonstrating that my pat answer might be less safe than it once was, as researchers figure out ways to combine the best of machine and human computation.

A recent and striking example was the decoding of an enzyme in HIV-related viruses, one that lets them reproduce. The problem had eluded traditional computing for almost three decades, because proteins can be structured in so many ways that finding the optimum configuration can surpass the combinatorial capabilities of even powerful computing platforms. The solution came in the form of a program called Foldit, developed by researchers at the University of Washington, which turns the problem into a game with simple rules that lets users turn and flip a 3D model of the enzyme. Using the software, a group of gamers was able to find a solution within three weeks. Researchers were so impressed that they even shared credit with some of the gamers in the published research.

Prediction markets have been another area of active development for over a decade. Such schemes set up a market to predict particular outcomes, such as the movements of stock indices, the performance of motion pictures, or professional sporting scores. Participants buy and sell virtual stocks that represent different outcomes, usually with a play currency, and the resulting prices can be read as the crowd's estimate of how likely each outcome is. The underlying principle has been around for hundreds or thousands of years in the form of wagering on horse races, but today’s technology enables the establishment of markets around many problems. As a result, prediction markets have been formed in many areas, and several companies have developed applications to aid in the design and management of virtual markets. Consensus Point is one company that is relatively well known in this field.
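To make the mechanics a little more concrete, here is a minimal sketch of how a play-money market can turn trades into probabilities. It assumes an automated market maker based on Hanson's logarithmic market scoring rule (LMSR), which is one common design and not necessarily what any of the companies mentioned here use. The function names and the liquidity parameter `b` are my own choices for illustration.

```python
import math

def lmsr_cost(shares, b=100.0):
    """LMSR cost function: C(q) = b * ln(sum_i exp(q_i / b)).

    `shares` holds the number of shares sold so far for each outcome;
    `b` is an assumed liquidity parameter (larger b = prices move more slowly).
    """
    return b * math.log(sum(math.exp(q / b) for q in shares))

def lmsr_prices(shares, b=100.0):
    """Current price of each outcome; the prices sum to 1 and can be
    read as the market's implied probabilities."""
    weights = [math.exp(q / b) for q in shares]
    total = sum(weights)
    return [w / total for w in weights]

def buy(shares, outcome, quantity, b=100.0):
    """Return the new share counts and what the trader pays in play currency."""
    new_shares = list(shares)
    new_shares[outcome] += quantity
    cost = lmsr_cost(new_shares, b) - lmsr_cost(shares, b)
    return new_shares, cost

# Two-outcome market (say, "candidate wins" vs. "candidate loses").
shares = [0.0, 0.0]
print("opening prices:", lmsr_prices(shares))

# A trader who thinks outcome 0 is underpriced buys 50 shares of it.
shares, cost = buy(shares, outcome=0, quantity=50)
print("cost of trade:", round(cost, 2))
print("prices after trade:", lmsr_prices(shares))
```

The point of the sketch is only that each purchase nudges the price of an outcome upward, so the price at any moment summarizes what traders collectively believe.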

[BTW: for an anecdotal example of the power of prediction markets, readers may want to revisit this article from October 13, discussing the relative strength of Herman Cain's candidacy in the polls versus what prediction markets were showing. The author, David Rothschild, attributed the markets' lower rating of Cain to the concern "about what Republican voters will learn" about the relatively unknown Cain. Two weeks later (as I write this), that analysis seems to have been accurate.]

Crowdsourcing is another area in which human knowledge can interact with computing to analyze complex problems. Crowdsourcing has been used in marketing and related areas for a number of years to help with such activities as product design. For example, Threadless.com famously uses customer input and voting to design t-shirts, while television shows, such as American Idol, have made viewer voting a central part of their appeal.

But there are more advanced applications for crowdsourcing, as well. One interesting project is the Aggregative Contingent Estimation (ACE) website, funded by the Intelligence Advanced Research Projects Activity. Launched on July 15, 2011, ForecastingAce examines various predictive problems by soliciting users to make estimates via surveys. Rather than simply counting votes or averaging estimates, ACE will attempt to apply weighting algorithms to arrive at more accurate predictions.
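I don't know which weighting algorithms ACE actually uses, but the general idea of going beyond a simple average can be sketched briefly. The example below assumes one plausible scheme: weight each forecaster by the inverse of his or her Brier score (mean squared error) on questions that have already been resolved. The forecaster names, the numbers, and the function names are all made up for illustration.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and 0/1 outcomes
    (lower is better)."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

def accuracy_weights(history, outcomes, eps=1e-6):
    """Weight each forecaster by 1 / Brier score, normalized to sum to 1.

    `eps` avoids division by zero for a forecaster with a perfect record.
    """
    raw = {name: 1.0 / (brier_score(past, outcomes) + eps)
           for name, past in history.items()}
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}

def weighted_forecast(current, weights):
    """Combine current probability estimates using the accuracy weights."""
    return sum(weights[name] * p for name, p in current.items())

# Hypothetical track records on three resolved yes/no questions.
outcomes = [1, 0, 1]
history = {
    "ann": [0.9, 0.2, 0.8],   # well calibrated
    "bob": [0.6, 0.5, 0.5],   # hedges everything
    "cy":  [0.3, 0.8, 0.4],   # usually wrong
}
# Their estimates for a new, unresolved question.
current = {"ann": 0.8, "bob": 0.5, "cy": 0.2}

weights = accuracy_weights(history, outcomes)
simple = sum(current.values()) / len(current)
print("weights:", weights)
print("simple average:", round(simple, 3))
print("weighted forecast:", round(weighted_forecast(current, weights), 3))
```

With these made-up numbers, the combined estimate leans toward the forecaster with the best track record instead of treating all three opinions as equally informative.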

Another start-up in this area is Crowdcast, which styles itself as the “leader in Enterprise Collective Intelligence.” Like ForecastingAce, Crowdcast’s early funding also came from IARPA and it similarly is experimenting with methods that allow information to be extracted from crowds efficiently and with algorithms that then turn the resulting information into intelligence. In the case of Crowdcast, the idea is to harness such knowledge to help solve everyday business problems, such as revenue forecasting and risk assessment.

Both of these business areas have well-established methodologies and a range of mature forecasting technologies, yet are subject to errors in prediction when there are emerging changes in underlying conditions (will there be another recession or not?), a lack of information dissemination, or undocumented but important conditions (e.g., are a large number of loans in a portfolio potentially fraudulent?). The knowledge that a group of people might have, or the intuition they can bring to bear, has the potential to bridge the gap between formal methods and human hunches.

In marketing, there are any number of useful applications. For example, one of the most difficult items to forecast is the sales of a new product. Again, there are well-established methodologies in this field, but my guess would be that asking a wide cross-section of employees to estimate sales would provide at the very least a useful input and perhaps even a more accurate prediction than traditional methods.

Another potential application would be to have an additional layer over other predictive or forecasting tools, a sort of governor that would tell us when to trust a particular model or not. For example, we know that if we have an incremental response model that was built during a period of “normal” buying behavior, it may become less useful if conditions change. If demand suddenly increases, then the model has less room to identify those customers who need stimulus in order to respond. [Think of a model that predicts which people in a crowd are most likely to head for the exit. What happens to that model's predictions if someone yells "fire!"?] Similarly, if underlying demand is strengthening, then perhaps I should rely less on the model as a targeting tool. Conversely, if demand is shrinking, the model will still likely perform well, but it may identify so few customers that my marketing program may no longer be viable. By using human-generated forecasts of demand, I could be more proactive, rather than having to wait for actual data to come in and reveal errors.
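As a rough sketch of what such a governor might look like, the snippet below compares the median of a set of human demand forecasts with the demand level that prevailed when the model was built, and flags the model when the gap exceeds a tolerance. Everything here is an assumption for illustration: the function names, the use of the median, and the 15% tolerance are placeholders, not a tested rule.

```python
from statistics import median

def demand_shift(crowd_forecasts, baseline_demand):
    """Relative gap between the crowd's median demand forecast and the
    demand level observed while the model was being built."""
    return (median(crowd_forecasts) - baseline_demand) / baseline_demand

def governor(crowd_forecasts, baseline_demand, tolerance=0.15):
    """Return a coarse signal about how far to trust the response model.

    `tolerance` is an assumed threshold; in practice it would have to be
    calibrated against how the model has degraded in past demand swings.
    """
    shift = demand_shift(crowd_forecasts, baseline_demand)
    if shift > tolerance:
        return "rising"    # rely less on the model as a targeting tool
    if shift < -tolerance:
        return "falling"   # check that the program is still viable
    return "stable"        # conditions resemble the build period

# Hypothetical: the model was built when weekly demand averaged 1,000 units,
# and five employees now forecast next quarter's weekly demand.
print(governor([1200, 1300, 1250, 1150, 1400], baseline_demand=1000))
print(governor([980, 1020, 1010, 950, 1050], baseline_demand=1000))
```

The median is used so that one wildly optimistic or pessimistic forecaster cannot trip the governor alone.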

I’m certain we’ll be seeing more innovative applications as we learn how to tap into “the wisdom of crowds,” as James Surowiecki called it in his 2004 book of the same name. In the meantime, I will temper my reply to skeptics.

N.B. We’ve certainly had statistical tools that have sought to simulate human thinking. Neural networks in their many forms are perhaps the best known and most widely applied, but Bayes’ theorem (and the methodologies that have sprung from it) seeks to account for and use the uncertainty inherent in models, much in the way that humans must deal with uncertainty. What makes the technologies I discuss in this article different is that, rather than merely simulating human thought patterns, they enlist humans as part of the system.
