June 30th, 2008
A lot has been written lately on how intelligent search will solve all kinds of problems. Most recently, in The End of Theory, Chris Anderson of “long tail” fame confuses the abundance of low-hanging fruit that “big search” and biotechnologies provide with the ability to really understand and extract meaning, and to pose and then falsify or support hypotheses. Mathew Ingram takes issue with the Wired article in Google and the end of everything, and Alistair Croll piles on in Does Big Search change science?, emphasizing the familiar scientific refrain: correlation does not imply causation.
To be fair to Chris, it seems that he does understand Mathew’s point that correlation is not causation; rather, his thesis seems to be that with sufficiently large datasets and powerful computational algorithms, correlation approaches causation. However, I side with Mathew and Alistair. Either Chris doesn’t understand what Google or rapid gene sequencing bring to scientific analysis, or he has written an excellent satirical article:
Petabytes allow us to say: “Correlation is enough.” We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot.
It sounds like we should be able to just sit back, feed the raw data into a massive cloud computer, grab a few coffees, live a few lifetimes and get some answers (Deep Thought, anyone?). As the search technology gets smarter, we can all afford to get a lot stupider, since we are no longer required to solve scientific problems ourselves.
In actuality, Google’s PageRank algorithm(s) and Craig Venter’s shotgun DNA sequencing techniques are successful because they are overly simplistic, designed to capture low-hanging fruit as quickly as possible. They don’t solve the hard problems; rather, they get us faster down a road that leads to more questions. Those questions are likely too complicated for either search engines or cute biotech tricks to answer. They require experiments and analyses that are too intricate and error-sensitive, that need to be hand-held, coaxed and cajoled. Science in the real world is very different from the platonic model that is taught in schoolbooks. Failure is important, errors are crucial, and we progress because human thought is remarkably adaptable and resilient in the face of both. Contrast this with the types of problems we will get when our analysis is guided by bug-ridden computer algorithms, infested with worms, and the data is riddled with errors and spam.
Until the computing power, and the algorithms which guide it, are truly evolutionarily designed, I don’t think science will learn much from the computer. When we do get the kind of AI that Chris and the Google founders are looking for, I suspect that they will find it impossible to clock that type of artificial intelligence at gigahertz speeds, and that we may end up re-evolving a computer that looks and acts very much like the human brain. At that point we may regret not having used the ones we already have.
For the next stop on this train of thought, read the excellent article Is Google Making Us Stupid? I’ve got one foot in the YES camp.
Addendum: the Wired article bothered me as an epitome of reductionist scientific thought. Reductionism by nature tends to focus on the simple problems. Hard problems, which are complex and expensive to tackle, are avoided, and this in turn amplifies reductionist techniques and causes. Sooner or later you might be convinced that all knowledge is within the reach of such reductionist approaches. There is a disturbing, related trend: industry funding of scientific research further skews science by leaving problems without obvious economic payoffs by the wayside. I would suggest that both industrial and reductionist science are represented in the Wired hypothesis.