Monday, June 10, 2013

Math and Science

Mathematics is to science what ketchup is to food: it improves the taste of otherwise unpalatable dishes, but it drowns out the subtler flavors of everything else. This is particularly true of the social sciences, where the availability of cheap software for manipulating numerical data has fundamentally altered not only the direction of research but also what kinds of data are collected.

Since qualitative data are more difficult to process by software, their collection often takes a back seat to quantitative – or rather pseudo-quantitative – data collected by opinion surveys. They are pseudo-quantitative because they use numerical scales representing intensity (e.g. strongly agree, somewhat agree, neither agree nor disagree, etc.), yet they cannot be processed as “real” numbers.

For “real” numbers, such as 1, 2, 3, 4, etc., we can say that the difference between 1 and 2 is the same as that between 3 and 4, and that 4 is twice as big as 2. However, when those numbers are used as mere symbols representing multiple choices in an opinion survey, they cease to be “real” numbers. They could be replaced with letters a, b, c, d, etc., or even with pictograms representing the different choices cooked up by survey designers. The reason they are pictograms rather than “real” numbers is that we cannot say that the distance between choice a and choice b (e.g. strongly agree and moderately agree) is the same as that between b and c (moderately agree and neither agree nor disagree).
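
The point can be made concrete with a few lines of Python. The responses below are invented for illustration; the coding (1 = strongly disagree … 5 = strongly agree) is the conventional Likert coding, not anything from a particular survey.

```python
# Hypothetical Likert-scale responses coded 1-5 (1 = strongly disagree,
# 5 = strongly agree). The codes are ordinal labels, not interval numbers.
from collections import Counter
from statistics import mean

responses = [1, 1, 5, 5, 5, 3, 2, 4, 1, 5]

# Treating the codes as "real" numbers yields an arithmetic mean...
print(mean(responses))  # 3.2 -- suggests "slightly above neutral"

# ...but relabeling the same choices with letters, which is an equally
# valid encoding of ordinal data, exposes the problem:
labels = {1: "a", 2: "b", 3: "c", 4: "d", 5: "e"}
relabeled = [labels[r] for r in responses]
# mean(relabeled) would simply raise a TypeError: ordinal categories
# have an order, but no arithmetic.

# What ordinal data genuinely supports is a frequency count:
print(Counter(responses))  # Counter({5: 4, 1: 3, 3: 1, 2: 1, 4: 1})
```

The mean of 3.2 looks informative, yet it silently assumes that the step from “strongly disagree” to “disagree” equals the step from “neutral” to “agree” – precisely the assumption the ordinal scale does not license.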

Research shows that subjective perceptions of quantities differ from their numerical properties. For example, a 5 percent change in probability is perceived differently depending on the overall probability of an outcome (i.e. whether it is 10%, 50% or 90%). When it comes to opinions and perceptions, the level of subjectivity is even higher. For example, if I only “moderately agree” with an opinion on, say, capital punishment, it may not take much to persuade me to become an agnostic (neither agree nor disagree). However, if I have a strong feeling (strongly agree or strongly disagree), it typically takes much more to move me toward “moderate agreement/disagreement.”
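
One simple way to see why the same absolute change can loom so differently – an illustration of my own, not the psychological model the research itself uses – is to express the 5-point change relative to the base probability:

```python
# A 5-percentage-point change relative to three different base rates.
# This is an illustrative calculation, not a model of perception.
for base in (0.10, 0.50, 0.90):
    relative = 0.05 / base
    print(f"{base:.0%} -> {base + 0.05:.0%}: +{relative:.0%} relative change")
```

Moving from 10% to 15% is a 50% relative increase; moving from 90% to 95% is only about 6%. The absolute change is identical, but what it means to the person weighing the odds is not.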

Yet assigning numbers to these options creates the illusion that they represent numerical quantities. More conscientious researchers may refrain from treating them like “real” numbers and limit their analysis to reporting frequency counts, but the availability of cheap data-processing software makes such analysis look “pedestrian,” and pressure is applied to use more “advanced” techniques. I am speaking from experience here. Some time ago, an anonymous peer reviewer of a paper of mine, which used frequency-based contingency tables to show the distributions of opinions collected in a survey, called this technique “pedestrian” and suggested one based on regression. In other words: treat them as “real” numbers. This advice reminds me of the old joke about the economist stranded on a desert island who, lacking a can opener, simply assumed he had one.
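
For what it is worth, the “pedestrian” technique is a few lines of code. The sketch below builds a frequency-based contingency table from invented survey records (the groups and responses are hypothetical placeholders):

```python
# A minimal sketch of a frequency-based contingency table of opinion
# counts. The survey records below are invented for illustration.
from collections import Counter

# (group, response) pairs -- hypothetical survey records
records = [
    ("urban", "agree"), ("urban", "agree"), ("urban", "disagree"),
    ("rural", "disagree"), ("rural", "disagree"), ("rural", "agree"),
]

table = Counter(records)  # cell counts of the 2x2 contingency table

for group in ("urban", "rural"):
    row = {resp: table[(group, resp)] for resp in ("agree", "disagree")}
    print(group, row)
# urban {'agree': 2, 'disagree': 1}
# rural {'agree': 1, 'disagree': 2}
```

Nothing here pretends the responses are numbers; the table reports exactly what was observed, which is the whole point.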

The problem is not limited to assumptions about the quantitative properties of the data; it extends to the kind of research that gains dominance in the social sciences with the advent of cheap computational tools. This new research paradigm favors questions that can be answered with numerical or quasi-numerical data, because such data are easy to collect and process. Hence the proliferation of opinion surveys. The idiocy of this approach lies not only in the misinterpretation of numerical data but, more importantly, in the intellectual laziness it fosters. Researchers abandon the difficult intellectual task of trying to understand how people think, and under what conditions, in favor of giving them simplistic multiple-choice tests built from pre-fabricated opinion statements, because such tests are easy to score and process by computer. If this is not the proverbial drunkard’s search, I do not know what is.

Another implication of this observation is that science, or at least social science, does not progress through the systematic testing of theories, as Karl Popper believed, but rather through movements between what Imre Lakatos called “scientific research programmes.” The purpose of a scientific research programme is not theory testing but “problem shift” – that is, the construction of auxiliary hypotheses that render contradictory evidence irrelevant, in order to save the core assumptions of a favored theory from empirical refutation. Problem shifts may take the form of crude “gatekeeping” of the orthodoxy – for example in economics, as decried by John Kenneth Galbraith – or of subtler forms, such as changes in academic fads or the availability of new instruments of scientific research.

The use of computer software applying mathematical analysis in social science represents such a problem shift driven by new tools. The problems researched, and the theories proposed to explain them, tend to be limited to those that lend themselves to processing by computerized tools. This puts social science on a trajectory to become what theology was in the Middle Ages: an impressive, logically coherent intellectual edifice whose empirical relevance and predictive power are on a par with those of a chimp randomly pushing computer buttons.