Analytic modesty in the face of poor performance

I put up links to some of my posts on various LinkedIn groups in hopes that people will stumble upon them and join the discussion. Here’s a recent comment (from the Social/Behavioral Science & Security group), in response to my post on my problematic relationship with theory:

“Human intuition just isn’t all that good when it comes to understanding problems that have lots of moving parts”, well, neither is science. It took several hundred years for a mathematician (Poincare) to solve the two-body-problem, and all of our joint computing powers haven’t yet solved the three-body-problem. And that’s a relatively straight forward sort of problem compared to solving the forces involved in human behaviour.

Frankly, I’m at a loss what exactly it is you’re proposing. You’re not keen on intuition, you no longer have faith in existing theories, nor in categorising into ‘tribal’, ‘cultural’ or whatever, you seem to have discarded the notions arising from ‘environmental constraints’ – if I read your blog post correctly – So, now what? And to what purpose?

In German a theory can be called a “Erklaerungsprinzip”, inherent to this concept is that theorising is necessary to name phenomena, then discuss these phenomena and make use of the paradigm created. When it’s no longer useful, or superseded by another, it may be discarded. It’s a dynamic, dialectic approach that progresses matters step by step, without any final outcome being a given. This also means that we will never truly know if our current level of knowledge gives the absolutely highest attainable level of clarity and insight. We will have to act to the best of our knowledge at a given time.

I want to first very briefly address the notion that science isn’t that good at making sense of hard problems. Then I want to take up the commenter’s question about what I was proposing. (To be honest, I wasn’t really proposing anything – my blog is a way for me to think out loud, not to call people to action. But since the commenter asked anyway, it seems like formulating my ideas into some sort of proposal may be a good thing for me to do.)

Science and Success

It seems ill-advised to cite the n-body problem in support of the notion that science isn’t good at making sense of complex problems. I’m not a physicist, so I’m happy to be corrected here, but as I understand it the n-body problem has actually been solved. Repeatedly. The problem is that the solutions are so computationally intensive that they have no practical use. That reminds me of the introduction to a neat little essay by Isaac Asimov. He wrote:

I received a letter the other day… The writer told me he…felt he needed to teach me science… It seemed that in one of my innumerable essays, I had expressed a certain gladness at living in a century in which we finally got the basis of the universe straight. […] The [writer]…went on to lecture me severely on the fact that in every century people have thought they understood the universe at last, and in every century they were proved to be wrong…

My answer to him was, “John, when people thought the earth was flat, they were wrong. When people thought the earth was spherical, they were wrong. But if you think that thinking the earth is spherical is just as wrong as thinking the earth is flat, then your view is wronger than both of them put together.”

Science has trouble figuring out complex problems, and our intuition has trouble figuring out complex problems, but it’s more than a stretch to claim that the two means of understanding the world have anything close to equal trouble. I argued in my post that a lot of approaches to understanding human behavior are based more on intuition than on systematic description. Even if we compare social science to medical science – which doesn’t exactly boast a shining string of successes – it’s hard for me to come to any conclusion other than that, speaking generally and not necessarily calling out any specific lines of inquiry, social science just isn’t as good at making sense of its stuff as other sciences are at making sense of their stuff.

And I don’t buy the argument that human behavior is so much more complex than the things other sciences study. Rigorous statistical descriptions of behavior have shown a remarkable amount of patterning and consistency. The argument that people are just more complicated than other stuff seems to be an argument always made by assumption, not by evidence. If we (social scientists) want to complain that our job is so much harder than the physicists’ and chemists’ and biologists’ jobs, then we ought to make the explicit case for that complaint rather than just state it as a given.

Some Modest Proposals for an Immature Science

If I knew what to do about all of social science’s difficulties, I’d probably be doing that instead of blogging. I think it’s useful to point out what’s wrong even when I don’t know how to make it right, since problems usually don’t get addressed if no one talks about them. But I recognize that the following proposals will probably seem quite simple, if not downright simplistic. I guess we’ve got to start somewhere.

  1. Stop focusing so much on people. People are interesting. I like people. But understanding people as people seems to be more of a down-the-road aspiration than a realistic research goal. Back when I was an anthropology undergraduate, our textbooks used to talk a lot about “material culture” – all of the physical by-products of people’s behavior. When people act, they often (not always) leave behind some trace of that behavior. Archaeologists have focused on this kind of stuff for a long time, maybe because they had no other choice. Maybe it’s time we focused on recording physical evidence of behavior, if only because there seems to be less debate about whether those physical things actually exist.
  2. Clean the data, for crying out loud! I’m working on a market segmentation project right now that pulls in bi-annually compiled customer data from the past few years. All of these data sets were kept in different places. Different data sets marked the same variables with different headings. In some cases, the variables changed from binary indicators in one data set to count data in the next. I’m not asking for a lot: consistent naming and coding of variables, and a clear record of when and how things changed (because the realities of any long-term project demand that things change from time to time). I’ve spent the last week doing practically nothing between the hours of 8:00 and 5:00 except writing code to merge data sets, checking the merged data sets, and then revising the code to handle yet another irregularity in the data that I hadn’t known about.
  3. Don’t let research problems drive data management. Still on the example of my segmentation project: on the one hand, I know that these data sets were not originally compiled or collected to do a segmentation study – they were brought in to mine very specific information, and then set aside once that information was gleaned. On the other hand, it is painfully obvious to me just how easy it would have been to standardize and link up each of these data sets as they were obtained. It’s time we – “we” meaning “people who study humans” – recognized that most of our data is too messy, incomplete, or collected under too-problematic sampling conditions to be of much use by itself. Our data needs other data, so while we can and should use whatever information we get our hands on to do our own limited individual studies, we can and should also put our data into a format that will make it as easy and intuitive as possible for someone in the future to pick up and use. This is true even if we don’t plan on making our data public. It can still be used by other people in our own organizations. If we only use our data to answer whatever problem is in front of us at the moment, we hinder – or even prevent – those future opportunities.
  4. Stop feeling ashamed of descriptive research. Yes, I know we would all love to be able to explain some sort of behavior. Good explanation requires good theory, and good theory is in very short supply right now. We can settle for poor-to-mediocre explanation, or we can do high-quality description. We don’t need to be able to say why people did what they did. Even in policy-making or business cases where our analyses feed decision making, we can get a whole lot of mileage out of a time-stamped, geo-located record of what people did. And we can use that descriptive data to do a whole lot more than plot the points to a map. If we have good descriptive data and we haven’t done anything more with it than make a timeline, a heat map, or an ethnography/journalism-style narrative write-up, then we should go back and do a more rigorous descriptive analysis. If we don’t have data that allows that kind of rigorous analysis, then we ought to go out and collect that data. Either way, explanation can wait.
  5. Share, replicate, share, replicate, share, replicate. Except in cases where legal or ethical considerations prevent the sharing of data sets, we ought to be making all of our data public. This isn’t just a “data should be free” kind of idealistic statement. I agree with Sanjay Srivastava that “Our standard response to a paper in Science, Nature, or Psychological Science [or any other publication venue] should be ‘wow, that’ll be really interesting if it replicates.’” And I agree with Andrew Gelman that “Replication is costly… It’s easy to say that something is ‘vital’ and that other people should do it. It’s not so easy to devote the time and effort to doing it oneself. Which suggests that the benefits of said activity do not necessarily exceed its costs.” We can reduce the costs of replication by making our data easy to access and easy to use, and thereby easy to critique.
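
To make proposal 2 a little more concrete, here is a minimal sketch of what "consistent naming and coding of variables, and a clear record of when and how things changed" could look like in practice. Everything in it is made up for illustration: the headings (`cust_id`, `CustomerID`, `purchased`, `n_purchases`), the wave labels, and the `harmonize` helper are assumptions, not the actual segmentation data or code. It uses only the Python standard library.

```python
# Hypothetical mapping from each data set's local headings to one shared name.
HEADING_MAP = {
    "cust_id": "customer_id",
    "CustomerID": "customer_id",
    "purchased": "purchases",
    "n_purchases": "purchases",
}

# Hypothetical record of how each wave coded the "purchases" variable.
# This is the "clear record of when and how things changed".
CODING = {"2011H1": "binary", "2011H2": "count"}


def harmonize(rows, wave):
    """Rename headings to the shared names and tag each row with its wave."""
    cleaned = []
    for row in rows:
        new_row = {"wave": wave}
        for heading, value in row.items():
            # Fail loudly on a heading nobody has mapped yet, rather than
            # silently carrying an irregularity into the merged data.
            if heading not in HEADING_MAP:
                raise KeyError(f"unmapped heading {heading!r} in wave {wave}")
            new_row[HEADING_MAP[heading]] = value
        # A count can be collapsed to a binary indicator, but not the reverse,
        # so the indicator is the only version comparable across waves.
        new_row["any_purchase"] = int(new_row["purchases"] > 0)
        new_row["purchases_coding"] = CODING[wave]
        cleaned.append(new_row)
    return cleaned


# Two made-up waves: same variables, different headings and coding.
wave_a = [{"cust_id": 1, "purchased": 1}, {"cust_id": 2, "purchased": 0}]
wave_b = [{"CustomerID": 1, "n_purchases": 4}, {"CustomerID": 3, "n_purchases": 0}]

merged = harmonize(wave_a, "2011H1") + harmonize(wave_b, "2011H2")
```

The point of the sketch is the two lookup tables rather than the loop: if the mapping and the coding record had been kept up to date as each data set arrived, the merge would be a few lines instead of a week of detective work.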

I’m sympathetic to the idea that formulating and articulating ideas about how the world might work can gradually improve our understanding of how the world actually works. I think those kinds of debates can be useful in and of themselves, but I think they are just as likely to be detrimental. It’s a very fine line between theory building and plain-old storytelling. I’d feel a lot better about the whole situation if I thought we had at least as many solid, rigorously collected data sets as we have theories of human behavior, but I don’t think we do.

The commenter at the beginning of this post said, “We will have to act to the best of our knowledge at a given time.” I agree, with the caveat that, whenever we feel we need to act, we need to be able to first state, explicitly, what the best of our knowledge is at that point in time. Ignorance is rarely an excuse for action. If we can’t point to data – and preferably to a model of the data as well – to support one course of action over another, then all we have is our ignorance. In that case, I wonder if we’re not better off maintaining whatever course we’re currently on instead of trying to change a world we so poorly understand.