The Model Must Be Right First

I ultimately found this article in Scientific American by David Freedman (hat tip: Glenn Reynolds) about the shortcomings of models a bit unsatisfying. The article is about the experience of, presumably, geology grad student Jonathan Carter:

Carter wanted to observe what happens to models when they’re slightly flawed–that is, when they don’t get the physics just right. But doing so required having a perfect model to establish a baseline. So Carter set up a model that described the conditions of a hypothetical oil field, and simply declared the model to perfectly represent what would happen in that field–since the field was hypothetical, he could take the physics to be whatever the model said it was. Then he had his perfect model generate three years of data of what would happen. This data then represented perfect data. So far so good.

The next step was “calibrating” the model. Almost all models have parameters that have to be adjusted to make a model applicable to the specific conditions to which it’s being applied–the spring constant in Hooke’s law, for example, or the resistance in an electrical circuit. Calibrating a complex model for which parameters can’t be directly measured usually involves taking historical data, and, enlisting various computational techniques, adjusting the parameters so that the model would have “predicted” that historical data. At that point the model is considered calibrated, and should predict in theory what will happen going forward.

Carter had initially used arbitrary parameters in his perfect model to generate perfect data, but now, in order to assess his model in a realistic way, he threw those parameters out and used standard calibration techniques to match his perfect model to his perfect data. It was supposed to be a formality–he assumed, reasonably, that the process would simply produce the same parameters that had been used to produce the data in the first place. But it didn’t. It turned out that there were many different sets of parameters that seemed to fit the historical data. And that made sense, he realized–given a mathematical expression with many terms and parameters in it, and thus many different ways to add up to the same single result, you’d expect there to be different ways to tweak the parameters so that they can produce similar sets of data over some limited time period.
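
Carter’s result is easy to reproduce in miniature. Below is a rough sketch in Python (a made-up decline-curve model with invented numbers, not Carter’s actual reservoir simulator) of two noticeably different parameter sets that track each other over a three-year “history” yet drift apart over a longer forecast:

    # Toy "oil field" model: a decline curve plus a slow linear drift.
    # The model form and every number here are illustrative assumptions.
    import numpy as np

    def production(t, a, b, c):
        return a * np.exp(-b * t) + c * t

    t_history = np.linspace(0, 3, 37)    # three years of "perfect data"
    t_future = np.linspace(3, 10, 85)    # the forecast period

    truth = (10.0, 0.50, 1.00)   # parameters that generated the data
    rival = (10.0, 0.45, 0.87)   # a different set a calibrator could find

    hist_gap = np.max(np.abs(production(t_history, *truth)
                             - production(t_history, *rival)))
    fut_gap = np.max(np.abs(production(t_future, *truth)
                            - production(t_future, *rival)))
    print(f"worst mismatch over the 3-year history: {hist_gap:.2f}")
    print(f"worst mismatch over the forecast:       {fut_gap:.2f}")

The rival set stays within about three percent of the history, inside plausible measurement noise, while its ten-year forecast drifts off by more than ten percent; with more parameters the degeneracy only gets worse.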

The reason I found the article unsatisfying is that, far from explaining why economic models in particular fall short, the article actually asserted that, regardless of what you’re modeling, incorrect models can’t be corrected by recalibration. However, since solid models do, in fact, exist and are employed successfully every day, something else must be going on in economic models.

I think there are several key takeaways from this. First and foremost, the model must be right, and you determine that a model is right by its predictive power over time without recalibration. Repeated recalibration doesn’t make a flawed model better.

However, accurate, reliable models exist and are used every day. How can this be? The answer is that the models were solid to begin with, and we know that by their predictive power. Producing sound models requires judgment, experience, and the elusive thing called insight.

The perihelion precession of the planet Mercury does not follow Newtonian principles: the Newtonian model does not predict Mercury’s orbit correctly. The discrepancy could not be accounted for by recalibration (they tried, even positing an unseen planet, Vulcan, to supply the missing gravity). Accounting for the deviations from the Newtonian model required the then-new insight of Einstein’s general relativity.
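
For context, the general relativistic correction is the standard textbook result for the extra perihelion advance per orbit,

    \Delta\phi = \frac{6\pi G M}{c^{2}\, a\, (1 - e^{2})}

where M is the solar mass and a and e are the orbit’s semi-major axis and eccentricity; for Mercury it comes to the famous 43 arc-seconds per century that no Newtonian recalibration could supply.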

Why aren’t economic models, the models attacked in the title of the article, better? The early political economists like Adam Smith and David Ricardo had something that mathematical notation and mountains of data can’t provide: insight, in Ricardo’s case earned by working at the London Stock Exchange practically from boyhood. But, importantly, they weren’t trying to produce quantitative results but qualitative explanations for behaviors.

Another problem is that the monetary and political stakes are very high and preferred outcomes may overwhelm analytical rigor.

In the final analysis, I think we’re getting ahead of ourselves in demanding quantitative results from economics, and there are good reasons to believe, since economics deals with purposeful actors rather than mechanical phenomena, that determinative, accurate, quantitative predictions may remain elusive forever. At the very least we’ll need considerably more insight than we have right now and, as I noted above, insight is elusive, too.

16 comments
  • PD Shaw

    The components of a model of subsurface conditions are based upon physical science; any uncertainty stems from incomplete knowledge of what is below the surface.

    The components of economic models are behavioral. I think the problem is the opposite: we can accumulate tons of data about people; we just have difficulty predicting and simplifying all of their individual interactions.

  • Zachriel

    David Freedman: It turned out that there were many different sets of parameters that seemed to fit the historical data.

    Of course. Depending on the equations, there are many sets of parameters that can fit the same data.
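
    A minimal instance of this degeneracy (a made-up two-parameter model, just for illustration):

        y = \alpha\,\beta\,x

    Every pair (\alpha, \beta) with the same product fits any data set exactly, so no amount of (x, y) data can separate the two parameters individually.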

    Dave Schuler: Repeated recalibration doesn’t make a flawed model better.

    As long as it still fits the entirety of the data, then recalibration can work. Of course, if you fit it to the current data while ignoring the historical data, then not. More typically, you add epicycles.

    It’s important, also, to make a distinction between simple correlation with trends and determining causation. So, while a geocentric model can be adjusted with epicycles, the reality of where the center of the overall system lies can only be determined by understanding the mechanism involved (gravity and momentum).

    Another problem with the analysis is that it is apparently assuming smooth functions. Many natural phenomena have discrete transitions. Ice isn’t just liquid water molecules slowed down. It’s something entirely different.

    David Freedman: Then he had his perfect model generate three years of data of what would happen. This data then represented perfect data. So far so good.

    Finally, the model ignores the problem of chaos. Even simple systems, where all the interactions are understood, such as in water turbulence, can’t be predicted in all detail ‘perfectly.’ Economics is a complex system with many parameters that can only be estimated. As such, prediction in detail is inherently impossible, though trends and other aspects of the system might be predictable. Rivers may be turbulent, but they still flow to the sea.

  • I wonder if anyone has done a study on the accuracy of predictive economic models – actually comparing what they predicted to what actually happened? Anyone know of such a thing?

  • The reason I found the article unsatisfying is that, far from explaining why economic models in particular fall short, the article actually asserted that, regardless of what you’re modeling, incorrect models can’t be corrected by recalibration. However, since solid models do, in fact, exist and are employed successfully every day, something else must be going on in economic models.

    I think you are confused. The article is not talking about an economic model, but a physics model that is a simulation. Even there, the problem persists: even when you have a perfect model and want to simulate it, you have the problem of many different parameter values that provide “realistic” results.

    Basically, you have n equations but m unknowns, with m > n. The number of possible solutions is very large.
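
    A minimal numerical sketch of that (nothing here beyond NumPy; the particular matrix is arbitrary):

        # 2 equations, 3 unknowns: A @ x = b has infinitely many solutions.
        import numpy as np

        A = np.array([[1.0, 2.0, 1.0],
                      [0.0, 1.0, 3.0]])
        b = np.array([4.0, 5.0])

        x_min, *_ = np.linalg.lstsq(A, b, rcond=None)  # minimum-norm solution

        _, _, Vt = np.linalg.svd(A)
        null_dir = Vt[-1]            # spans the 1-D null space of A

        # Sliding any distance along the null direction still "fits" exactly.
        for t in (0.0, 1.0, -2.5):
            x = x_min + t * null_dir
            print(np.allclose(A @ x, b))   # True every time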

    Now shift over to economics where you have no perfect model to start with and the problem becomes very much harder, even if we abstract from PD’s point (which many economists do–i.e. preferences are assumed fixed).

    And even if we assume preferences are fixed, if we think of people and their behavior in a game-theoretic setting they could change strategies. For example, in Axelrod’s simulations tit-for-tat won, but, IIRC, one reason for that was the duration of the simulation. That is, a longer game might produce different winning strategies, and people can switch. This is where economists have, in some cases, turned to evolution. They use a survival-of-the-fittest type of concept (abusing terminology here) for strategies. That is, if a strategy isn’t providing sufficient success, the individual can chuck it for another; if that is the case across the whole population, that strategy “goes extinct”.
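
    A toy version, sketched in Python (standard prisoner’s dilemma payoffs; the strategies and structure are illustrative, not Axelrod’s actual tournament code):

        # Iterated prisoner's dilemma, payoffs T=5, R=3, P=1, S=0.
        PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
                  ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

        def tit_for_tat(opponent_moves):
            # Cooperate first, then copy the opponent's last move.
            return "C" if not opponent_moves else opponent_moves[-1]

        def always_defect(opponent_moves):
            return "D"

        def play(strat_a, strat_b, rounds):
            moves_a, moves_b, score_a, score_b = [], [], 0, 0
            for _ in range(rounds):
                ma, mb = strat_a(moves_b), strat_b(moves_a)
                pa, pb = PAYOFF[(ma, mb)]
                score_a, score_b = score_a + pa, score_b + pb
                moves_a.append(ma)
                moves_b.append(mb)
            return score_a, score_b

        # Duration matters: defection's one-round edge stops mattering as
        # the game lengthens, which is why the horizon shapes which
        # strategies survive.
        for rounds in (1, 10, 200):
            print(rounds, play(tit_for_tat, always_defect, rounds))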

    Finally, the model ignores the problem of chaos. Even simple systems, where all the interactions are understood, such as in water turbulence, can’t be predicted in all detail ‘perfectly.’

    Meh, after talking to the guys at one of the BLS research divisions where they spent quite a bit of time looking at non-linear dynamics and “chaos”, the overall conclusion was: neat math, but not that helpful. They just couldn’t find it in the data. And by “they” I’m also referring to guys like Buz Brock. Not only that, but if you try to derive many of the standard economics results you get impossibility theorems. Since we have an economy that is functioning, maybe a bit dysfunctionally right now, I’m not going to get worried about “chaos”.

    I wonder if anyone has done a study on the accuracy of predictive economic models – actually comparing what they predicted to what actually happened? Anyone know of such a thing?

    Yes, it is standard model validation. You can even do it with in-sample data. The problem with economics, though, is what PD notes: if people’s behavior changes, what was initially a great model might no longer be applicable. You won’t know that till after the fact.
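
    In code, the standard out-of-sample check looks something like this (toy data and a deliberately simple trend model; every number is an illustrative assumption):

        # Calibrate on early observations, judge the model on a holdout.
        import numpy as np

        rng = np.random.default_rng(0)
        t = np.arange(120)
        series = 0.5 * t + 10 * np.sin(t / 6) + rng.normal(0, 2, t.size)

        # Fit a linear trend to the first 100 points, hold out the last 20.
        t_fit, y_fit = t[:100], series[:100]
        coeffs = np.polyfit(t_fit, y_fit, deg=1)

        forecast = np.polyval(coeffs, t[100:])          # predict the holdout
        mae = np.mean(np.abs(forecast - series[100:]))  # compare to actuals
        print(f"mean absolute error out of sample: {mae:.2f}")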

    One of the big problems with statistical/econometric models can be found here.

  • Regarding that last link, BTW, that critique is aimed at the very types of models that Moody’s.com uses. Mark Zandi uses large-scale Keynesian macroeconometric models. If you find the Lucas Critique compelling, and I certainly do, then you should view the results of those models with a great deal of skepticism…like the claim that passing Obama’s jobs bill will create 1.9 million jobs. “Probably not” is going to be my quick guess.

  • Zachriel

    Steve Verdon: Meh, after talking to the guys at one of the BLS research divisions where they spent quite a bit of time looking at non-linear dynamics and “chaos”, the overall conclusion was: neat math, but not that helpful.

    That chaotic systems are analytically intractable is the entire point.

    Steve Verdon: Since we have an economy that is functioning, maybe a bit dysfunctionally right now, I’m not going to get worried about “chaos”.

    “An economy that is functioning”? Do you mean theories of economics that work to some degree? Well, yes. The point about chaos concerns the simplification described in the original article: if the researcher starts by assuming a system that can be perfectly predicted from an approximately specified starting point, then the researcher is ignoring chaos.

  • michael reynolds

    I think if you just subtract humans you can make great models.

  • That chaotic systems are analytically intractable is the entire point.

    It isn’t that chaotic systems are intractable, but that researchers simply could not find much evidence of chaos in the economic data at all, although the effort did help give us new statistical tests and such. Other than that, time to move on.

  • PD Shaw

    I don’t know if there are any baseball fans here, or consumers of sabermetrics . . . but this post reminds me that as the season comes to an end, all of the statistics based upon aggregations of data and the likelihood of recurrence over many occurrences become meaningless. At the fine grain of a single baseball game, with each batter getting to the plate 3 to 4 times, the difference between the best and worst hitters is wiped clean. There is no predictive ability in the models for purposes of a single game; you just have to watch.
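
    That intuition is easy to quantify (hypothetical batting averages, plain Python):

        # Over 4 at-bats, how often does a .220 hitter out-hit a .300 hitter?
        from math import comb

        def pmf(p, n, k):
            # Binomial probability of exactly k hits in n at-bats.
            return comb(n, k) * p**k * (1 - p)**(n - k)

        n, good, bad = 4, 0.300, 0.220
        p_upset = sum(pmf(bad, n, kb) * pmf(good, n, kg)
                      for kb in range(n + 1)
                      for kg in range(kb))   # strictly fewer hits for .300
        print(f"P(.220 hitter gets more hits): {p_upset:.2f}")  # about 1 in 4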

  • PD Shaw

    Er . . . there’s something in there about models having intended uses and functions.

  • Zachriel

    Steve Verdon: It isn’t that chaotic systems are intractable, but that researchers simply could not find much evidence of chaos in the economic data at all, although the effort did help give us new statistical tests and such. Other than that, time to move on.

    Huh? Of course there are chaotic systems, defined as dynamical systems that are sensitive to initial conditions. They were discovered by Lorenz in his study of weather. Other chaotic systems include the Earth’s climate and turbulent mixing.

    Lorenz, E. N., “Deterministic Nonperiodic Flow,” Journal of the Atmospheric Sciences, 1963.
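
    A minimal demonstration of that sensitivity (Lorenz’s standard parameters; the step size and horizon are arbitrary choices):

        # Two Lorenz trajectories differing in the 9th decimal place.
        import numpy as np

        def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
            x, y, z = state
            dx = sigma * (y - x)
            dy = x * (rho - z) - y
            dz = x * y - beta * z
            return state + dt * np.array([dx, dy, dz])   # simple Euler step

        a = np.array([1.0, 1.0, 1.0])
        b = a + np.array([1e-9, 0.0, 0.0])

        for step in range(3001):
            if step % 1000 == 0:
                print(step, np.linalg.norm(a - b))
            a, b = lorenz_step(a), lorenz_step(b)

        # The separation grows by orders of magnitude: tiny input errors
        # become large output errors, no matter how good the model is.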

  • steve

    “I don’t know if there are any baseball fans here”

    Yes. Sadly, a Phillies fan.

    “If people’s behavior changes, what was initially a great model might no longer be applicable. You won’t know that till after the fact.”

    You can certainly take the Austrian approach and, sort of, eliminate mathematical models. It then comes down to describing and predicting behaviors, but that involves making big normative judgments about behavior. One group assumes that if you raise taxes people will work less; another group says they will work more.

    Steve

  • PD Shaw

    The Cards owe a good deal of thanks to the Phillies for playing the Braves tough down the stretch, when it wasn’t really in their best interest. There might be something insightful about the complexity of human motivation there, but I think I’m drifting.

  • steve

    Yup. The Braves’ starting pitching was shot. They could have tanked and, likely, ended up in the Series. However, that’s not how you should play. At any rate, the Cards earned it. They peaked at the right time, especially the bullpen, which I thought would let them down, and Pujols is the clutch hitter that Howard needs to be.

    Steve

  • Drew

    Imagine you have a basic oxygen furnace to make steel. Into it you pour molten iron with perfectly known chemistry. You also add just the right amount of scrap metal so that the exact amount of oxygen you blast into the pot will react with the carbon in the molten iron to melt the scrap and reach the perfect endpoint temperature. Lastly, you perfectly dump the slag materials to dissolve all the crap you don’t want from the refining reactions you are driving.

    You do those calculations in school. In the real world, forget it. Just pouring the molten iron changes the carbon. The scrap metal is wet, and maybe has a few car seats still crushed up in it. The alloy bin didn’t quite get cleaned out correctly, and the oxygen nozzle is squirting a little sideways.

    And this is a fairly controlled environment compared to an economy. Hence my constant invocation of practitioners, and not theorists.

  • Tully

    Mechanistic models fail for several reasons. In econ, one of the prime reasons is what Dave alluded to in the final para of the post. You’re not dealing with a mechanistic universe, but with a dynamic/organic universe.

    There are other modeling problems that are universal, though. The combination of SDIC (sensitive dependence on initial conditions) and imperfect knowledge is a prime factor. The more complicated the model, the greater the chance the “predictive” results are going to be partial or total garbage, and the odds of bad results increase with multiple iterations. Complicated models incorporating multiple sub-models are particularly vulnerable to rapid chaotic decomposition.
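
    A bare-bones illustration of that compounding (the logistic map in its chaotic regime; the parameter error of one part in a million is an arbitrary choice):

        # Iterating a model amplifies even a tiny calibration error.
        def iterate(r, x, steps):
            for _ in range(steps):
                x = r * x * (1 - x)   # logistic map x -> r*x*(1-x)
            return x

        x0 = 0.3
        for steps in (5, 20, 50):
            true_model = iterate(4.0, x0, steps)
            miscalibrated = iterate(3.999999, x0, steps)  # r off by 1e-6
            print(steps, abs(true_model - miscalibrated))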
