Life, Art, and Economics

Dave Schuler May 24, 2014

Defects in data occur frequently. You may have too few samples; some may be anomalous. Dealing with these defects may be as much an art form as a science, and the methods for doing so include interpolation, extrapolation, and smoothing. Interpolation is when you construct new data points within the range of known data points. When you draw a line between two data points, you’re interpolating.

Extrapolation is the process of estimating, based on the data points you actually have, the values that might be observed beyond the observed range. When you draw a curve longer than the data you actually have, you’re extrapolating.

Smoothing is when you make the data points you have look better. Sort of like when you blur the focus on a photo of an old lady. It’s doing what you think nature would have done if she had a better sense of aesthetics. When a graphic artist uses Photoshop to make a model appear taller or thinner or give her a narrower waist, he or she is actually creating a picture of a model who does not exist in real life.
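To make the three ideas concrete, here is a minimal sketch in Python. The years and percentages are invented purely for illustration (they are not Piketty's figures or anyone else's), and the function names are my own. Straight-line interpolation and a three-point moving average are only one simple choice among many possible methods.

```python
# Invented sample data: four observations with one that looks anomalous (1990).
years = [1970, 1980, 1990, 2000]
shares = [20.0, 22.0, 30.0, 24.0]


def interpolate(x0, y0, x1, y1, x):
    """Estimate y at x on the straight line between two known points."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def extrapolate(points, x):
    """Extend the line through the last two known points out to x."""
    (x0, y0), (x1, y1) = points[-2], points[-1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def smooth(values, window=3):
    """Centered moving average; the window shrinks at the two ends."""
    half = window // 2
    result = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        result.append(sum(chunk) / len(chunk))
    return result


points = list(zip(years, shares))

# Interpolation: a value for 1975, inside the observed range.
mid_1975 = interpolate(1970, 20.0, 1980, 22.0, 1975)

# Extrapolation: a value for 2010, beyond the observed range.
guess_2010 = extrapolate(points, 2010)

# Smoothing: the 1990 spike gets pulled toward its neighbors.
smoothed = smooth(shares)

print(mid_1975, guess_2010, smoothed)
```

Notice that every number these functions return is one that was never actually observed. That is the sense in which all three techniques involve judgment: change the window size or the points you draw the line through and you get different answers from the same data.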

There’s something of a brouhaha going on now in the econblogosphere. Thomas Piketty, author of Capital in the Twenty-First Century, the book that’s causing a stir in econ circles these days, is being accused of photoshopping his data:

Thomas Piketty’s book, ‘Capital in the Twenty-First Century’, has been the publishing sensation of the year. Its thesis of rising inequality tapped into the zeitgeist and electrified the post-financial crisis public policy debate.

But, according to a Financial Times investigation, the rock-star French economist appears to have got his sums wrong.

The data underpinning Professor Piketty’s 577-page tome, which has dominated best-seller lists in recent weeks, contain a series of errors that skew his findings. The FT found mistakes and unexplained entries in his spreadsheets, similar to those which last year undermined the work on public debt and growth of Carmen Reinhart and Kenneth Rogoff.

The details are in the linked article, but the things found include anomalies, just plain errors, and what appears to be smoothing. How damaging are the discoveries?

For example, once the FT cleaned up and simplified the data, the European numbers do not show any tendency towards rising wealth inequality after 1970. An independent specialist in measuring inequality shared the FT’s concerns.

Dr. Piketty, of course, rejects the notion that his data manipulations were deliberate or intended to deceive.

The data manipulations don’t prove that his conclusions are wrong, but neither his retort nor his supporters’ defense proves that they’re right. I suspect this is something that will be debated for years.

3 comments

steve

If they wanted to commit journalism, LOL, they could interview him and ask how he decided to do what he did. I thought he was fairly explicit in his explanations that he was working with older, sometimes incomplete data and had to weight and average data sometimes. The good news is that he put it all online.

Steve
Ben Wolf

It’s been known since the book was published that Piketty played fast and loose with the underlying theory (more accurately, hypothesis) to get the conclusion he got. I wish I could say I don’t believe he would do the same on the statistical side of things, but I can’t.
Guarneri

If you go over to zerohedge you can read about a re-look at the data which shows nothing short of a total whiff by Piketty on the data – the supposed strength of his work (and I mean a complete whiff) – and it seems to be the better analysis. Interestingly, but not surprisingly, it references how Piketty defenders are now turning tail and running from the data and marveling at the “theoretical possibility” of Piketty’s work.

This reminds us all too clearly of the AGW debate. No, the theory doesn’t actually accurately predict anything in the real world, but you know, theoretically, CO2 can affect heat transfer dynamics and boy oh boy, if it was right…………