The original analysis on the Data Colada blog identified several inconsistencies in the data that indicated it was fabricated. For example, the data follow a uniform distribution from 0 to 50,000 and then stop abruptly. This new analysis raises additional questions about the data and asserts that it shows how the fraudster was able to manipulate the data to support the hypotheses tested in the paper.
I recommend reading this analysis and deciding whether the author makes a compelling argument to support his claims. His ultimate claim is that the fraud had to be perpetrated by Ariely himself. He says: "Is it possible that someone at the car insurance company faked the data, and Dan Ariely simply received this fake data? I would say that it is not."
Bold claims require bold evidence. See what you think.
Great posts on this story, Mark! To me, DC's claims are bolder than their evidence warrants at this point. The evidence is strong, though there are "Ariely didn't do it" explanations for some of the red flags. For example, it's possible that insurance company personnel truncated the data at 50,000, intentionally or unintentionally (perhaps values over 50,000 require a different query or are extracted from a different source). It's admittedly harder for me to develop similar explanations for, e.g., the apparent lack of rounding in one font but not in another.
Can we state with confidence "Ariely did it"? Not yet.
Thanks Justin. I agree. I think he has some interesting and compelling evidence but is probably overstating the conclusions that we can draw from it at this point. I will say that my priors that Ariely would never do something like this have been dramatically revised so that now I'm leaning strongly toward the conclusion that he's guilty. Until there is some compelling evidence of his innocence, I certainly wouldn't trust him to manage my retirement accounts or babysit my grandchildren!
It's not too surprising that a research celebrity would bungle a data fabrication task. They're in a hurry. Early career researchers usually handle all of their data work. They get where they are by finding cute effects and being a good storyteller--not by being an expert at basic algebra and spreadsheets. As far as the data fabrication goes, I thought it was a foregone conclusion that column A (sign at top/sign at bottom) was completely fabricated. It's important to note that column A was reversed (effect in opposite direction) when Mazar analyzed it. When she reached out to Ariely about it he told her that he had mislabeled it and that she should flip all the labels. How then is that column not completely fabricated given what we know about the fabrication of the second set of observations?