However, upon segregating the data into different age-groups, the results revealed that smokers in all-but-one categories had higher mortality rates. This is obviously counter-intuitive and extremely surprising. Advertisers often use this fallacy to their advantage by cherry-picking data clusters to suit their arguments, or by establishing patterns to fit existing perceptions.Ī 1996 study on the effects of smoking on women revealed higher mortality rates for non-smokers. As the field of Data Science receives greater scrutiny over time (which, I think, is essential for both technological and process maturity), practitioners are bound to exercise more caution to prevent the adverse influence of this fallacy in their Data Science projects. There has been a lot of research and innovation, particularly in this decade, to address the effects of this fallacy. The rationality for such action is generally attributed to intellectual-fraud, behavioural biases and, at times, honest errors.īoth new and experienced Data Scientists are susceptible to the two aspects of the Texas Sharpshooter Fallacy, and must take adequate measures to guard against the same. The second version of the fallacy (also known as P-hacking) pertains to the act of conducting multiple tests to prove or disprove certain hypotheses, but reporting the results of only those tests with favourable or low p-values, while largely ignoring the results of the others. The rationality for such action is generally attributed to the absence of adequate data (for analysis), behavioural biases, over-reliance on past results & experiences, and intellectual-laziness. The first version of the fallacy (also related to the Clustering Illusion) is about establishing specific patterns after weak or even negligible data analysis, and then 'processing or transforming' the available data and 'structuring' new theories to force-fit them into those patterns. This fallacy is one of the most widely prevalent Data Science practitioner mistakes. If your first impression is that this scenario has nothing to do with the Data Science practice, think again. Post that, he erases all the other bull's-eyes that he had painted, and proudly displays the one with the bullet as proof of his sharpshooting skills. It is obvious that he would hit one of these various targets. In another version of this fallacy, the shooter first paints multiple bull's-eyes at the side of a barn, and then fires a bullet at them. This scenario is known as the Texas Sharpshooting Fallacy. He then showcases his 'marksmanship' to the world, and gets widely appreciated for his skills. Imagine a below-average shooter (the original story has a Texan) randomly firing at the side of a barn, and then painting bull's-eyes around the tightest clusters of holes made by his gunshots. For instance, the Nostradamus lines that supposedly predicted 9/11 were taken from three separate and unrelated passages and a fictional line was added."If you torture the data long enough, it will confess." Ronald Coase, Nobel Prize Winner in Economics, 1991. Nostradamus' quatrains are often liberally translated from the original (archaic) French, stripped of their historical context, and then applied to support the conclusion that Nostradamus predicted a given modern-day event, after the event actually occurred.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |