Brief Guide to Statistical Manipulation



Quick note:
Proving my point on rational thinking, the author shows his own improper statistics, confused rational thinking, and political bias in this article.  However, I find his skepticism appropriate and instructive.
I'll find a better article eventually, so read this with a cautious eye, and as food for thought.   Some of his basic concepts are valid, particularly that statistics DON'T lie, but people lie with improper statistics.

A Brief Guide to Statistical Manipulation

by Sam Sachdev   May, 2004

Try to think of a statistic you’ve recently seen, heard, or read. Maybe it was from a commercial: three out of four dentists recommend Dentine. Maybe it was from the news: the President’s approval rating has fallen five percent. Or maybe it was from a health article in a newspaper: frequent use of antibiotics greatly increases your chance of cancer. The power of statistics lies, perhaps, in how quickly and efficiently they present a conclusion persuasively. That same effectiveness, however, can mislead, intentionally or otherwise. Statistics can be used, for instance, to distort facts, or to present only the limited amount of data that supports a conclusion. If, however, you have a basic understanding of how statistics are calculated and how they’re misused, you can alert yourself to the most common and basic kinds of statistical manipulation.

The first, and probably most important, example of how statistics are used improperly is the difference between what the media usually considers newsworthy and what statisticians, psychologists, and others who use statistics professionally consider reliable. The President’s approval rating, for instance, is a widely reported statistic. It is, however, most likely to be reported when it suddenly rises or falls from its expected pattern. Tom Smith, a statistician at the University of Chicago, explains why such reports are particularly unreliable. “The media considers [a statistic] newsworthy because it’s different from what the most recent figures were. But numbers that are most unusual are likely to be the most error prone. It’s a systematic problem. The media are attracted to results that are the least reliable,” said Smith.

A poll, study, or any other statistic, then, is only considered reliable when its results have been replicated. There are many possible explanations for a single surprising result. A poll, for instance, could be influenced by a national holiday, which could cause the President’s approval to rise suddenly and without apparent reason. Or there could be a subtle bias in the questions. And one of the most important explanations is chance. Regardless of how well a poll or study is designed, it’s possible that chance alone produced the results. The causal link between smoking and cancer, for instance, was only confirmed after many studies repeated the results. If you hear or read about a poll, study, or any other statistic whose results haven’t been repeated, it’s unlikely that researchers are going to take the results seriously.
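To get a feel for how much chance alone can move a poll, here is a minimal Python sketch. It is an illustration, not anyone's actual polling method: it assumes simple random sampling, and the 50% approval figure, the 1,000 respondents, and the function names are all made up for the example. Public opinion never changes in this simulation, yet the polls still disagree with one another.

```python
import math
import random

def poll_standard_error(true_share, sample_size):
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(true_share * (1.0 - true_share) / sample_size)

def simulate_poll(true_share, sample_size, rng):
    """Ask sample_size random respondents; return the share who approve."""
    approvals = sum(1 for _ in range(sample_size) if rng.random() < true_share)
    return approvals / sample_size

rng = random.Random(2004)  # fixed seed so the run is repeatable
TRUE_APPROVAL = 0.50       # hypothetical: real opinion never changes
SAMPLE_SIZE = 1000         # hypothetical number of respondents per poll

# Run the same poll 200 times against an unchanging public.
results = [simulate_poll(TRUE_APPROVAL, SAMPLE_SIZE, rng) for _ in range(200)]

print("standard error:", round(poll_standard_error(TRUE_APPROVAL, SAMPLE_SIZE), 4))
print("lowest poll:   ", min(results))
print("highest poll:  ", max(results))
```

With these assumed numbers the standard error is about 1.6 percentage points, so the lowest and highest of the 200 polls sit several points apart even though nothing real happened. The most extreme readings, the ones most likely to make the news, are also the ones furthest from the truth.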

Another common mistake when the media reports statistics is confusing correlation with causation. Cooper Holmes, in “The Honest Truth About Lying with Statistics”, presents a common-sense example to help explain the difference. In the spring and summer there is a high correlation between newly planted trees and how fast the grass around them grows. The trees, of course, aren’t causing the grass to grow quickly. The cause, obviously, is water and sunshine, not the trees. Or consider this starker example. A friend of this journalist found a high correlation between the number of car accidents in Florida and the amount of rain in Japan. Neither one, however, causes the other.

A correlation, then, is only a statistical relationship between two events. It doesn’t mean that one event is causing the other. Joel Best, a sociology professor at the University of Delaware and author of “Damned Lies and Statistics”, points out that correlation is confused with causation whenever the weekly medical journals come out. As a hypothetical example, he describes a study that finds a link between eating Brussels sprouts and a reduced chance of cancer. “[A] news story is always written as the Brussels sprouts are going to help you reduce the chance of cancer,” said Best. For this to be causation, that is, for Brussels sprouts actually to help prevent cancer, researchers have to explain why this is so. So, the next time you hear or read about research that states that there’s a correlation between two events, remember that this doesn’t mean that there’s a cause and effect relationship.
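The trees-and-grass example can be sketched in a few lines of Python. Everything here is invented for illustration: the "weather" variable stands in for water and sunshine, the noise levels are arbitrary, and the data is simulated rather than measured. Trees and grass never influence each other in this model, yet they come out strongly correlated because both respond to the weather.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(7)  # fixed seed so the run is repeatable
N = 5000                # hypothetical number of observations

# Hidden common cause: water and sunshine.
weather = [rng.gauss(0, 1) for _ in range(N)]

# Each growth rate depends on the weather plus its own independent noise.
tree_growth = [w + rng.gauss(0, 0.5) for w in weather]
grass_growth = [w + rng.gauss(0, 0.5) for w in weather]

raw = pearson(tree_growth, grass_growth)

# Remove the weather's contribution (known here only because we built the data).
tree_left = [t - w for t, w in zip(tree_growth, weather)]
grass_left = [g - w for g, w in zip(grass_growth, weather)]
adjusted = pearson(tree_left, grass_left)

print("correlation, trees vs. grass:      ", round(raw, 2))
print("after accounting for the weather:  ", round(adjusted, 2))
```

The first correlation is high and the second is close to zero. In real research the common cause is rarely known this precisely, which is part of why a reported correlation deserves caution.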

Statistical significance, the next example, is also a pervasive problem when the media presents statistics. Probably because it’s difficult to explain to those who aren’t familiar with statistics, statistical significance is almost entirely absent from the presentation of statistics. On the evening news, for instance, you hear that President Bush’s approval rating has fallen five percentage points. The presenter almost certainly won’t tell you whether the result is statistically significant. That is, nobody explains that a five-point difference may be nothing more than chance, in which case President Bush’s drop in the polls is probably meaningless.

Statistical significance means that there’s a relatively small chance that the results are due to chance alone. Or, to state it more clearly, it means that if, say, the poll were repeated, there’s a large probability that it would produce similar results. Tom Smith, referring to a presidential approval rating poll, points out that without this information, the poll may be meaningless. “The presenter of the poll didn’t present that [it was] marginally...significant. The language should’ve been more circumspect...[This poll] shows a gain of five percentage points but it’s not statistically [significant].” Or, to state it another way, imagine if the anchor of your nightly news told you that President Bush’s approval rating dropped, but that the drop is most likely due to chance alone. Such a poll, therefore, should be ignored.

Unfortunately, because there isn’t any mathematical calculation to determine what’s statistically significant and what’s not, its definition is hard to quantify. [John's Note] Nonetheless, it is associated with correlations. If there’s a strong correlation, there’s likely to be statistical significance. This, however, doesn’t guarantee it. It still has to be shown that the results were unlikely to be due to chance, that a cause and effect relationship was proven, or that the results were repeated. What’s important to keep in mind is that chance is an important concern when evaluating results. The next time you hear a statistic, say, the President’s approval rating, and the presenter doesn’t tell you if it’s statistically significant, keep in mind that the results of the poll could be due to chance alone.

The next problem in polls, agreeing upon definitions, is also a common problem outside of statistics: in marriage, politics, workplaces, or probably anything that requires communication. Cooper Holmes points out that one place where he’s noticed this problem is in advertisements for drug and alcohol treatment programs. Most treatment programs, Holmes notes, advertise their success rates, such as 90% of participants being drug-free after completing the program. The definition of “drug-free” could be simply completing the program, or not using drugs or alcohol for one month. Most likely, the definition is one that favors the treatment program. It’s not one, however, that asks whether the time spent sober is long enough to call the program effective.
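A small Python sketch shows how much the definition alone can move a success rate. The ten participants and their days of sobriety below are entirely hypothetical, invented only to make the arithmetic visible, and the function name is illustrative.

```python
# Hypothetical days each of ten participants stayed sober after the program.
days_sober = [5, 20, 35, 45, 60, 90, 120, 200, 365, 400]

def success_rate(days, required_days):
    """Share of participants sober for at least required_days."""
    successes = sum(1 for d in days if d >= required_days)
    return successes / len(days)

rate_one_month = success_rate(days_sober, 30)   # "drug-free" = sober for a month
rate_one_year = success_rate(days_sober, 365)   # "drug-free" = sober for a year

print("success rate, one-month definition:", rate_one_month)  # 0.8
print("success rate, one-year definition: ", rate_one_year)   # 0.2
```

The same ten people yield an 80% success rate under one definition and a 20% rate under the other. Neither number is false; they simply answer different questions.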

In statistical studies, definitions can lead to misunderstanding of the reported results. In a survey that reported on violence in schools, Tom Smith points out, there was a large difference between how most people understand the term “violence” and how the survey defined it. “[The survey] said that a majority of students had experienced violence. This was the number of students who had said yes [to] one of a number of events. Among them were things like bumping into the hallway, shoving them or verbal abuse,” said Smith. Most people, Smith notes, wouldn’t consider these events violence. Rather, they would think of violence as, say, a student being punched, kicked, or otherwise attacked. Without an understanding of how “violence” was defined, the study was likely to give the impression that violence in schools was a serious and largely unreported problem. So, the next time you hear that poverty has gone up, or unemployment down, you should understand that your definition might be quite different from the one actually used.

In this last example, researchers and the media selectively choose a conclusion from the data. A well-known illustration is described in “The Honest Truth About Lying with Statistics”. A psychologist wanted to see how accurately patients who are admitted into a mental hospital are evaluated. He had students, who were not mentally ill, act as if they were, convincingly enough, anyway, to get admitted. Once admitted, however, they were told to behave as they normally do, that is, in a way that would be considered “normal”. The results were shocking and are described in the famous 1973 study “On Being Sane in Insane Places”. Many of the students couldn’t convince psychologists that they were “normal”.

The psychologists, in this example, selectively chose from the data. According to Cooper Holmes, the failure to correctly diagnose the students who were acting as patients is somewhat controversial. Nonetheless, the practice of selectively choosing from data is so common that he only trusts conclusions when he can examine the data himself.

For a more common example, Tom Smith points to a poll commissioned twenty years ago by a lobbying group opposed to gun control. The group wanted to present evidence that the public wasn’t in favor of gun control legislation. “There was maybe twenty questions on gun control. They basically found one question that had gun control marginals and reported that,” said Smith. Unless the data is examined, there probably isn’t any way to determine whether what the media is presenting to you is what the data actually says. Nonetheless, being aware of this seemingly common deception should remind you that a conclusion can be true and still not represent the data.
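The effect of reporting one question out of twenty can be sketched with a short simulation. This is not a reconstruction of the poll Smith describes: the 55% level of support, the 300 respondents, and the names below are assumptions made up for the example. To keep things simple, every question here measures the same underlying opinion, so the only thing separating the answers is chance. In a real survey, differences in wording would widen the gap further.

```python
import random

def simulate_question(true_support, respondents, rng):
    """Return the share of respondents who answer 'yes' to one question."""
    yes = sum(1 for _ in range(respondents) if rng.random() < true_support)
    return yes / respondents

rng = random.Random(20)  # fixed seed so the run is repeatable
TRUE_SUPPORT = 0.55      # hypothetical: 55% really favor the policy
RESPONDENTS = 300        # hypothetical respondents per question
QUESTIONS = 20           # twenty questions, as in Smith's account

support_by_question = [
    simulate_question(TRUE_SUPPORT, RESPONDENTS, rng) for _ in range(QUESTIONS)
]

average_support = sum(support_by_question) / QUESTIONS
cherry_picked = min(support_by_question)  # the one result the group reports

print("average across all 20 questions:", round(average_support, 3))
print("lowest single question:         ", round(cherry_picked, 3))
```

The lowest question is a real result from a real (simulated) survey, so quoting it is not strictly a lie. It still paints a less favorable picture than the twenty questions taken together.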

The next time, then, you see an ad, or hear about an approval rating, or encounter any statistic in the media, you should understand that there’s probably much more to the statistic than either the number itself can tell you or its presenter wants, or knows enough, to reveal.

John's note:
The author's statement "there isn’t any mathematical calculation to determine what’s statistically significant"  is wildly incorrect.  Of course the math exists and is used as a matter of routine.  However, the point he is trying to make is valid because the multivariate data to support statistical significance is very often missing.  In the absence of data, people mistakenly tend to give significance to correlation because: (1) they don't understand correlation vs. cause and effect; (2) they don't want to wait for real answers; and (3) they believe what they want, in spite of the data.
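As one example of the routine math, here is a minimal Python sketch of a standard two-proportion z-test applied to a five-point drop in approval. The sample sizes are assumptions chosen for illustration (the article never gives them), the function name is made up, and the test assumes two independent simple random samples.

```python
import math

def two_proportion_z_test(share_a, n_a, share_b, n_b):
    """Two-sided z-test for a difference between two sample proportions."""
    # Pooled proportion under the hypothesis that nothing really changed.
    pooled = (share_a * n_a + share_b * n_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (share_a - share_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Approval falls from 50% to 45%; sample sizes are hypothetical.
z_small, p_small = two_proportion_z_test(0.50, 500, 0.45, 500)
z_large, p_large = two_proportion_z_test(0.50, 2000, 0.45, 2000)

print("500 respondents per poll:  z = %.2f, p = %.3f" % (z_small, p_small))
print("2000 respondents per poll: z = %.2f, p = %.4f" % (z_large, p_large))
```

With 500 respondents per poll the p-value is above the conventional 0.05 threshold, so the five-point drop could plausibly be chance. With 2,000 respondents per poll the same drop is clearly significant. A headline that reports only "down five points" gives the reader no way to tell which situation applies.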