I’m taking statistics for my degree right now. I’m not that bad at it actually. I can handle the hand calculations great. I have a bit of a problem with the numbers the computer spits out at me and figuring out what they mean, but I’m not that bad.
So this summer, when I saw this article in the New York Times, To Be Young (Like 9, on Average?) and Homeless, I thought it was interesting because it talked about how statistics can be manipulated and used in ways that are not exactly ethical. I filed the article away in my memory and did not think much more about it.
That is, until I got a flier on my door that said this:
Um, see, now I know that’s not true.
So I emailed the agency listed on the flier and asked them where they got their statistic.
They told me they got it from a statewide agency.
So I emailed the statewide agency and asked them where they got it.
They told me they got it from a national agency, the Coalition for the Homeless, that had put out a video with the statistic and that the statistic had been used in a variety of news reports across the nation. They also told me it was a national statistic and not a state one.
Well, now the story is getting interesting.
I replied telling them it is being used as a state statistic and they should probably look into that. I also told them I knew about the video (the New York Times article was all about the video) and wanted to know where they got their data and how it was collected. Just because the agency is bigger or national doesn’t make the numbers they put out any more valid.
I got another reply today.
When it [the statistic] came out a couple of years ago, this statistic was quickly picked up by most everyone who works with homeless persons. When our development department started to quote the statistic, I did some of my own calculations and determined it was virtually impossible to be true, as it was stated. At that point, and since, we have tried to find the real source and identify raw data that resulted in the statistic, but have not been able to do so. I indicated in my email that “we have become more cautious about using it” – that is code word to explain that we removed any reference to this from our materials and presentations between 18 & 24 months ago. It is possible that someone who speaks on our behalf picked it up at some point and has used it since.
Outside of our agency, it is still widely used, most recently reported by ABC news stations stating that it was an ABC News study – can’t quite figure out what they studied.
It’s apparent that, at some point, the [local agency] heard that statistic from us – it must be sometime 1 1/2 to 2 years ago. How it got turned into an Arizona statistic rather than a nationwide statistic, I don’t know, since it was never used by us in that way. Before receiving your email, I was not aware that they were using it in print anywhere. We have now contacted them to request that they change their materials to use well established statistics.
So my mission in life, to change the world by correcting faulty statistics, is off to a good start.
Morons.
Oh, the ways that one can lie with statistics!
The most common stat cheats I’ve seen include:
1. “Massaging” the number of standard deviations around the mean used as the cutoff (this changes what would and would not pass a “Z” test). Used a lot when organizations that have an agenda do “research”. It’s a sneaky way of making an outcome look more or less likely than it really would be.
2. Changing the confidence level: for the same margin of error, the higher the confidence level, the bigger the sample size required. No sense actually doing the research and gathering the sample size to get a 98 percent confidence level when 85 or 90 percent will do! Political polling companies do this all the time.
3. Focusing on statistical outliers / not properly sanitizing the data sample. Involves running statistical operations that assume a normal distribution on data that has not been cleaned or checked against that assumption. Goes hand-in-hand with the two above.
Sounds like the folks that published this work of fiction might be guilty of all three!
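To put rough numbers on point 2, here is a minimal sketch in Python. It assumes a simple random sample and the textbook sample-size formula for estimating a proportion, n = z^2 * p * (1 - p) / e^2. The function name `sample_size`, the 3-point margin of error, and the worst-case p = 0.5 are all made up for illustration and have nothing to do with the flier's data.

```python
from math import ceil
from statistics import NormalDist


def sample_size(confidence_level, margin_of_error, p=0.5):
    """Approximate sample size for estimating a proportion (hypothetical helper)."""
    # Two-sided critical value, e.g. about 1.96 for a 95 percent confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # Textbook formula n = z^2 * p * (1 - p) / e^2; p = 0.5 is the worst case.
    return ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)


# Hypothetical poll: same 3-point margin of error at three confidence levels.
for level in (0.85, 0.90, 0.98):
    print(f"{level:.0%} confidence: n = {sample_size(level, 0.03)}")
```

The required sample grows as the confidence level goes up, which is exactly why reporting a lower confidence level is the cheaper option.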
Again with statistics being off, there’s this blog entry about how you define parents: The Loaded Language of Parenting.
The comments really get to it. When I am no longer a student and no longer being paid as a graduate assistant, but am instead a wife and mother, do I put “unemployed” on forms and make the numbers in that category rise? “Unemployed” generally means, at least to me, someone who is relying on society for their support and would like to be in the work force but can’t. But that doesn’t describe someone who is neither relying on society nor wanting to be in the work force. Makes me wonder just how off “unemployment” numbers can be.
Those in high public offices (e.g. President of the United States) should certainly be careful of the statistics they use to try to make their points, especially when those statistics can be so easily analyzed and shown to be wrong.
Examining Obama’s Education Numbers
That’s all I’m saying.