Sometimes statistics* seems magical. Of course, there isn’t really anything magical going on—but the fact that statistical methods can be used to pull useful information out of a hatful of messy data is, at a minimum, remarkable. And, as we’ll talk about today and next time, it’s extremely useful too.
*Statistic Disclaimer
Before we get started, I want to point out that the things called statistics that we’re going to talk about today are a part of, but different from, the field of statistics, which is the science of collecting, sorting, organizing, and generally making sense of data.
A Statistical Thought Experiment
Okay, with that out of the way, let’s start off with a thought experiment. Imagine you’re handed a bag containing a bunch of tiles that are all carved into the shapes of integers. Someone else prepared the bag, so you have no idea how many total integer tiles are in it. You do, however, know that the first tile put in the bag was shaped like the number 1, the second like the number 2, and so on, and that the last tile put in the bag was, therefore, in the shape of the total number of tiles. Your task in this experiment is to randomly pull six integer tiles out of the bag, and then to use these integers to estimate the total number of tiles in the bag (which, remember, you don’t know beforehand). And, if you think about it for a minute, you’ll see that this is identical to asking you to estimate the value of the largest integer in the bag. So, how do you do it? No big surprise, the answer has something to do with today’s main topics: statistics and estimators.
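To make the setup concrete, here is a minimal Python sketch of the bag. The function names make_bag and draw_tiles, and the bag size of 50, are assumptions made up for illustration; in the actual experiment you wouldn't know the size.

```python
import random

def make_bag(total_tiles):
    """Return tiles numbered 1 through total_tiles, like the bag in the thought experiment."""
    return list(range(1, total_tiles + 1))

def draw_tiles(bag, how_many=6, rng=None):
    """Randomly pull how_many distinct tiles out of the bag (no tile is drawn twice)."""
    if rng is None:
        rng = random.Random()
    return rng.sample(bag, how_many)

bag = make_bag(50)  # 50 is a made-up size; you wouldn't know it in the experiment

# The number of tiles equals the largest integer in the bag,
# so estimating one is the same as estimating the other.
print(len(bag) == max(bag))  # True

sample = draw_tiles(bag, rng=random.Random(0))  # seeded only so the run is repeatable
print(sample)
```

Because the tiles are numbered 1, 2, 3, and so on with no gaps, the check on the tile count holds for any bag size you pass to make_bag.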
What is a Statistic?
A statistic is a quantity calculated from a sample of data that tells us something about the properties of that sample. To help us better understand what this means, let’s go back and think about the bag of integer-shaped tiles. In that example, the entire group of integers in the bag is called the “population,” and the six integers you pulled out of the bag are called a “sample.” There are, of course, many possible samples besides the one you pulled. For example, you could have pulled six entirely different integers. That still would have been a sample drawn from the population, but it would have been a different sample. So, in this case, a statistic is some number you can calculate from the six integers you pulled out of the bag that tells you something about those numbers.
Okay, so what does an actual statistic look like? Well, the minimum value of your six-integer sample is one example of a statistic, and the maximum value of that sample is another. And these statistics can be used to infer information about the six-integer sample. For example, if you subtract the minimum value from the maximum value, you learn the range of the sample. Taken by themselves, though, neither of these statistics tells you much about the population of tiles as a whole. For example, the range of the entire population could be very different from the range of the sample, since the bag could contain much higher or lower integers than are contained in the sample. So, is there some way to learn about an entire population from only a sampling of its data? There is: it’s called an estimator.
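As a quick illustration, here is a short Python sketch that computes these statistics. The six sample values and the helper name sample_range are made up for the example.

```python
def sample_range(sample):
    """Range of a sample: its maximum value minus its minimum value."""
    return max(sample) - min(sample)

sample = [7, 19, 23, 31, 38, 42]  # a made-up sample of six tiles

print(min(sample))           # 7, the sample minimum
print(max(sample))           # 42, the sample maximum
print(sample_range(sample))  # 35, the sample range
```

Notice that each of these numbers describes only the six tiles in hand; nothing in the calculation knows how many tiles are still in the bag.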
What are Estimators?
An estimator is a rule telling you how to calculate a special type of statistic that tells you not only about the properties of a sample of data, but also about the properties of the entire population from which the sample was drawn. For example, the mean value of the sample of six integers in our thought experiment is called the “sample mean,” and the sample mean is an estimator for what’s called the “population mean”—which, logically, is the mean value of the entire population. Why is the sample mean an estimator for the population mean? Basically, it’s because as long as the sample of integers was drawn out of the bag randomly, the sample should give a fair representation of the overall variety of values in the entire population—so the two mean values will be similar. Think of an estimator as something that prescribes a rule or algorithm for calculating its value. For example, the estimator for the population mean is a rule that says to add up all the values in the sample, and then divide this number by the number of objects in the sample—exactly as you would expect.
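Here is a rough Python sketch of the sample mean rule described above, tried out on a bag whose contents we secretly know. The bag size of 50, the seed, and the function name sample_mean are assumptions chosen only for illustration.

```python
import random

def sample_mean(sample):
    """Add up all the values in the sample, then divide by how many there are."""
    return sum(sample) / len(sample)

# A hypothetical bag of 50 tiles, so the population mean can be checked directly.
population = list(range(1, 51))
population_mean = sum(population) / len(population)  # 25.5
print("population mean:", population_mean)

rng = random.Random(42)  # seeded only so the run is repeatable
for _ in range(3):
    sample = rng.sample(population, 6)  # six tiles, drawn without replacement
    print(sample, "-> sample mean:", sample_mean(sample))
```

Each run of the loop is a different sample, so each sample mean is a slightly different estimate of the same population mean.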
The Population Maximum
Okay, but what do estimators have to do with our problem? How might they help us estimate the largest integer in the bag? Well, think about it: we have a sample of data from a population. That is, we have six numbers (the sample) pulled from all the integers in the bag (the population). And we want to use these six numbers in our sample to calculate a quantity (that is, a statistic) that we can then use to infer some piece of information about the entire population (in other words: how big is it?). What is a rule for calculating a quantity like this called? Well, it’s an estimator! So, we need to find an estimator rule that will help us estimate the value of the largest integer in the bag using only the values in the sample. The value we’re after, the largest integer in the bag, is called the population maximum.
So, what rule do we use to estimate the population maximum? Well, unfortunately, the answer is going to have to wait until next time because we’re out of time for today. But, in the meantime, think about what rule such an estimator might use. Perhaps the best estimate of the population maximum is found by taking two times the biggest number in the sample? Or maybe it’s twice the mean or median of all the numbers in the sample? What do you think? Give it some thought, and then be sure to check out the next article to find out. And, just in case you’re thinking all this estimator business is a bunch of esoteric nonsense, next time we’re also going to talk about a real world example known as the “German tank problem” from World War II that will show you that they are very real indeed.
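If you’d like to play with those guesses before next time, here is a small Python sketch that computes each candidate for one sample. The sample values and the function name candidate_estimates are made up, and the sketch makes no claim about which rule, if any, is the right one.

```python
from statistics import mean, median

def candidate_estimates(sample):
    """Compute the guesses floated above for the population maximum."""
    return {
        "twice the sample maximum": 2 * max(sample),
        "twice the sample mean": 2 * mean(sample),
        "twice the sample median": 2 * median(sample),
    }

sample = [7, 19, 23, 31, 38, 42]  # a made-up sample of six tiles

for rule, estimate in candidate_estimates(sample).items():
    print(f"{rule}: {estimate}")
```

Try it with a few different samples and see how much the three guesses disagree with one another.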
Wrap Up
Please email your math questions and comments to................ You can get updates about the Math Dude podcast, the “Video Extra!” episodes on YouTube, and all my other musings about math, science, and life in general by following me on Twitter. And don’t forget to join our great community of social networking math fans by becoming a fan of the Math Dude on Facebook.
Until next time, this is Jason Marshall with The Math Dude’s Quick and Dirty Tips to Make Math Easier. Thanks for reading, math fans!