Statistics
The Air Force Research and Development Center describes statistics as “the science of collecting, analyzing, presenting, and interpreting data.”
The Babylonians were among the first to use statistics in 1600 BCE. They applied the results of their census to determine how much food was needed to feed their population. Later civilizations used census records to assess and collect taxes, and to raise armies.
The U.S. Constitution empowers the government to conduct a census every ten years in order to apportion direct taxes and representation in the House of Representatives. Today, census statistics are also used by the government to track trends in the U.S. economy, and to allocate appropriations for things like schools, roads, and entitlement programs. You can learn more about this at the US Census website at https://www.census.gov/.
In statistics, we talk about special concepts, like “randomness”, “distribution” and “sampling”. Something is random if its value cannot be predicted in advance; in the simplest case, it is equally likely to take on any of its possible values. Suppose we have ten balls numbered sequentially from 1 to 10. If we put all the balls in an urn and select one (without looking!), then it is equally likely to be any value from 1 to 10.
This brings us to the idea of a distribution. A distribution describes how frequently each of the possible values occurs. In this example, we say the distribution is uniform because every possible number is equally likely to be selected. We can express this idea using probability notation: p(selecting a 1) = p(selecting a 2) = … = p(selecting a 10) = 1/10.
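The urn example can be checked with a short simulation. The following is a minimal Python sketch, not a definitive implementation: the function names, the seed, and the number of draws are all assumptions chosen for illustration. It draws a ball many times (returning it to the urn after each draw) and reports how often each number comes up.

```python
import random
from collections import Counter


def draw_ball(rng, n_balls=10):
    """Draw one ball numbered 1..n_balls; every ball is equally likely."""
    return rng.randint(1, n_balls)


def relative_frequencies(n_draws, n_balls=10, seed=0):
    """Draw n_draws times (replacing the ball each time) and return the
    fraction of draws on which each ball number appeared."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs repeat
    counts = Counter(draw_ball(rng, n_balls) for _ in range(n_draws))
    return {ball: counts[ball] / n_draws for ball in range(1, n_balls + 1)}


if __name__ == "__main__":
    for ball, freq in relative_frequencies(100_000).items():
        print(f"ball {ball:2d}: {freq:.3f}")
```

With a large number of draws, each relative frequency should settle close to 1/10, which is what a uniform distribution predicts.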
One of the most common distributions is the normal distribution. When it is plotted, the X coordinate represents the value the sample takes on, and the Y coordinate represents the frequency with which this value occurs. The resulting bell-shaped curve lets us see the mean, or average, at its peak, and shows that there will be values both greater than and less than the mean, occurring less frequently the further they lie from the mean.
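To see the bell shape emerge, we can draw many values from a normal distribution and count how many fall into each interval along the X axis. The sketch below is illustrative only: the mean of 0, the standard deviation of 1, the bin edges, and the function names are assumptions, not values taken from the text.

```python
import random
import statistics


def sample_normal(n, mean=0.0, stdev=1.0, seed=0):
    """Return n values drawn from a normal distribution."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs repeat
    return [rng.gauss(mean, stdev) for _ in range(n)]


def histogram(values, low, high, n_bins):
    """Count how many values fall into each of n_bins equal-width bins
    covering [low, high). Values outside that range are ignored."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if low <= v < high:
            index = min(int((v - low) / width), n_bins - 1)
            counts[index] += 1
    return counts


if __name__ == "__main__":
    values = sample_normal(50_000)
    print(f"sample mean: {statistics.mean(values):.3f}")
    counts = histogram(values, low=-4.0, high=4.0, n_bins=8)
    for i, count in enumerate(counts):
        left = -4.0 + i
        # One '#' per 500 values gives a rough sideways picture of the curve.
        print(f"[{left:+.0f}, {left + 1:+.0f}) {'#' * (count // 500)}")
```

The bins nearest the mean collect the most values, and the counts fall away on both sides, which is the pattern the curve describes.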
Sampling is the way you collect data values during your statistical experiment. You need to know what kinds of information you want to collect, what the sources of your information will be, what sort of sampling will be used (random sampling is one desirable method), how the data will be collected (e.g., via questionnaire, door-to-door survey, etc.), when and how often the data will be collected, what you plan to compare the data to, and how you intend to analyze the data. You want to design your sampling strategy to eliminate as many biases as you can.
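A small simulation can illustrate why random sampling is desirable. The sketch below is built entirely on assumptions: the population is made-up data (values with a mean near 50, stored in sorted order to mimic a list organized by some characteristic), and the function names are hypothetical. It compares a simple random sample with a "convenience" sample that just takes the first entries on the list.

```python
import random
import statistics


def make_population(size=10_000, seed=0):
    """Build a made-up population of values, stored in sorted order."""
    rng = random.Random(seed)
    return sorted(rng.gauss(50, 10) for _ in range(size))


def simple_random_sample(population, n, seed=1):
    """Pick n members so that every member has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def convenience_sample(population, n):
    """Take the first n members on the list: easy to do, but biased here
    because the list is sorted."""
    return population[:n]


if __name__ == "__main__":
    population = make_population()
    print(f"population mean:         {statistics.mean(population):.2f}")
    print(f"random sample mean:      "
          f"{statistics.mean(simple_random_sample(population, 500)):.2f}")
    print(f"convenience sample mean: "
          f"{statistics.mean(convenience_sample(population, 500)):.2f}")
```

The random sample's mean should land close to the population mean, while the convenience sample, drawn only from one end of the list, falls well below it.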
Poor sampling strategies have been the downfall of many statistical studies. They can produce unreliable results, invalidate your conclusions, and, if published, tarnish your reputation. Smart researchers consult statisticians not just on the sampling strategy, but on the overall methodology for their study, in hopes of avoiding such traps.
What have we learned about statistics? Statistics is not all that hard to understand. Statistical concepts like randomness, distributions, and sampling are straightforward. But statistics can be tricky to use correctly. Poorly designed and executed statistical studies can lead to spurious results. While most respectable researchers try to conduct reliable studies, biases can sneak in. Worse still, there are individuals out there who have every intention of misleading with statistics . . .