Descriptive vs. Inferential Statistics: An Intuitive Note
- RDSTATISTICS

- Sep 2, 2020
- 3 min read
- Updated: Dec 24, 2020
The tussle between descriptive statistics and inferential statistics is justified because we are at the mercy of the circumstances under which the data arise. Let's get into an intuitive note on descriptive statistics vs. inferential statistics. Descriptive statistics summarize data for a group that you choose; this process allows you to understand that specific set of observations.
Descriptive statistics describe a sample. That’s pretty straightforward. You simply take a group that you’re interested in, record data about the group members, and then use summary statistics and graphs to present the group properties. With descriptive statistics, there is no uncertainty because you are describing only the people or items that you actually measure. For instance, if you measure test scores in two classes, you know the precise means for both groups and can state, with no uncertainty, which one has the higher mean. You’re not trying to infer properties about a larger population based on your samples.
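As a minimal sketch in Python (the scores below are made up purely for illustration), describing two classes just means summarizing the students you actually measured:

```python
# Hypothetical test scores for two classes (made-up numbers).
import statistics

class_a = [72, 85, 90, 66, 78, 88]
class_b = [81, 79, 95, 70, 84, 77]

# Summary statistics describe exactly these observations - no uncertainty involved.
print("Class A mean:", statistics.mean(class_a))
print("Class B mean:", statistics.mean(class_b))
print("Class A std dev:", statistics.stdev(class_a))
print("Class B std dev:", statistics.stdev(class_b))
```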
However, if you want to draw inferences about a population, there are many more issues you need to address. We’re now moving into inferential statistics. Drawing inferences about a population is particularly important in science where we want to apply the results to a larger population, not just the specific sample in the study. For example, if we’re testing a new medication, we don’t want to know that it works only for the small, select experimental group. We want to infer that it will be effective for a larger population. We want to generalize the sample results to people outside the sample who share the characteristics of the experimental group.
Inferential statistics takes data from a sample and makes inferences about the larger population from which the sample was drawn. Consequently, we need to have confidence that our sample accurately reflects the population. This requirement affects our process. At a broad level, we must do the following:
- Define the population we are studying.
- Draw (select) a representative sample from that population.
- Use analyses that incorporate the sampling error.
We don’t get to pick a convenient group. Instead, random sampling allows us to have confidence that the sample represents the population. This process is a primary method for obtaining samples that reflect the population on average. Random sampling produces statistics, such as the mean, that do not tend to be too high or too low. Using a random sample, we can generalize from the sample to the broader population.
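As a rough illustration (the population here is simulated, not taken from any real data), defining a population and drawing a simple random sample from it might look like this:

```python
# Sketch: define a population, then draw a simple random sample from it.
import random

random.seed(42)

# Step 1: define the population (here, 10,000 simulated exam scores).
population = [random.gauss(70, 10) for _ in range(10_000)]

# Step 2: draw a representative sample via simple random sampling.
sample = random.sample(population, k=100)

sample_mean = sum(sample) / len(sample)
population_mean = sum(population) / len(population)
print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean:     {sample_mean:.2f}")
```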
While samples are much more practical and less expensive to work with, there are tradeoffs. Typically, we learn about the population by drawing a relatively small sample from it. We are a very long way off from measuring all people or objects in that population. Consequently, when you estimate the properties of a population from a sample, the sample statistics are unlikely to equal the actual population value exactly. For instance, your sample mean is unlikely to equal the population mean. The difference between the sample statistic and the population value is the sampling error.
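A small simulation (again with made-up data) makes this concrete: each sample mean lands near, but rarely exactly on, the population mean, and the gap between them is the sampling error.

```python
# Sketch: sampling error = sample statistic - population value (simulated data).
import random

random.seed(0)
population = [random.gauss(100, 15) for _ in range(50_000)]
population_mean = sum(population) / len(population)

# Draw independent samples of increasing size and see how far each
# sample mean lands from the population mean.
for n in (10, 100, 1000):
    sample = random.sample(population, k=n)
    sample_mean = sum(sample) / len(sample)
    error = sample_mean - population_mean
    print(f"n={n:4d}  sample mean={sample_mean:7.2f}  sampling error={error:+.2f}")
```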
You gain tremendous benefits by working with a random sample drawn from a population. In most cases, it is simply impossible to measure the entire population to understand its properties. The alternative is to gather a random sample and then use hypothesis testing to analyze the sample data. However, a crucial point to remember is that hypothesis tests make assumptions about the data collection process. For instance, these tests assume that the data were collected using a method that tends to produce representative samples. After all, if the sample isn’t similar to the population, you won’t be able to use the sample to draw conclusions about the population.
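As a sketch of that last step, assuming SciPy is available, a two-sample t-test on hypothetical treatment and control data might look like the following; the p-value is only meaningful if the samples were collected in a way that tends to produce representative samples.

```python
# Sketch: hypothesis test on sample data (hypothetical treatment/control outcomes).
import random
from scipy import stats

random.seed(1)
treatment = [random.gauss(5.5, 1.0) for _ in range(30)]  # simulated outcomes
control = [random.gauss(5.0, 1.0) for _ in range(30)]

t_stat, p_value = stats.ttest_ind(treatment, control)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# A small p-value suggests the observed difference in sample means is unlikely
# under the null hypothesis that the population means are equal.
```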
Thanks for Reading :)


