What Is a Stratified Random Sample?
A stratified random sample is a means of gathering information about collections of specific target audiences or demographics. These samples are meant to be representative only of the specific demographics being targeted, though a sampled demographic may be representative of that entire demographic within the population.
The data is the result of a type of sampling procedure that is both stratified and probabilistic. Stratified random samples also are known as proportional random samples or quota random samples. To understand what this means, it first is important to break down the terms involved.
What is a sample?
A sample is a mini-representation of a larger population. Samples can be determined informally or formally. But samples that are systematically developed according to certain scientific methods are generally perceived as being more useful for making generalizations about the larger population.
What does stratified mean?
Stratified samples consist of homogeneous subgroups that are considered to be distinct in important ways. A collection of these homogeneous subgroups is referred to as strata. This method of sampling procedures enables the population to be dividing into homogeneous subgroups from which simple random samples may be selected.
Why is a stratified sample useful?
The aim of stratified random sampling is to select participants from different subgroups who are believed to have relevance to the research that will be conducted. For instance, the results of a study could be influenced by the subjects’ attributes, such as their ages, gender, work experience level, racial and ethnic group, economic situation, level of education attained, and so forth. A stratified sample is constructed so that these potentially influential characteristics can be reasonably assumed to reflect the pattern of these characteristics in the overall population.
In this way, the sample reflects the population from which it has been taken, but the sample cannot be said to be representative of the larger population. Remember, the selection of members of a stratified sample is not a random process. That said, once the strata have been established, simple random sampling is used to select the members of the samples for each stratum.
What does probabilistic mean?
A stratified random sample is probabilistic because every method used to select the sample population provides a reasonably reliable way of estimating how representative the sample population is to the larger population from which the sample was selected. In other words, a probabilistic sample permits a researcher to estimate the odds that the sample selected does or does not represent the larger population from which the sample was drawn.
Stratified random sampling methods often are used when there is interest in the differences between homogeneous subgroups and the larger sample population as a whole.
Let’s say that a population of business clients can be divided into three groups: Generation X, millennials, and baby boomers. Moreover, we have reason to believe that both the Gen Xers and the millennials are relatively smaller minorities of the overall business clientele. Gen Xers make up about 5 percent of the overall population of the clientele, and millennials make up about 10 percent of the clientele. A simple random sample of 100 members (n = 100) might generate 5 Gen Xers and 10 millennials if we used a sampling fraction of 10 percent.
It would be possible—just by chance—to get even fewer Gen Xers and fewer millennials than that in the sample. Stratification is likely to produce more representative outcomes. Say we want to have at least 25 people in each group. If we still take a sample of 100 (n = 100), then we can sample 25 Gen Xers, 25 millennials, and 50 baby boomers.
We know that 10 percent of the population is millennials (or about 100 of our clients). A random sample of 25 clients will give a within-stratum sampling fraction of 25/100 or 25 percent. We also know that 5 percent of the 50 clients who are not baby boomers are Gen Xers. This means that the within-stratum fraction will be 25/50 or 50 percent. So, 50 Gen Xers plus 100 millennials is a total of 150 of our client sample. Since the overall client population is 1,000, we subtract the Gen Xers and millennials (a total of 150 clients) which leaves 850 clients who are baby boomers.
The within-stratum sampling fraction for the baby boomers is 50/850 or about 5.88 percent.
Two things are evident:
- The three groups are more homogeneous within-group than across the whole population. This means there is less variance, which provides the opportunity for greater statistical precision.
- Since the sample has been stratified, there will be enough members from each group to be able to make meaningful subgroup inferences.
Stratified sampling might be preferred over simple random sampling when it is important to represent the overall population and to represent the key subgroups of the population, especially when the subgroups are quite small but distinguished in important ways. By using stratified sampling methods, a researcher can effectively assure that subgroups can be differentiated in the discussion of the research findings.