Sample: What It Means in Statistics, Types, and Examples (2024)

What Is a Sample?

The term sample refers to a smaller, manageable version of a larger group. It is a subset containing the characteristics of a larger population. Samples are used in statistical testing when population sizes are too large to include all possible members or observations. A sample should represent the population as a whole and not reflect any bias toward a specific attribute. There are several sampling techniques used by researchers and statisticians, each with benefits and drawbacks.

Key Takeaways

A sample is used in statistics as an analytic subset of a larger population.
Using samples allows researchers to conduct their studies in a timely manner with more manageable data.
Randomly drawn samples do not have much bias if they are large enough, but achieving such a sample may be expensive and time-consuming.
In simple random sampling, every entity in the population has an equal chance of being selected, while stratified random sampling divides the overall population into smaller groups.

Understanding Samples

A population is the entire set of observations (e.g., individuals, animals, items, data, etc.) contained in a given group or context. A sample is a portion, part, or fraction of the whole group, and acts as a subset of that population. Samples are used in a variety of settings where research is conducted. Scientists, marketers, government agencies, economists, and research groups are among those who use samples for their studies and measurements.

Using whole populations for research comes with challenges. Researchers may have problems gaining access to entire populations. And, because of the nature of some studies, researchers may have difficulties getting the results they need in a timely fashion. This is why samples are used. Using a smaller group to represent the entire population can still produce valid results while reducing time and resources.

Samples must resemble the broader population to make accurate inferences or predictions. All the participants in the sample should share the characteristics that define the population being studied. So, if the study is about male college freshmen, the sample should consist of a small percentage of the males who fit this description. Similarly, if a research group conducts a study on the sleep patterns of single women over 50, the sample should only include women within this demographic.

Special Considerations

Consider a team of academic researchers who want to know how many students studied for less than 40 hours for the CFA exam and still passed. Since more than 200,000 people take the exam globally each year, reaching out to every exam participant would burn time and resources.

In fact, by the time the data from the population is collected and analyzed, a couple of years would have passed, making the analysis worthless since a new population would have emerged. What the researchers can do instead is take a representative sample of the population and get data from this sample.

To achieve an unbiased sample, the selection has to be random so everyone from the population has an equal chance of being added to the sample group. This is similar to a lottery draw and is the basis for simple random sampling.

Sampling Methods

Sampling methods refer to the way samples are chosen from the general population. Researchers can use one of two sampling methods to conduct their studies:

Probability Sampling: There is no deliberate choice in probability sampling. That's why it's also referred to as random sampling. Although random selection limits bias, probability sampling can be time-consuming and, at times, costly.
Non-Probability Sampling: Researchers who use this sampling method deliberately choose their samples. This makes it a non-random sampling method. Since it isn't random, only a certain portion of the population has a chance to participate in the study. Samples are chosen based on certain factors, including location or convenience.

Types of Sampling

Now that you know the methods of sampling, it's important to understand the different types of sampling that statisticians and researchers can use. We've highlighted just a few kinds of sampling below.

Simple Random Sampling

Simple random sampling gives every entity in the population an equal chance of being selected. It works well when the population is relatively uniform with respect to what is being studied. If the researchers don’t care whether their sample subjects are all male or all female or a combination of both sexes in some form, simple random sampling may be a good selection technique.

Let's say 200,000 test-takers sat for the CFA exam in 2021, out of which 40% were women and 60% were men. A random sample of 1,000 test-takers drawn from this population should, therefore, contain roughly 400 women and 600 men, although the exact split will vary from draw to draw.
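As a rough illustration of the example above, here is a minimal Python sketch of a simple random draw. The population list, the "F"/"M" labels, the function name, and the seed are all assumptions made for illustration; only the 200,000 total and the 40/60 split come from the text.

```python
import random

# Hypothetical population mirroring the figures in the text:
# 200,000 test-takers, 40% women ("F") and 60% men ("M").
population = ["F"] * 80_000 + ["M"] * 120_000


def simple_random_sample(items, n, seed=None):
    """Draw n items without replacement; every item is equally likely."""
    rng = random.Random(seed)  # seeded only so the sketch is repeatable
    return rng.sample(items, n)


sample = simple_random_sample(population, 1_000, seed=42)
share_women = sample.count("F") / len(sample)
print(f"Women in sample: {share_women:.1%}")  # near 40%, rarely exactly 40%
```

Running the draw with different seeds shows the share of women hovering around 40% without being pinned to it, which is the sense in which a simple random sample only approximately mirrors the population.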

Systematic Sampling

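Systematic sampling is a form of probability sampling. Like simple random sampling, it starts from a random choice, but it then selects members at a fixed periodic interval: researchers pick a random starting point and take every member that falls on the interval after it. They calculate the interval by dividing the total population by the required sample size. For example, a sample of 1,000 from 200,000 test-takers gives an interval of 200.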

Compared with simple random sampling, systematic sampling is often more efficient when it comes to time and cost. Because selection follows a fixed rule, there is also a lower risk of data being manipulated.

This type of sampling is best used when:

There is some order in the population
The population is large and known, especially when time and resources are limited
The sample needs to be evenly spread across the population
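The interval calculation described in this section can be sketched in Python as follows. This is a minimal illustration, assuming the population is available as an ordered list; the roster of candidate IDs, the function name, and the seed are hypothetical.

```python
import random


def systematic_sample(items, n, seed=None):
    """Pick a random start, then take every k-th item, where k = N // n."""
    k = len(items) // n  # sampling interval
    if k < 1:
        raise ValueError("sample size cannot exceed population size")
    start = random.Random(seed).randrange(k)  # random start in [0, k)
    return [items[start + i * k] for i in range(n)]


# Hypothetical ordered roster of 200,000 candidate IDs.
roster = list(range(1, 200_001))
picked = systematic_sample(roster, 1_000, seed=7)
print(picked[:3])  # three IDs spaced 200 apart
```

Only the starting point is random; every later pick is determined by the interval. That is why a hidden periodic pattern in the ordering of the list can skew a systematic sample.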

Stratified Random Sampling

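But what about cases where knowing the ratio of men to women who passed a test after studying for less than 40 hours is important? Here, a stratified random sample would be preferable to a simple random sample. Stratified random sampling divides the overall population into smaller groups, known as strata, and draws a random sample from each stratum in proportion to its share of the population. The same approach works for any attribute. If age were the factor of interest, the strata for a sample of 1,000 could look like this: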

Strata (Age)	Number of People in Population	Number to Be Included in Sample
20-24	30,000	150
25-29	70,000	350
30-34	40,000	200
35-39	30,000	150
40-44	20,000	100
>44	10,000	50
Total	200,000	1,000
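The sample counts in the table above follow from proportional allocation: each stratum contributes the same fraction of the sample that it makes up of the population. The Python sketch below reproduces those counts and then draws randomly within each stratum. The function names and the use of integer ranges as stand-ins for people are assumptions for illustration.

```python
import random

# Stratum sizes copied from the table above.
strata_sizes = {
    "20-24": 30_000,
    "25-29": 70_000,
    "30-34": 40_000,
    "35-39": 30_000,
    "40-44": 20_000,
    ">44": 10_000,
}


def proportional_allocation(sizes, sample_size):
    """Give each stratum a share of the sample equal to its population share.

    Rounding can make the parts miss sample_size by one or two in general;
    with the table's figures the shares happen to be whole numbers.
    """
    total = sum(sizes.values())
    return {name: round(sample_size * size / total) for name, size in sizes.items()}


def stratified_sample(members_by_stratum, sample_size, seed=None):
    """Draw a simple random sample inside each stratum."""
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in members_by_stratum.items()}
    allocation = proportional_allocation(sizes, sample_size)
    return {
        name: rng.sample(members, allocation[name])
        for name, members in members_by_stratum.items()
    }


allocation = proportional_allocation(strata_sizes, 1_000)
print(allocation)

# Hypothetical members: integer IDs standing in for people in each stratum.
members = {name: range(size) for name, size in strata_sizes.items()}
drawn = stratified_sample(members, 1_000, seed=3)
```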

Cluster Sampling

Cluster sampling is a form of random sampling. Clusters are defined as different subsets of the larger population. Individual samples within the cluster have similar characteristics. Cluster sampling is commonly used when there are large populations that are spread out, making it expensive and time-consuming to study each subject.

There are a few steps to cluster sampling:

Understand and identify the population that is being studied.
Create the clusters. This means dividing the entire population into groups.
Select the sample by randomly choosing clusters and, if needed, randomly choosing subjects within them.
Conduct the study by interviewing the sampled subjects, then collect and analyze the data.
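The steps above can be sketched in Python as a two-stage draw: first choose clusters at random, then choose subjects at random inside each chosen cluster. This is one common variant rather than the only way to do cluster sampling. The test centers, the ID format, and the function name are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Two-stage cluster sample.

    Stage 1: pick n_clusters cluster names at random.
    Stage 2: pick up to n_per_cluster members at random from each one.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }


# Hypothetical population: 500 test centers with 400 candidates each.
test_centers = {
    f"center_{i:03d}": [f"c{i:03d}-{j:03d}" for j in range(400)]
    for i in range(500)
}

picked = cluster_sample(test_centers, n_clusters=5, n_per_cluster=40, seed=1)
for name, people in picked.items():
    print(name, len(people))
```

Researchers only need to visit 5 of the 500 centers here, which is where the savings in time and money come from. The trade-off is that the result depends heavily on which clusters happen to be chosen.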

As noted above, cluster sampling can save time and money. But, there are certain disadvantages to using this type of sampling. For instance, researchers may be biased when they choose their clusters and samples. As such, the samples may not accurately represent the population at large.

Examples of Samples

In 2022, the population of the world was nearly 7.95 billion, out of which about 49.7% were female and 50.3% were male. The total number of people in any given country can also be a population size. The total number of students in a city can be taken as a population, and the total number of dogs in a city is also a population size. Samples can be taken from these populations for research purposes.

Following our CFA exam example, the researchers could take a sample of 1,000 CFA participants from the total 200,000 test-takers (the population) and run the required analysis on this group. The mean of this sample would be used to estimate the average among all CFA exam takers who passed even though they studied for less than 40 hours.

The sample group taken should not be biased. This means that if the sample mean of the 1,000 CFA exam participants is 50, the population mean of the 200,000 test takers should also be approximately 50.

Why Do Analysts Use Samples Instead of Measuring the Population?

Often, a population is too large or extensive to measure every member, and doing so would be expensive and time-consuming. A sample allows for inferences to be made about the population using statistical methods.

What Is a Simple Random Sample?

This sampling method uses respondents or data points that are randomly selected from the larger population, with each having an equal chance of being chosen. Random selection guards against selection bias, and a large enough sample size keeps sampling error small.

Why Do Random Samples Allow for Inference?

The laws of statistics imply that accurate measurements and assessments can be made about a population by using a sample. Analysis of variance (ANOVA), linear regression, and more advanced modeling techniques are valid because of the law of large numbers and the central limit theorem.
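A small simulation can make this concrete. The Python sketch below draws repeated random samples of different sizes from a made-up population and measures how much the sample means vary. The population values (hours studied, uniform between 0 and 300) are invented purely for illustration and are not data from the text.

```python
import random
import statistics

rng = random.Random(0)  # seeded only so the sketch is repeatable

# Hypothetical population: hours studied by 200,000 candidates (made-up values).
population = [rng.uniform(0, 300) for _ in range(200_000)]
true_mean = statistics.fmean(population)


def spread_of_sample_means(n, repeats=200):
    """Standard deviation of the sample mean across repeated random samples."""
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)


spread = {n: spread_of_sample_means(n) for n in (10, 100, 1_000)}
for n, s in spread.items():
    print(f"n={n:>5}: sample means vary by about ±{s:.2f} hours around {true_mean:.1f}")
```

The spread shrinks as the sample grows, roughly in proportion to one over the square root of the sample size. A sample of 1,000 therefore pins down the population mean far more tightly than a sample of 10.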

How Large of a Sample Do You Need?

This will depend on the size of the population and the type of analysis you'd like to do (e.g., what confidence intervals you are using). Power analysis is a technique for mathematically evaluating the smallest sample size needed based on your needs. A common rule of thumb is that your sample should be large enough to support the analysis, but no more than 10% as large as the population.
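As one concrete case, the sketch below applies a widely used textbook formula for estimating a proportion within a chosen margin of error, with an optional finite population correction. It is not a full power analysis, and the function name and default values are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence=0.95, p=0.5, population=None):
    """Smallest n that estimates a proportion within +/- margin.

    Uses n0 = z^2 * p * (1 - p) / margin^2, where z is the normal critical
    value for the chosen confidence level. p = 0.5 is the most conservative
    guess. If a population size is given, a finite population correction
    is applied.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z**2 * p * (1 - p) / margin**2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


print(sample_size_for_proportion(0.05))                      # very large population
print(sample_size_for_proportion(0.05, population=200_000))  # 200,000 test-takers
```

For a population as large as 200,000, the correction barely changes the answer. The required sample size is driven mainly by the margin of error and the confidence level, not by the size of the population.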

The Bottom Line

Sampling can help us understand the nuances of large populations. It is a cost-effective way for researchers to study them while saving time. Because it can be difficult to study large groups, marketers, scientists, governments, and other researchers use smaller subsets—known as samples—to analyze and make important decisions.