Retired course
This course has been retired and is no longer supported.
Quick reference
Inferential Statistics
Inferential statistics relies on the statistical analysis of a subset or sample of an entire population of occurrences to draw conclusions about the entire population. A key to successful inferential statistics is the selection of the sample.
When to use
Inferential statistics are used when data from the entire population of occurrences or iterations of a product or process are not readily available. This may be because the product or process has been in use for a long time, or because access to the product or process is limited.
Instructions
Inferential statistics is a branch of statistical analysis that uses the statistics of a subset, or sample, of a data population to draw inferences about the statistical measures of the entire population. In many cases, the entire data population is not available for measurement. This is particularly true for products or processes that have been in use for a long time. The earlier iterations of the product or process either no longer exist or are outside the control of the product or process manager, and therefore cannot be measured as part of the population.
There are a few clear differences between descriptive statistics and inferential statistics. Descriptive statistics analyze a set of data to provide insight into the real-world business processes associated with that data. Inferential statistics analyze a sample set of data to provide insight into the larger data population from which the sample was drawn. Descriptive statistics are a mathematical summary of the existing data. Inferential statistics use the descriptive statistics from the sample data to infer population statistics, which are expected to fall within a certain range.
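As a rough illustration of inferring a range for a population statistic from a sample, the sketch below estimates an interval for the population mean. The data are simulated, the variable names are hypothetical, and the 95% level and the normal approximation are assumptions made for this example only; they are not taken from the lesson.

```python
import random
import statistics
from statistics import NormalDist

# Hypothetical sample: 200 simulated measurements (assumed values).
rng = random.Random(7)  # fixed seed so the sketch is reproducible
sample = [rng.gauss(20.0, 2.0) for _ in range(200)]

# Descriptive statistics of the sample.
n = len(sample)
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# Inferred range for the population mean, assuming a 95% level and a
# normal approximation (reasonable only for a fairly large sample).
z = NormalDist().inv_cdf(0.975)
margin = z * s / n ** 0.5
low, high = x_bar - margin, x_bar + margin

print(f"Sample mean: {x_bar:.2f}")
print(f"Inferred range for the population mean: {low:.2f} to {high:.2f}")
```

The sample mean is a descriptive statistic; the interval around it is the inferential step, a statement about the population that the sample cannot prove exactly.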
Calculating descriptive statistics for the sample data provides insight into the statistics that apply to the full population. The hypothesis test discussions use different symbols to distinguish sample statistics from population statistics.
| | Size (# of points) | Mean | Standard Deviation |
|---|---|---|---|
| Population | N | μ | σ |
| Sample | n | x̄ (x-bar) | s |
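The difference between population and sample measures can be sketched in Python using only the standard library. The population here is simulated, and its size, mean, and spread are assumptions chosen for illustration. Note that the population standard deviation divides by N, while the sample standard deviation divides by n - 1.

```python
import random
import statistics

# Hypothetical population: 10,000 simulated cycle times in minutes.
rng = random.Random(42)  # fixed seed so the sketch is reproducible
population = [rng.gauss(50.0, 5.0) for _ in range(10_000)]

# Population measures (N, mu, sigma) use every data point.
N = len(population)
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)  # divides by N

# Sample measures (n, x-bar, s) use a random subset only.
sample = rng.sample(population, 100)
n = len(sample)
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)  # divides by n - 1

print(f"Population: N={N}, mu={mu:.2f}, sigma={sigma:.2f}")
print(f"Sample:     n={n}, x-bar={x_bar:.2f}, s={s:.2f}")
```

In practice only the sample row can be computed; the population row is shown here because the simulated data make it available for comparison.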
Precise statistical values are available only for the actual data in the subset or sample. However, if that sample fairly represents the entire population, those values are good surrogates for the statistical measures of the entire population. It is therefore imperative that the sampling approach used to gather the sample data fairly represents the entire population. Several principles must be followed for this to occur. The sample must be:
- Representative: The sample accounts for changes in the process due to fluctuations in the process variables.
- Sufficient: The sample is large enough that any patterns in the data are likely to be present in the population.
- Contextual: Soft data is collected to indicate what else is happening in the process.
- Reliable: Data collection is repeatable and reproducible, and it does not influence the sampling.
- Random: Every member of the population has an equal chance of being selected.
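The effect of the "Random" and "Representative" principles can be sketched as follows. The process records are simulated, and the drift, noise level, and sample size are assumptions for illustration. A convenience sample taken from the earliest records is compared with a simple random sample in which every record has an equal chance of selection.

```python
import random
import statistics

# Hypothetical process history: 1,000 records whose value drifts
# upward over time, plus some random noise (assumed for illustration).
rng = random.Random(1)  # fixed seed so the sketch is reproducible
records = [100.0 + 0.01 * i + rng.gauss(0.0, 1.0) for i in range(1000)]
population_mean = statistics.fmean(records)

# Convenience sample: the first 50 records only. It misses the later
# process conditions, so it is not representative.
first_50 = records[:50]
first_50_mean = statistics.fmean(first_50)

# Simple random sample: every record has an equal chance of selection.
random_50 = rng.sample(records, 50)
random_50_mean = statistics.fmean(random_50)

print(f"Population mean:         {population_mean:.2f}")
print(f"First-50 sample mean:    {first_50_mean:.2f}")
print(f"Random-50 sample mean:   {random_50_mean:.2f}")
```

Because the simulated process drifts over time, the first 50 records describe only the early process, while the random sample draws from the whole history.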
Data collection is often relatively easy when a large body of data is already available from the population. In that case, existing data is used rather than collecting new data. However, the sampling approach must still reflect the principles described above. Another key question is the sample size, which will be addressed in a later lesson on confidence intervals. If new data is needed, a measurement systems analysis should be done to ensure the data can be trusted.
Hints & tips
- If all the data is available, use it. Don’t rely on inferential statistics.
- Carefully consider your sampling plan to ensure it is representative, sufficient, reliable, random, and has appropriate contextual information. Your existing data may not cover all of these. If so, it is best to collect additional data rather than ignoring one or two of the data sample characteristics.