Creating a Hypothesis
Concept of Hypothesis Testing Free
The scientific method of analysis is to create a hypothesis, develop experiments that generate applicable data, then analyze that data to support or refute the hypothesis. This approach allows us to confidently answer inquiry questions with data. This lesson explains the concepts of hypotheses in problem solving.
Video time: 04m 55s
Hypothesis Test Process
Effective hypothesis testing is a disciplined process. From writing the hypothesis, to designing the study or experiments, to analyzing the data, there are proven best practices that should be applied. This lesson presents and explains the hypothesis testing process as used in Lean Six Sigma.
Video time: 05m 48s
Hypothesis Tests Free
There are many different statistical tests that can be used with data to analyze a hypothesis. Which test to use depends upon the nature of the hypothesis and the test data. This lesson provides a roadmap for selecting an appropriate hypothesis test. The key decision factors for making the hypothesis test selection are addressed and illustrated.
Video time: 07m 42s
Statistical Analysis of Data
Hypothesis testing relies on the principle of inferential statistics. A sample data set from a larger population of data is statistically analyzed. The result of the analysis of the sample data is used to infer a conclusion about the larger population of data. This lesson will discuss the concept and ways to compare the data sample and the data population.
Video time: 06m 08s
Since it is often impossible to analyze all the data items in a population of data, a sample is selected from the data population. But there is a chance that the sample may not perfectly represent the full population. Based upon an understanding of the data sample and population, a range or interval can be established around any sample statistic that represents the boundaries within which the population statistic is expected to lie at a chosen level of confidence. In this lesson, we learn how to determine the size of that range or confidence interval.
Video time: 06m 13s
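As a rough illustration of the confidence interval idea, the sketch below builds an interval around a sample mean. It assumes a normal approximation (a z value rather than a t value, which is only reasonable for larger samples); the function name `mean_confidence_interval` and the data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return (lower, upper) bounds for the population mean.

    Assumption: normal approximation; small samples would normally use a t value.
    """
    n = len(sample)
    center = mean(sample)
    std_err = stdev(sample) / math.sqrt(n)          # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    margin = z * std_err
    return center - margin, center + margin


sample = [9.8, 10.2, 10.1, 9.9, 10.0, 10.3, 9.7, 10.0]  # hypothetical measurements
low, high = mean_confidence_interval(sample)
low99, high99 = mean_confidence_interval(sample, confidence=0.99)
print(f"95% interval: {low:.3f} to {high:.3f}")
```

Raising the confidence level widens the interval, which is why `high99 - low99` is larger than `high - low`.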
Samples and Sample Selection
Hypothesis testing relies on the use of data samples. However, the power and value of the hypothesis test are based on the size of the sample and the means by which it was selected. In this lesson, we consider factors for selecting sample data points and we determine the size of the sample needed based on the desired accuracy of the answer.
Video time: 05m 55s
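One common way to size a sample for estimating a mean is n = (z * sigma / E)^2, where E is the acceptable margin of error. The sketch below applies that formula; it assumes the population standard deviation `sigma` is known or estimated beforehand, and the function name and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n whose margin of error for the mean is at most `margin`."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * sigma / margin) ** 2)


n_needed = sample_size_for_mean(sigma=2.0, margin=0.5)
n_tighter = sample_size_for_mean(sigma=2.0, margin=0.25)
print(n_needed, n_tighter)
```

Halving the margin of error roughly quadruples the required sample size.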
One of the most important statistical measures of a data set is the mean or average value. For inferential statistics to be valid, the mean of the sample should be approximately the same as the mean of the entire population of data. The Standard Error is the measure of how accurately the sample mean will approximate the population mean. In this lesson, we will determine how to calculate the standard error and how the sampling process can affect that error.
Video time: 05m 02s
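The standard error of the mean is usually calculated as the sample standard deviation divided by the square root of the sample size. A minimal sketch, with a hypothetical function name and data:

```python
import math
from statistics import stdev


def standard_error(sample):
    """Standard error of the mean: s / sqrt(n)."""
    return stdev(sample) / math.sqrt(len(sample))


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
se = standard_error(sample)
print(f"standard error = {se:.4f}")
```

Because n sits under a square root, a sample four times as large only halves the standard error for the same spread.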
Alpha and Beta Risk
The statistic created in a hypothesis test is only 100% accurate for the data in the sample from which the statistic was calculated. The application of the statistical value to the broader data population has some uncertainty. It is possible that the full population of data is different from the sample of data that was tested. This uncertainty gives rise to the Alpha and Beta risks discussed in this lesson.
Video time: 05m 57s
Significance and Power
The ability of a hypothesis test to provide insight into the characteristics of a data population is based on the sample of data selected and some statistical characteristics of the sample and the population. The relationship between these gives rise to two measures that can be made concerning the validity of the hypothesis test. These measures are Significance and Power and will be discussed in this lesson.
Video time: 05m 04s
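To make the link between Alpha risk, Beta risk, and power concrete, the sketch below computes the power of a one-sided, one-sample z-test. This particular test is an assumption chosen for simplicity, and the function name `z_test_power` and the inputs are hypothetical.

```python
import math
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a one-sided z-test to detect a true mean shift of `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)    # rejection threshold set by alpha
    shift = effect * math.sqrt(n) / sigma     # true shift in standard-error units
    return 1 - std_normal.cdf(z_crit - shift)


power = z_test_power(effect=0.5, sigma=1.0, n=25)
beta = 1 - power                              # Beta risk: chance of missing the shift
no_effect = z_test_power(effect=0.0, sigma=1.0, n=25)
print(f"power = {power:.3f}, beta = {beta:.3f}")
```

When the true shift is zero, the chance of rejecting equals alpha, which is the Alpha risk itself.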
The P Value
Inferential statistics relies on a statistical measure known as the "P" value to decide whether to reject or fail to reject the null hypothesis. The P value is calculated from the sample data according to the type of test conducted, and it is compared against the Alpha risk chosen for the situation. This lesson explains the principle of the "P" value and its use in Lean Six Sigma projects.
Video time: 05m 25s
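As one example of how a P value is obtained and used, the sketch below runs a two-sided, one-sample z-test and compares the result with an Alpha risk of 0.05. The choice of a z-test with a known standard deviation is an assumption, and the function name and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def z_test_p_value(sample_mean, null_mean, sigma, n):
    """Return (z, p) for a two-sided one-sample z-test with known sigma."""
    z = (sample_mean - null_mean) / (sigma / math.sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


alpha = 0.05
z, p = z_test_p_value(sample_mean=10.4, null_mean=10.0, sigma=1.0, n=25)
reject_null = p < alpha
print(f"z = {z:.2f}, p = {p:.4f}, reject null: {reject_null}")
```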
One of the most important criteria for selecting a hypothesis test is whether the data being analyzed is normal or not normal. The normality question does not prove or disprove the hypothesis; rather, it determines the type of statistical test that should be performed. This lesson reviews the concept of normality and how to determine it.
Video time: 06m 34s
Distributions and Discrete Data
Data is often displayed visually in distributions. Recognizing the type of display and the nature of the distribution can aid in the selection and analysis of a hypothesis test. This lesson will explain data distributions and review the typical distributions for discrete data.
Video time: 06m 29s
Continuous Data Distributions
There are many types of continuous data distributions. These are often associated with physical characteristics of the data or system being studied. The ability to recognize the type of distribution aids in the selection and analysis of a hypothesis test.
Video time: 04m 54s
A common investigation in Lean Six Sigma problem solving is to determine if two factors are correlated. This insight will often point to an underlying cause of the problem. This lesson explains how to do correlation analysis using both Excel and Minitab. It also includes a discussion of the Pearson correlation coefficient.
Video time: 06m 19s
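The lesson works in Excel and Minitab; for readers who want to see the arithmetic, here is a minimal sketch of the Pearson correlation coefficient in plain Python. The function name and data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [2, 4, 6, 8, 10])    # perfectly increasing relationship
r_down = pearson_r(xs, [10, 8, 6, 4, 2])  # perfectly decreasing relationship
print(r_up, r_down)
```

Values near +1 or -1 indicate a strong linear relationship; values near 0 indicate little linear relationship.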
Regression models are formulas that allow us to predict the performance of the system being analyzed. As a hypothesis test, we can determine whether the regression formula is able to predict the performance of the sample data set. This lesson defines the different types of regression analyses that will be discussed in later lessons and how to choose between the regression approaches.
Video time: 04m 30s
Simple Linear Regression
Simple linear regression analysis creates an equation that correlates two factors. This equation assists in understanding problems, and it can also be used to manage the problem or process going forward. This lesson shows how to calculate this line with the help of either Excel or Minitab.
Video time: 06m 24s
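Outside Excel or Minitab, the fitted line can be computed with the ordinary least squares formulas. The sketch below is one way to do that; the function name `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]  # hypothetical process measurements
slope, intercept = fit_line(xs, ys)
predicted_at_6 = intercept + slope * 6
print(f"y = {intercept:.2f} + {slope:.2f} * x")
```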
A statistical analysis or test creates a mathematical model to fit the data in the sample. The real-world data seldom precisely fits the model. The differences between the model and the actual data are known as residuals. This lesson explains how to read residual graphs and analyses.
Video time: 06m 58s
A regression formula can be used to predict how a system responds to various inputs. However, based on the nature of the data set, there will be some uncertainty to the accuracy of that prediction. This lesson shows how to determine that level of uncertainty in the regression model prediction.
Video time: 04m 34s
Multiple Linear Regression Free
Often, multiple factors influence the response variable in a problem. Multiple regression determines the relationship between the response factor and multiple control factors. As with simple linear regression, a formula is created that allows both analysis and prediction of the process and problem.
Video time: 05m 52s
Many of the problems encountered in Lean Six Sigma projects do not have the straight-line correlation effect that we discussed with simple linear regression or multiple regression analysis. The relationship is better modeled by an exponential curve, a parabola, or other non-linear relationship. This lesson will use Minitab to assist in determining the best model to predict performance.
Video time: 06m 05s
Box-Cox and Transformations
Data transformation can convert a Lean Six Sigma problem from a non-linear regression analysis into a linear regression, which is often easier to understand and explain to the stakeholders. The most common transformation approach is the Box-Cox transformation. In this lesson, we demonstrate how this transformation works and discuss when to use it.
Video time: 04m 34s
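The standard Box-Cox transformation maps each positive value y to (y^lambda - 1) / lambda, or to ln(y) when lambda is 0. The sketch below applies that formula for a given lambda; it does not search for the best lambda, and the function name and data are hypothetical.

```python
import math


def box_cox(values, lam):
    """Apply the Box-Cox transformation for a fixed lambda (data must be positive)."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive data")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]


data = [1.0, 4.0, 9.0]           # hypothetical positive measurements
sqrt_like = box_cox(data, 0.5)   # lambda = 0.5 behaves like a square root
log_like = box_cox(data, 0)      # lambda = 0 is the natural log
print(sqrt_like, log_like)
```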
Test of Proportions
The One-Sample and Two-Sample Test of Proportions are used with discrete data. These tests determine whether the percentage of a particular attribute being studied is similar to or different from the selected target value. These tests are illustrated using both Excel and Minitab.
Video time: 07m 39s
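A one-sample test of proportions can be sketched with the normal approximation to the binomial, which is an assumption that holds best for reasonably large samples. The function name and counts below are hypothetical.

```python
import math
from statistics import NormalDist


def one_sample_proportion_test(successes, n, target):
    """Return (z, p) for a two-sided test of an observed proportion against `target`."""
    p_hat = successes / n
    std_err = math.sqrt(target * (1 - target) / n)  # standard error under the null
    z = (p_hat - target) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = one_sample_proportion_test(successes=60, n=100, target=0.5)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```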
When the data is discrete, but there are more than two samples to be compared, the Chi-Square test is used. This test is quickly accomplished using Minitab. It can be done using Excel, but requires several intermediate steps. This lesson explains this test approach.
Video time: 07m 26s
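The intermediate steps amount to computing expected counts from the row and column totals and summing (observed - expected)^2 / expected. A minimal sketch follows, with a hypothetical function name and a hypothetical table of pass/fail counts for three samples.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


observed = [[20, 30], [30, 20], [25, 25]]  # rows: samples, columns: pass / fail
statistic, dof = chi_square_statistic(observed)
# 5.991 is the chi-square critical value for 2 degrees of freedom at alpha = 0.05
significant = statistic > 5.991
print(statistic, dof, significant)
```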
The Analysis of Variance (ANOVA) test is a commonly used test in Lean Six Sigma projects. It allows the comparison of multiple data sets to determine whether there is a statistical difference in those data sets. The analysis can be easily done in both Excel and Minitab. This lesson addresses the basics of ANOVA.
Video time: 05m 19s
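The core of a one-way ANOVA is the F statistic, the ratio of between-group variation to within-group variation. The sketch below computes it by hand; the function name and the three small data sets are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # hypothetical data sets
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F = {f_stat:.2f} with ({df_between}, {df_within}) degrees of freedom")
```

A large F value suggests that at least one group mean differs from the others; the P value comes from the F distribution with these degrees of freedom.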
The One-Sample Sign Test and the One-Sample Wilcoxon Test accomplish the same purpose, but each has strengths and weaknesses. When the data is not normal, or it is not known whether it is normal, these tests can be used to determine if a data set statistic meets or exceeds a target value. The application of these tests using Minitab is illustrated in this lesson.
Video time: 06m 10s
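The sign test only counts how many values fall above and below the target, then asks how unusual that split would be if each side were equally likely. The sketch below computes an exact two-sided P value from the binomial distribution; the function name and data are hypothetical, and values equal to the target are dropped.

```python
from math import comb


def sign_test_p_value(sample, target_median):
    """Exact two-sided sign test P value against a target median."""
    above = sum(1 for x in sample if x > target_median)
    below = sum(1 for x in sample if x < target_median)
    n = above + below                  # ties with the target are ignored
    k = min(above, below)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


sample = [12, 14, 15, 11, 16, 13, 17, 18, 19, 9]  # hypothetical measurements
p_value = sign_test_p_value(sample, target_median=10)
print(f"p = {p_value:.4f}")
```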
Mood’s Median, Kruskal-Wallis, Friedman Tests
These three tests are for multiple samples of non-normal data. Each test has its strengths and weaknesses. The appropriate test will depend upon what is known, or not known, about the data in the samples. The Minitab interface to accomplish each of these tests is similar. This lesson will explain the differences and show how to conduct the test and read the results.
Video time: 07m 31s