About this lesson
Chi Square Test
The Chi Square test is a hypothesis test that considers categories and counts of discrete data items to determine whether the categories are independent.
When to use
Use the Chi Square test when the data being analyzed is discrete and consists of counts in different categories. In that case it is the appropriate hypothesis test for determining whether the categories are independent.
The Chi Square test is a commonly used test to determine the independence of categories within a data set. This test can be used to separate statistically significant dependent relationships from those that are independent. Chi Square can be used when working across multiple sample sets of data.
The data is normally organized in a table of counts. The columns are the count categories and the rows represent the different samples as shown in the table below.
- Excel has a function for conducting a Chi Square test.
- The data is first recorded in a table in the format shown above – this is the “Actuals” table.
- An “Expected” table is created by calculating, for each cell in the matrix, the Row Percentage times the Column Percentage times the grand total of the counts. This is the same as the row total times the column total divided by the grand total.
- The total for each row and each column should be the same in both matrices, although the Actual matrix will have whole number counts in each cell and the Expected matrix will have a calculated value that is normally not a whole number in each cell.
- Use the CHISQ.TEST function and provide the range for each matrix.
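- Excel returns a P value for the test of independence. A small P value (0.05 is a commonly used threshold) indicates that the categories are unlikely to be independent.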
- Minitab is able to calculate a Chi Square test.
- Stat > Tables > Chi Square Test for Association
- Select the data columns. (You do not need to create the “Expected” table; Minitab will do that automatically.)
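The Excel steps above can also be sketched in a few lines of Python, which may help show what the “Expected” table and the P value actually are. This is only an illustration: the counts in `actual` are made up, the helper names `expected_table` and `chi_square_sf` are hypothetical, and the P value is computed with a simple closed-form recurrence that assumes whole-number degrees of freedom.

```python
import math

# Hypothetical "Actuals" table: rows are samples, columns are count categories.
actual = [
    [20, 30, 50],
    [30, 30, 40],
]

def expected_table(table):
    """Build the "Expected" table from the row and column totals."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Row % x Column % x grand total simplifies to
    # row total x column total / grand total.
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_sf(x, dof):
    """Upper-tail probability of a chi-square value for a whole-number dof."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Starting point: exp(-x/2) for even dof, erfc(sqrt(x/2)) for odd dof.
    if dof % 2 == 0:
        a, q = 1.0, math.exp(-half)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))
    # Add one term per step until a reaches dof / 2.
    while a < dof / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q

expected = expected_table(actual)

# Sum of (Actual - Expected)^2 / Expected over every cell.
statistic = sum(
    (a - e) ** 2 / e
    for actual_row, expected_row in zip(actual, expected)
    for a, e in zip(actual_row, expected_row)
)

# Degrees of freedom: (rows - 1) x (columns - 1).
dof = (len(actual) - 1) * (len(actual[0]) - 1)
p_value = chi_square_sf(statistic, dof)

print("Expected:", expected)
print("Chi Square statistic:", round(statistic, 4))
print("Degrees of freedom:", dof)
print("P value:", round(p_value, 4))
```

Note that the row and column totals of `expected` match those of `actual`, which is the same check recommended for the Excel tables. The P value printed here should correspond to what CHISQ.TEST reports when given the same two ranges.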
Hints & tips
- If doing the analysis in Excel, be sure the totals for columns and rows are the same in both the Actual and Expected tables.
- Chi Square will tell you whether there is evidence that at least some of the factors are dependent. However, if some are dependent and some are independent, it will not separate out which is which. You will need to test that by reducing some of the columns in your table and repeating the test.
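One way to reduce the columns, sketched below in Python, is to repeat the test on every pair of columns. This is an illustration under stated assumptions rather than a prescribed procedure: the counts are made up, the helper name `chi_square_statistic` is hypothetical, and the P value shortcut used here is only valid because the example table has exactly two rows, so each reduced table has one degree of freedom.

```python
import math
from itertools import combinations

# Hypothetical "Actuals" table: two samples (rows), three categories (columns).
actual = [
    [20, 30, 50],
    [30, 30, 40],
]

def chi_square_statistic(table):
    """Sum of (Actual - Expected)^2 / Expected over every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for row, r in zip(table, row_totals):
        for count, c in zip(row, col_totals):
            e = r * c / grand_total  # expected count for this cell
            stat += (count - e) ** 2 / e
    return stat

pairwise = {}
for i, j in combinations(range(len(actual[0])), 2):
    # Keep only columns i and j from each row.
    reduced = [[row[i], row[j]] for row in actual]
    stat = chi_square_statistic(reduced)
    # Assumption: 2 rows x 2 columns, so 1 degree of freedom,
    # where the upper-tail probability is erfc(sqrt(stat / 2)).
    p = math.erfc(math.sqrt(stat / 2.0))
    pairwise[(i, j)] = (stat, p)
    print(f"Columns {i} and {j}: statistic={stat:.4f}, P value={p:.4f}")
```

Pairs with a small P value are the ones most likely driving a dependent result in the full table. Because several tests are run on the same data, the chance of a false positive goes up, so the pairwise P values are best treated as a guide.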