The Chi-Square Test is a statistical technique used to assess whether there is a meaningful relationship between categorical variables. It compares observed data with expected data based on a hypothesis to check for independence or goodness of fit. Widely used in fields like business, science, and social research, the test helps analyze relationships in contingency tables. The calculated Chi-Square value is compared with a critical value to accept or reject the null hypothesis.
Chi-Square Test Definition:
The Chi-Square Test is a statistical method to determine whether there is a significant relationship between two categorical variables by comparing observed and expected frequencies.
Where:
The chi-square test helps answer questions such as:
It is mainly of two types:
Step-by-Step Process:
Example: A survey records the choice of drink by 100 people:
Solution:
Step 1: Calculate Expected Frequencies:
Similarly:
Step 2: Compute :
Step 3: Interpretation:
At 1 degree of freedom (df = (2-1)(2-1) =1) and significance level 0.05, the critical value is 3.841.
Since , we reject H_0.
Conclusion: There is a significant association between gender and drink choice.
An advanced method that incorporates prior beliefs into the chi-square analysis. It is useful when sample sizes are small or data is uncertain. Bayesian methods provide a probabilistic interpretation of the relationship between variables.
In business, A/B testing uses chi-square to compare two marketing strategies. Example:
Using the chi-square test, businesses can analyze if the difference in performance between Strategy A and B is statistically significant.
Example 1: In a survey, 200 people were asked about their preference for three different products: A, B, and C. The results are shown below:
Assuming all products are equally popular, test the hypothesis at a 5% significance level.
Solution:
Step 1 – Null Hypothesis H_0:
All products are equally preferred. So, expected frequency E for each = \frac{200}{3} \approx 66.67.
Step 2 – Calculate :
Step 3 – Degrees of Freedom (df):
df = n - 1 = 3 - 1 = 2
From the chi-square table, the critical value at 5% significance level and df = 2 is 5.991.
Since 7.01 > 5.991, we reject H0H_0.
Conclusion: The product preferences are not equally distributed.
Example 2: A company studied the relationship between customer gender and their product choice. The data is:
Test whether product choice is independent of gender at 5% significance level.
Solution:
Step 1 – Compute Expected Frequencies:
Step 2 – Calculate :
Step 3 – Degrees of Freedom (df):
df = (rows -1)(columns -1) = (2 -1)(2 -1) = 1
Critical chi-square value at 5% significance level and df =1 is 3.841.
Since 2.197 < 3.841, we accept H_0.
Conclusion: Product choice is independent of gender.
Example 3: A manufacturer claims that the proportion of defective items in a batch is 5%. From a sample of 200 items, 15 defective items were found. Test the manufacturer’s claim at a 5% significance level.
Solution:
Step 1 – Null Hypothesis :
The proportion of defective items is 5%.
Expected defective items:
Expected non-defective items:
E' = 200 - 10 = 190
Observed (O):
Step 2 – Calculate :
Step 3 – Degrees of Freedom (df):
df = number of categories – 1 = 2 – 1 = 1
Critical value at 5% significance level and df = 1 is 3.841.
Since 2.6316<3.8412.6316 < 3.841, we accept .
Conclusion: The manufacturer’s claim holds.
Example 4: A study records the following data regarding study method and exam success:
Is there an association between study method and exam success at a 1% significance level?
Solution:
Step 1 – Expected Frequencies:
Step 2 – Compute :
Step 3 – Degrees of Freedom (df):
df = (2 -1)(2 -1) =1
Critical chi-square value at 1% significance level and df =1 is 6.635.
Since 19.78 > 6.635, we reject .
Conclusion: Study method is significantly associated with exam success.
Example 5: A geneticist expects that a particular plant produces flowers in the ratio of Red:White:Pink as 9:3:4. In an experiment, out of 160 plants, the observed counts were Red – 90, White – 30, Pink – 40. Test the hypothesis at 5% significance level.
Solution:
Step 1 – Expected Frequencies:
Total parts = 9 + 3 + 4 = 16
Expected Red:
Expected White:
Expected Pink:
Step 2 – Compute :
Step 3 – Degrees of Freedom (df):
df = 3 -1 = 2
Critical value at 5% significance level and df = 2 is 5.991.
Since we accept .
Conclusion: The observed data perfectly matches the expected ratio.
Test whether car brand preference is independent of gender at 5% significance level.
Also Read:
(Session 2026 - 27)