How to Obtain ANOVA Table with Statsmodels

Last Updated : 23 Jul, 2025

Analysis of Variance (ANOVA) is a statistical method used to analyze the differences among group means in a sample. It is particularly useful for comparing three or more groups for statistical significance. In Python, the statsmodels library provides robust tools for performing ANOVA. This article will guide you through obtaining an ANOVA table using statsmodels, covering both one-way and two-way ANOVA, as well as repeated measures ANOVA.

Understanding ANOVA

ANOVA is a statistical method for testing whether the means of two or more independent groups differ significantly. With exactly two groups it is equivalent to the equal-variance two-sample t-test, so in practice it is mostly used for three or more groups. It is widely used in medicine, the social sciences, and engineering. ANOVA can be one-way, two-way, or multi-way, depending on the number of factors being analyzed. The key components of an ANOVA table, which appear as columns in the statsmodels output, are the sum of squares, the degrees of freedom, the F-statistic, and the p-value.
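To make these components concrete, here is a minimal sketch that computes them by hand for a one-way layout, using only plain Python. The three groups and their values are made up for illustration and do not come from any dataset used in this article.

```python
# Toy data (made up for illustration): three groups of three observations.
groups = {
    "a": [1.0, 2.0, 3.0],
    "b": [2.0, 3.0, 4.0],
    "c": [5.0, 6.0, 7.0],
}

all_values = [v for values in groups.values() for v in values]
grand_mean = sum(all_values) / len(all_values)

# Between-group sum of squares: spread of the group means around the grand mean.
ss_between = sum(
    len(values) * (sum(values) / len(values) - grand_mean) ** 2
    for values in groups.values()
)

# Within-group (residual) sum of squares: spread of observations around their group mean.
ss_within = sum(
    (v - sum(values) / len(values)) ** 2
    for values in groups.values()
    for v in values
)

df_between = len(groups) - 1               # k - 1
df_within = len(all_values) - len(groups)  # N - k

# F is the ratio of the between-group mean square to the within-group mean square.
f_stat = (ss_between / df_between) / (ss_within / df_within)
print(ss_between, ss_within, df_between, df_within, f_stat)
```

For this toy data the F-statistic comes out at 13 (up to floating-point rounding), with 2 and 6 degrees of freedom.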

Performing ANOVA with Statsmodels

**1. One-Way ANOVA**

One-way ANOVA is used when you have one categorical independent variable (a factor) and one continuous dependent variable. The steps below run a one-way ANOVA with statsmodels on the `PlantGrowth` dataset:

**1. Import Libraries and Data:**

```python
import statsmodels.api as sm
from statsmodels.formula.api import ols
import pandas as pd

# Load the PlantGrowth dataset (fetched over the network)
data = sm.datasets.get_rdataset("PlantGrowth").data
```

**2. Fit the Model and Obtain the ANOVA Table:**

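```python
# C(group) tells the formula interface to treat `group` as categorical
model = ols('weight ~ C(group)', data=data).fit()

# typ=2 requests Type II sums of squares
anova_table = sm.stats.anova_lm(model, typ=2)
print(anova_table)
```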

Output:

```
            sum_sq    df         F   PR(>F)
C(group)   3.76634   2.0  4.846088  0.01591
Residual  10.49209  27.0       NaN      NaN
```

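**2. Two-Way ANOVA**

Two-way ANOVA is used when you have two categorical independent variables (factors). In addition to the main effect of each factor, it tests whether the two factors interact, that is, whether the effect of one factor on the dependent variable depends on the level of the other. The steps below run a two-way ANOVA with statsmodels on the `Moore` dataset from the `carData` package:

**1. Import Libraries and Data:**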

```python
import statsmodels.api as sm
from statsmodels.formula.api import ols
import pandas as pd

# Example dataset
data = sm.datasets.get_rdataset("Moore", "carData").data

# Rename the column: a dot in a name would be parsed as attribute access in a formula
data = data.rename(columns={"partner.status": "partner_status"})
```

**2. Fit the Model and Obtain the ANOVA Table:**

```python
# `*` expands to both main effects plus their interaction
model = ols('conformity ~ C(fcategory) * C(partner_status)', data=data).fit()
anova_table = sm.stats.anova_lm(model, typ=2)
print(anova_table)
```

Output:

```
                                    sum_sq    df          F    PR(>F)
C(fcategory)                     11.614700   2.0   0.276958  0.759564
C(partner_status)               212.213778   1.0  10.120692  0.002874
C(fcategory):C(partner_status)  175.488928   2.0   4.184623  0.022572
Residual                        817.763961  39.0        NaN       NaN
```

**3. Repeated Measures ANOVA**

Repeated measures ANOVA is used when the same subjects are measured under every treatment condition, so the observations are not independent across conditions. In statsmodels this design is handled by the `AnovaRM` class.

For this example, let's assume we have a dataset where we measured the reaction time of subjects under different conditions.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Long-format data: one row per subject and condition
data = {
    'subject': ['S1', 'S1', 'S1', 'S2', 'S2', 'S2',
                'S3', 'S3', 'S3', 'S4', 'S4', 'S4'],
    'condition': ['A', 'B', 'C', 'A', 'B', 'C',
                  'A', 'B', 'C', 'A', 'B', 'C'],
    'reaction_time': [10, 12, 14, 8, 9, 11, 14, 15, 16, 7, 9, 10]
}

df = pd.DataFrame(data)

# Perform Repeated Measures ANOVA
aovrm = AnovaRM(df, 'reaction_time', 'subject', within=['condition'])
res = aovrm.fit()
print(res.summary())
```

Output:

```
                Anova
======================================
          F Value Num DF Den DF Pr > F
--------------------------------------
condition 40.5000 2.0000 6.0000 0.0003
======================================
```
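The printed summary is convenient for reading, but for further processing the numbers are easier to take from the result object. The sketch below rebuilds the same reaction-time data and reads the F-statistic and p-value from the `anova_table` attribute of the fitted result, which is a pandas DataFrame.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Same reaction-time data as in the example above, in long format.
df = pd.DataFrame({
    "subject": [s for s in ["S1", "S2", "S3", "S4"] for _ in range(3)],
    "condition": ["A", "B", "C"] * 4,
    "reaction_time": [10, 12, 14, 8, 9, 11, 14, 15, 16, 7, 9, 10],
})

res = AnovaRM(df, "reaction_time", "subject", within=["condition"]).fit()

# anova_table is a DataFrame indexed by the within-subject factor(s).
rm_table = res.anova_table
f_value = rm_table.loc["condition", "F Value"]
p_value = rm_table.loc["condition", "Pr > F"]
print(f"F = {f_value:.4f}, p = {p_value:.4f}")
```

Note that `AnovaRM` expects balanced data, with one observation per subject and condition; if a subject has several rows per condition, the `aggregate_func` argument can be used to collapse them first.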

Interpreting the ANOVA Table

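The ANOVA table provides several key statistics:

- **sum_sq**: the sum of squares, i.e. the variation in the dependent variable attributed to each term. The `Residual` row holds the variation the model does not explain.
- **df**: the degrees of freedom associated with each term.
- **F**: the F-statistic, the ratio of the term's mean square (`sum_sq / df`) to the residual mean square.
- **PR(>F)**: the p-value of the F-test. A value below the chosen significance level (commonly 0.05) suggests that the term has a statistically significant effect.

The `Residual` row shows `NaN` for `F` and `PR(>F)` because no test is performed on the residual itself. In the two-way example above, `C(partner_status)` (p ≈ 0.003) and the interaction term (p ≈ 0.023) are significant at the 0.05 level, while `C(fcategory)` (p ≈ 0.76) is not.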
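As a rough sketch of how such a table can be read programmatically, the example below rebuilds the two-way output shown earlier as a DataFrame (values copied from that output) and lists the terms whose p-value falls below a threshold. The helper `significant_terms` is hypothetical, written for this illustration, and is not part of statsmodels.

```python
import numpy as np
import pandas as pd

# Values copied from the two-way ANOVA output shown earlier in this article.
anova_table = pd.DataFrame(
    {
        "sum_sq": [11.614700, 212.213778, 175.488928, 817.763961],
        "df": [2.0, 1.0, 2.0, 39.0],
        "F": [0.276958, 10.120692, 4.184623, np.nan],
        "PR(>F)": [0.759564, 0.002874, 0.022572, np.nan],
    },
    index=[
        "C(fcategory)",
        "C(partner_status)",
        "C(fcategory):C(partner_status)",
        "Residual",
    ],
)


def significant_terms(table, alpha=0.05):
    """Return the terms whose p-value is below alpha (hypothetical helper)."""
    p_values = table["PR(>F)"].dropna()  # drops the Residual row
    return list(p_values[p_values < alpha].index)


# Mean squares are not printed in this table but follow from sum_sq / df.
mean_sq = anova_table["sum_sq"] / anova_table["df"]
f_check = mean_sq["C(partner_status)"] / mean_sq["Residual"]

print(significant_terms(anova_table))
print(round(f_check, 3))
```

The recomputed ratio agrees with the `F` column for `C(partner_status)`, which illustrates how the F-statistic is derived from the sums of squares and degrees of freedom.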

Customizing the ANOVA Table

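You can customize the ANOVA table by requesting a different type of sums of squares. In statsmodels, the `typ` argument of `anova_lm` selects the type: `typ=1`, `typ=2`, or `typ=3` for Type I, Type II, or Type III. For example, to get a Type III table for the two-way model fitted above:

```python
# `model` is the two-way model fitted in the Two-Way ANOVA section
table = sm.stats.anova_lm(model, typ=3)  # Type III ANOVA
print(table)
```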

**Conclusion**

Obtaining an ANOVA table in statsmodels is a straightforward process. By following the steps outlined above, you can perform one-way, two-way, and repeated measures ANOVA, and interpret the results to understand the significance of your factors. This powerful method allows you to analyze complex datasets and draw meaningful conclusions about the relationships between variables.