Introduction to Fixed and Random Effects Models


In statistical modeling, understanding the distinction between fixed effects and random effects is crucial for properly analyzing data with hierarchical or grouped structures. These two approaches handle group-level variation in fundamentally different ways.

Fixed and Random Effects Models

Fixed effects models treat group-specific intercepts (or slopes) as parameters to be estimated. Each group gets its own fixed parameter, and we make inferences only about the specific groups observed in our data. Random effects models treat group-specific effects as random variables drawn from a probability distribution. Instead of estimating individual group parameters, we estimate the parameters of the distribution from which group effects are drawn. The following example illustrates the difference between these two approaches.

Example: School Aptitude Test Scores

Consider analyzing standardized aptitude test scores across multiple schools to understand both overall performance and school-specific effects. The response variable is the aptitude test score for each student, and the grouping variable is the school. We use \(y_{ij}\) to denote the aptitude test score for student \(j \in \{ 1, 2, ..., n_i \}\) in school \(i \in \{1, 2, ..., I\}\). We use \(\mu\) to denote the overall mean aptitude test score across all students, and \(\beta_i\) to denote the effect of school \(i\) on the test score.

Fixed Effects Approach

In the fixed effects model, each \(\beta_i\) is treated as a fixed parameter to be estimated:

\[\begin{equation} \label{eq:linear_model} y_{ij} = \mu + \beta_i + \epsilon_{ij}, \end{equation}\]

where \(\beta_i\) is the fixed school effect with identifiability constraint \(\sum_{i=1}^I \beta_i = 0\), and \(\epsilon_{ij} \sim \mathcal{N}(0, \sigma^2)\) is the individual-level error.

We estimate \(I\) separate school parameters \(\beta_i\) using the method of least squares. Each school has its own fixed effect \(\beta_i\) representing how much that specific school’s average differs from the overall mean. We can only make inferences about the specific schools in our sample. In other words, we have \(\begin{equation} y_{ij} \sim \mathcal{N}(\mu + \beta_i, \sigma^2). \end{equation}\)
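As a concrete illustration, here is a minimal sketch in Python (using simulated data with hypothetical parameter values). With a balanced design and the sum-to-zero constraint, the least squares estimates reduce to the grand mean and the school-mean deviations.

```python
import numpy as np

rng = np.random.default_rng(0)
I, n = 6, 30                               # number of schools, students per school
mu, sigma_beta, sigma = 500.0, 20.0, 40.0  # hypothetical true values

# Simulate scores: one row per school, one column per student
beta = rng.normal(0.0, sigma_beta, size=I)
y = mu + beta[:, None] + rng.normal(0.0, sigma, size=(I, n))

# Least squares under the sum-to-zero constraint (balanced design):
# mu_hat is the grand mean, beta_i_hat is the school mean minus the grand mean.
mu_hat = y.mean()
beta_hat = y.mean(axis=1) - mu_hat

print("estimated overall mean:", mu_hat)
print("estimated school effects:", beta_hat)
print("sum of school effects (approximately 0):", beta_hat.sum())
```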

Random Effects Approach

In the random effects model, school effects are treated as random variables, allowing the true effects to vary across schools. The model has the same form as the fixed effects model \eqref{eq:linear_model}. However, instead of treating \(\beta_i\) as a fixed parameter, we model it as a random deviation from the overall mean, i.e., \(\beta_i \sim \mathcal{N}(0, \sigma_{\beta}^2)\). Here \(\sigma_{\beta}^2\) is the between-group variance, capturing the variability of school effects. We assume that the random effects are independent of the individual-level errors. In other words, conditionally and marginally we have \(\begin{align} y_{ij}|\beta_i &\sim \mathcal{N}(\mu + \beta_i, \sigma^2), \\ y_{ij} &\sim \mathcal{N}(\mu, \sigma_{\beta}^2 + \sigma^2). \end{align}\)

When estimating the random effects model, we are interested in the distribution of the random effects \(\beta_i\). Specifically, we want to estimate \(\mu\), \(\sigma^2\), and \(\sigma_{\beta}^2\). This is typically done using Bayesian methods or restricted maximum likelihood (REML) estimation.
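In Python, for example, such a random intercept model can be fit with the MixedLM class in statsmodels. The sketch below uses simulated data, and the exact accessor names for the estimated variance components should be treated as an assumption about the current statsmodels API.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
I, n = 20, 25
mu, sigma_beta, sigma = 500.0, 20.0, 40.0  # hypothetical true values

# Simulate a long-format data frame: one row per student
school = np.repeat(np.arange(I), n)
beta = rng.normal(0.0, sigma_beta, size=I)
score = mu + beta[school] + rng.normal(0.0, sigma, size=I * n)
data = pd.DataFrame({"school": school, "score": score})

# Random intercept model score_ij = mu + beta_i + eps_ij, fit by REML
model = smf.mixedlm("score ~ 1", data, groups=data["school"])
result = model.fit(reml=True)

print(result.summary())
print("estimated sigma_beta^2:", float(result.cov_re.iloc[0, 0]))  # between-school variance
print("estimated sigma^2:", result.scale)                          # residual variance
```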

Proposition Consider the random effects model in the above example. For any two subjects \(j\) and \(j^\prime\) (\(j \neq j^\prime\)) in the same group \(i\), we have \(\begin{equation} \text{Corr}(y_{ij}, y_{ij^\prime}) = \frac{\sigma_{\beta}^2}{\sigma_{\beta}^2 + \sigma^2}. \end{equation}\)

Proof.

The covariance between \(y_{ij}\) and \(y_{ij^\prime}\) is given by

\[\begin{align*} \text{Cov}(y_{ij}, y_{ij^\prime}) &= \text{Cov}(\mu + \beta_i + \epsilon_{ij}, \mu + \beta_i + \epsilon_{ij^\prime}) \\ &= \text{Var}(\beta_i) + \text{Cov}(\epsilon_{ij}, \epsilon_{ij^\prime}) + \text{Cov}(\beta_i, \epsilon_{ij^\prime}) + \text{Cov}(\epsilon_{ij}, \beta_i) \\ &= \sigma_{\beta}^2 + 0 + 0 + 0 \\ &= \sigma_{\beta}^2. \end{align*}\]

The variances are given by

\[\begin{align*} \text{Var}(y_{ij}) &= \text{Var}(\mu + \beta_i + \epsilon_{ij}) \\ &= \text{Var}(\beta_i) + \text{Var}(\epsilon_{ij}) + 2 \text{Cov}(\beta_i, \epsilon_{ij}) \\ &= \sigma_{\beta}^2 + \sigma^2 + 0 \\ &= \sigma_{\beta}^2 + \sigma^2. \end{align*}\]

Thus, the correlation is

\[\begin{align*} \text{Corr}(y_{ij}, y_{ij^\prime}) &= \frac{\text{Cov}(y_{ij}, y_{ij^\prime})}{\sqrt{\text{Var}(y_{ij}) \text{Var}(y_{ij^\prime})}} \\ &= \frac{\sigma_{\beta}^2}{\sqrt{(\sigma_{\beta}^2 + \sigma^2)(\sigma_{\beta}^2 + \sigma^2)}} \\ &= \frac{\sigma_{\beta}^2}{\sigma_{\beta}^2 + \sigma^2}. \end{align*}\]

The proof is complete. \(\square\)

Remark We usually call this quantity the intraclass correlation coefficient (ICC). The ICC reflects both the variability of the random effects and the residual error. When the between-group variance is much larger than the individual-level variance, i.e., \(\sigma_{\beta} \gg \sigma\), the ICC approaches 1, indicating that outcomes within the same group are highly correlated. This is a fundamental difference from the fixed effects model, where all outcomes are assumed to be independent.
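The proposition is easy to check by simulation; the sketch below uses arbitrary variance values, simulates many groups of two students each, and compares the empirical correlation with \(\sigma_{\beta}^2 / (\sigma_{\beta}^2 + \sigma^2)\).

```python
import numpy as np

rng = np.random.default_rng(2)
n_groups = 200_000
sigma_beta, sigma = 15.0, 30.0  # hypothetical variance components

# Two students per group sharing the same random school effect
# (the overall mean mu cancels in the correlation, so it is omitted)
beta = rng.normal(0.0, sigma_beta, size=n_groups)
y1 = beta + rng.normal(0.0, sigma, size=n_groups)
y2 = beta + rng.normal(0.0, sigma, size=n_groups)

empirical_icc = np.corrcoef(y1, y2)[0, 1]
theoretical_icc = sigma_beta**2 / (sigma_beta**2 + sigma**2)
print(f"empirical ICC:   {empirical_icc:.4f}")
print(f"theoretical ICC: {theoretical_icc:.4f}")
```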

ANOVA Estimation (Method of Moments)

A classical approach to estimating variance components in random effects models is through analysis of variance (ANOVA) using the method of moments. This method equates sample moments to their theoretical expectations.

Consider the balanced case where each school has the same number of students, \(n_i = n\) for all \(i\). The ANOVA decomposition partitions the total sum of squares into between-group and within-group components:

\[\text{SST} = \text{SSB} + \text{SSW},\]

where:

  • \(\text{SST} = \sum_{i=1}^I \sum_{j=1}^n (y_{ij} - \bar{y}_{..})^2\) (total sum of squares)
  • \(\text{SSB} = n \sum_{i=1}^I (\bar{y}_{i.} - \bar{y}_{..})^2\) (between-group sum of squares)
  • \(\text{SSW} = \sum_{i=1}^I \sum_{j=1}^n (y_{ij} - \bar{y}_{i.})^2\) (within-group sum of squares)

Here, \(\bar{y}_{i.} = \frac{1}{n} \sum_{j=1}^n y_{ij}\) and \(\bar{y}_{..} = \frac{1}{In} \sum_{i=1}^I \sum_{j=1}^n y_{ij}\). We define the mean squares as \(\text{MSB} = \frac{\text{SSB}}{I-1}\) and \(\text{MSW} = \frac{\text{SSW}}{I(n-1)}\).
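These quantities are straightforward to compute; here is a minimal sketch for a balanced data set stored as an \(I \times n\) array (simulated with hypothetical parameter values).

```python
import numpy as np

rng = np.random.default_rng(3)
I, n = 10, 20
mu, sigma_beta, sigma = 500.0, 20.0, 40.0  # hypothetical true values
y = mu + rng.normal(0.0, sigma_beta, size=(I, 1)) + rng.normal(0.0, sigma, size=(I, n))

grand_mean = y.mean()         # y_bar_..
group_means = y.mean(axis=1)  # y_bar_i.

sst = ((y - grand_mean) ** 2).sum()
ssb = n * ((group_means - grand_mean) ** 2).sum()
ssw = ((y - group_means[:, None]) ** 2).sum()

msb = ssb / (I - 1)
msw = ssw / (I * (n - 1))

print("SST = SSB + SSW:", np.isclose(sst, ssb + ssw))
print("MSB:", msb, "MSW:", msw)
```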

Proposition Under the random effects model, the expected MSB and MSW have the following form:

\[\begin{align} \mathbb{E}[\text{MSW}] &= \sigma^2, \\ \mathbb{E}[\text{MSB}] &= \sigma^2 + n\sigma_{\beta}^2. \end{align}\]

Proof.

We can see that \(\bar{y}_{i.} = \frac{1}{n} \sum_{j=1}^n y_{ij} = \frac{1}{n} \sum_{j=1}^n (\mu + \beta_i + \epsilon_{ij}) = \mu + \beta_i + \frac{1}{n} \sum_{j=1}^n \epsilon_{ij} = \mu + \beta_i + \bar{\epsilon}_{i.}\).

For the within-group sum of squares, we have

\[\begin{align*} \mathbb{E}[\text{SSW}] &= \sum_{i=1}^I \sum_{j=1}^n \mathbb{E}[(y_{ij} - \bar{y}_{i.})^2] \\ &= \sum_{i=1}^I \sum_{j=1}^n \mathbb{E}[(\epsilon_{ij} - \bar{\epsilon}_{i.})^2] \\ &= n I \mathbb{E}[(\epsilon_{11} - \bar{\epsilon}_{1.})^2] \\ &= n I \mathbb{E}\left[\left((1- \frac{1}{n})\epsilon_{11} - \frac{1}{n} \sum_{j=2}^n \epsilon_{1j}\right)^2\right] \\ &= n I \left((1-\frac{1}{n})^2 \sigma^2 + (n-1) \left(\frac{1}{n}\right)^2 \sigma^2\right) \\ &= n I \left( \frac{\left(n-1\right)^2}{n^2} + \frac{n-1}{n^2} \right) \sigma^2 \\ &= (n-1) I \sigma^2. \end{align*}\]

Similarly, for the between-group sum of squares, we have \(\bar{y}_{..} = \frac{1}{In} \sum_{i=1}^I \sum_{j=1}^n y_{ij} = \mu + \bar{\beta} + \bar{\epsilon}_{..}\), where \(\bar{\beta} = \frac{1}{I} \sum_{i=1}^I \beta_i\) and \(\bar{\epsilon}_{..} = \frac{1}{nI} \sum_{i=1}^I \sum_{j=1}^n \epsilon_{ij}\).

We have

\[\begin{align*} \mathbb{E}[\text{SSB}] &= n \sum_{i=1}^I \mathbb{E}[(\bar{y}_{i.} - \bar{y}_{..})^2] \\ &= n \sum_{i=1}^I \mathbb{E}[(\mu + \beta_i + \bar{\epsilon}_{i.}) - (\mu + \bar{\beta} + \bar{\epsilon}_{..})]^2 \\ &= n \sum_{i=1}^I \mathbb{E}[(\beta_i - \bar{\beta}) + (\bar{\epsilon}_{i.} - \bar{\epsilon}_{..})]^2 \\ &= n \sum_{i=1}^I \left[\mathbb{E}[(\beta_i - \bar{\beta})^2] + \mathbb{E}[(\bar{\epsilon}_{i.} - \bar{\epsilon}_{..})^2] + 2\mathbb{E}[(\beta_i - \bar{\beta})(\bar{\epsilon}_{i.} - \bar{\epsilon}_{..})]\right] \\ &= n \sum_{i=1}^I \left[\mathbb{E}[(\beta_i - \bar{\beta})^2] + \mathbb{E}[(\bar{\epsilon}_{i.} - \bar{\epsilon}_{..})^2]\right] \\ &= n \sum_{i=1}^I \left[\frac{I-1}{I}\sigma_{\beta}^2 + \frac{I-1}{nI}\sigma^2\right] \\ &= n I \cdot \frac{I-1}{I}\sigma_{\beta}^2 + n I \cdot \frac{I-1}{nI}\sigma^2 \\ &= (I-1)\left(n\sigma_{\beta}^2 + \sigma^2\right). \end{align*}\]

Therefore, the mean squares are given by

\[\begin{align*} \mathbb{E}[\text{MSW}] &= \frac{\mathbb{E}[\text{SSW}]}{I(n-1)} = \frac{(n-1)I\sigma^2}{I(n-1)} = \sigma^2, \\ \mathbb{E}[\text{MSB}] &= \frac{\mathbb{E}[\text{SSB}]}{I-1} = \frac{(I-1)n\sigma_{\beta}^2 + (I-1)\sigma^2}{I-1} = n\sigma_{\beta}^2 + \sigma^2. \end{align*}\]

The proof is complete. \(\square\)
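As a numerical sanity check, the sketch below (again with arbitrary parameter values) averages MSB and MSW over repeated simulations and compares the results with \(\sigma^2\) and \(\sigma^2 + n\sigma_{\beta}^2\).

```python
import numpy as np

rng = np.random.default_rng(4)
I, n = 8, 12
mu, sigma_beta, sigma = 500.0, 20.0, 40.0  # hypothetical true values
n_reps = 5_000

msb_vals, msw_vals = [], []
for _ in range(n_reps):
    y = mu + rng.normal(0.0, sigma_beta, size=(I, 1)) + rng.normal(0.0, sigma, size=(I, n))
    group_means = y.mean(axis=1)
    grand_mean = y.mean()
    msb_vals.append(n * ((group_means - grand_mean) ** 2).sum() / (I - 1))
    msw_vals.append(((y - group_means[:, None]) ** 2).sum() / (I * (n - 1)))

print("average MSW:", np.mean(msw_vals), "vs sigma^2 =", sigma**2)
print("average MSB:", np.mean(msb_vals), "vs sigma^2 + n*sigma_beta^2 =", sigma**2 + n * sigma_beta**2)
```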

Method of Moments Estimators Equating sample mean squares to their expectations yields: \(\begin{align} \label{eq:anova_mom} \hat{\sigma}^2 &= \text{MSW} \\ \hat{\sigma}_{\beta}^2 &= \frac{\text{MSB} - \text{MSW}}{n} \end{align}\)

Note that \(\hat{\sigma}_{\beta}^2\) can be negative when \(\text{MSB} < \text{MSW}\), even though the true variance \(\sigma_{\beta}^2\) is non-negative. In practice, negative estimates are typically truncated at zero or avoided by using alternative methods such as REML.
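Putting the pieces together, here is a sketch of the ANOVA (method of moments) estimators for the balanced case, including the common truncation of negative estimates at zero; the function name and parameter values are illustrative.

```python
import numpy as np

def anova_mom_estimates(y):
    """ANOVA / method of moments variance component estimates for a balanced
    one-way random effects model, where y is an (I, n) array of responses."""
    I, n = y.shape
    group_means = y.mean(axis=1)
    grand_mean = y.mean()
    msb = n * ((group_means - grand_mean) ** 2).sum() / (I - 1)
    msw = ((y - group_means[:, None]) ** 2).sum() / (I * (n - 1))
    sigma2_hat = msw
    sigma_beta2_hat = max((msb - msw) / n, 0.0)  # truncate negative estimates at zero
    return sigma2_hat, sigma_beta2_hat

# Example with simulated data and hypothetical true values
rng = np.random.default_rng(5)
I, n = 15, 25
mu, sigma_beta, sigma = 500.0, 20.0, 40.0
y = mu + rng.normal(0.0, sigma_beta, size=(I, 1)) + rng.normal(0.0, sigma, size=(I, n))
print(anova_mom_estimates(y))  # compare with sigma^2 = 1600, sigma_beta^2 = 400
```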
