Modeling two-level longitudinal data

# Modeling two-level longitudinal data
## Inference cont’d
### 03.23.22

---

## [Click here for PDF of slides](20-longitudinal-pt3.pdf)

---

## Announcements

- Mini-project 02
  - draft due **Thu, Mar 24 at noon** (peer review in lab)
  - presentations **Mon, Mar 28** in class
  - report due **Mon, Mar 28** at 11:59pm 
  
- Looking ahead
  - HW 04 due **Fri, Apr 1** at 11:59pm
  - Final project - optional draft due **Fri, Apr 15**, final report due **Wed, Apr 27**
  
- DataFest: April 1 - 3 in Penn Pavilion 
  - [Click here](https://www2.stat.duke.edu/datafest/signup.html) to register

---

## Learning goals

- Compare multilevel models

- Conduct inference for random effects

- Conduct inference for fixed effects

---

## Comparing multilevel models

---

## Data: Charter schools in MN

.midi[The data set [`charter-long.csv`](data/charter-long.csv) contains standardized test scores and demographic information for schools in Minneapolis, MN from 2008 to 2010. The data were collected by the Minnesota Department of Education. Understanding the effectiveness of charter schools is of particular interest, since they often incorporate unique methods of instruction and learning that differ from public schools.]

- **`MathAvgScore`**: Average MCA-II score for all 6th grade students in a school (response variable)
- **`urban`**: urban (1) or rural (0) location school location
- **`charter`**: charter school (1) or a non-charter public school (0)
- **`schPctfree`**: proportion of students who receive free or reduced lunches in a school (based on 2010 figures).
- **`year08`**: Years since 2008

---

## Data

```r
charter <- read_csv("data/charter-long.csv")
```

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> schoolName </th>
   <th style="text-align:right;"> year08 </th>
   <th style="text-align:right;"> urban </th>
   <th style="text-align:right;"> charter </th>
   <th style="text-align:right;"> schPctfree </th>
   <th style="text-align:right;"> MathAvgScore </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> RIPPLESIDE ELEMENTARY </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0.363 </td>
   <td style="text-align:right;"> 652.8 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> RIPPLESIDE ELEMENTARY </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0.363 </td>
   <td style="text-align:right;"> 656.6 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> RIPPLESIDE ELEMENTARY </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0.363 </td>
   <td style="text-align:right;"> 652.6 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> RICHARD ALLEN MATH&amp;SCIENCE ACADEMY </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0.545 </td>
   <td style="text-align:right;"> NA </td>
  </tr>
  <tr>
   <td style="text-align:left;"> RICHARD ALLEN MATH&amp;SCIENCE ACADEMY </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0.545 </td>
   <td style="text-align:right;"> NA </td>
  </tr>
  <tr>
   <td style="text-align:left;"> RICHARD ALLEN MATH&amp;SCIENCE ACADEMY </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0.545 </td>
   <td style="text-align:right;"> 631.2 </td>
  </tr>
</tbody>
</table>

---

## Compare two models

**Full model**

`$$\begin{aligned}Y_{ij} = [&\alpha_{0} + \alpha_{1}Charter_i + \alpha_{2}Urban_i+ \alpha_{3}schpctfree_i \\ &+ \beta_0Year08_{ij} + \beta_1Charter_i:Year08_{ij} + \beta_2Urban_i:Year_{ij}] \\&+ [u_i + v_iYear08_{ij} + \epsilon_{ij}]\end{aligned}$$`

where

`$$\left[ \begin{array}{c}
            u_{i} \\ v_{i}
          \end{array}  \right] \sim N \left( \left[
          \begin{array}{c}
            0 \\ 0
          \end{array} \right], \left[
          \begin{array}{cc}
            \sigma_{u}^{2} & \sigma_{uv} \\
            \sigma_{uv} & \sigma_{v}^{2}
          \end{array} \right] \right) \hspace{2mm} \text{ and }\hspace{2mm} \epsilon_{ij} \sim N(0, \sigma^2)$$`

**Estimated fixed effects**: `$\alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_0, \beta_1, \beta_2$`

**Variance components to estimate**: `$\sigma, \sigma_u, \sigma_v, \rho_{uv}$` (Note: `$\sigma_{uv} = \rho_{uv}\sigma_u\sigma_v$`)

---

## Compare two models

**Null Model (simplified variance structure)**

where

`$$u_i \sim N(0, \sigma^2_u) \hspace{2mm} \text{ and }\hspace{2mm} \epsilon_{ij} \sim N(0, \sigma^2)$$`

**Estimated fixed effects**: `$\alpha_0, \alpha_1, \alpha_2, \alpha_3, \beta_0, \beta_1, \beta_2$`

**Variance components to estimate**: `$\sigma, \sigma_u$`

---

## Full and reduced models

```r
full_model <- lmer(MathAvgScore ~ charter + urban + schPctfree + 
                     charter:year08  + urban:year08 + year08 + 
*                    (year08|schoolid), REML = T, data = charter)
```

```r
reduced_model <-  lmer(MathAvgScore ~ charter + urban + schPctfree + 
                     charter:year08  + urban:year08 + year08 +
*                      (1 | schoolid), REML = T, data = charter)
```

**Hypotheses**

`$$\begin{aligned}&H_0: \sigma_v = \rho_{uv} = 0\\
&H_a: \text{at least one of the parameters is not equal to 0}\end{aligned}$$`

*Note*: `$\rho_{uv} \neq 0 \Rightarrow \sigma_v \neq 0$`

---

## Bootstrapping

- **Bootstrapping** is from the phrase "pulling oneself up by one’s bootstraps"

- Accomplishing a difficult task without any outside help

- **Task**: conduct inference for model parameters (fixed and random effects) using only the sample data

- Two types of bootstrapping 
  - Parametric bootstrapping (use for LRT to test variance components)
  - Nonparametric bootstrapping (use to test fixed effects)

---

### Parametric bootstrapping for likelihood ratio test

1️⃣ &nbsp; Fit the null (reduced) model to obtain the fixed effects and variance components (*parametric* part).

2️⃣ &nbsp;  Use the estimated fixed effects and variance components to generate a new set of response values with the same sample size and associated covariates for each observation as the original data (*bootstrap* part).

3️⃣ &nbsp;  Fit the full and null models to the newly generated data.

4️⃣ &nbsp;  Compute the likelihood test statistic comparing the models from the previous step.

5️⃣ &nbsp;  Repeat steps 2 - 4 many times (~ 1000).

---

### Parametric bootstrapping for likelihood ratio test

6️⃣ &nbsp;   Create a histogram of the likelihood ratio statistics to get the distribution of likelihood ratio statistic under the null hypothesis.

7️⃣ &nbsp;   Get the p-value by calculating the proportion of bootstrapped test statistics greater than the observed statistic.

---

## LRT using `$\chi^2$` and parametric bootstrap

**Likelihood ratio test using `$\chi^2$` distribution**

<table>
 <thead>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:right;"> npar </th>
   <th style="text-align:right;"> AIC </th>
   <th style="text-align:right;"> BIC </th>
   <th style="text-align:right;"> logLik </th>
   <th style="text-align:right;"> deviance </th>
   <th style="text-align:right;"> Chisq </th>
   <th style="text-align:right;"> Df </th>
   <th style="text-align:right;"> Pr(&gt;Chisq) </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> reduced_model </td>
   <td style="text-align:right;"> 9 </td>
   <td style="text-align:right;"> 9952.992 </td>
   <td style="text-align:right;"> 10002.11 </td>
   <td style="text-align:right;"> -4967.496 </td>
   <td style="text-align:right;"> 9934.992 </td>
   <td style="text-align:right;"> NA </td>
   <td style="text-align:right;"> NA </td>
   <td style="text-align:right;"> NA </td>
  </tr>
  <tr>
   <td style="text-align:left;"> full_model </td>
   <td style="text-align:right;"> 11 </td>
   <td style="text-align:right;"> 9953.793 </td>
   <td style="text-align:right;"> 10013.83 </td>
   <td style="text-align:right;"> -4965.897 </td>
   <td style="text-align:right;"> 9931.793 </td>
   <td style="text-align:right;"> 3.199 </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;"> 0.202 </td>
  </tr>
</tbody>
</table>

**Likelihood ratio test using parametric bootstrap**

| term <chr> | npar <dbl> | AIC <dbl> | BIC <dbl> | logLik <dbl> | deviance <dbl> | statistic <dbl> | df <dbl> | Pr_boot..Chisq. <dbl> |
|------------|-----------:|----------:|----------:|-------------:|---------------:|----------------:|---------:|----------------------:|
| m0       |          9 |  9952.992 |  10002.11 |    -4967.496 |       9934.992 |              NA |       NA |                    NA |
| mA         |         11 |  9953.793 |  10013.83 |    -4965.897 |       9931.793 |        3.199347 |        2 |                 0.144 |

---

## Inference for fixed effects

- The output for multilevel models do not contain p-values for the coefficients of fixed effects

- The exact distribution of the test statistic under the null hypothesis (no fixed effect) is unknown, because the exact degrees of freedom are unknown
  - Finding suitable approximations is an area of ongoing research

- We can use likelihood ratio test with an approximate `$\chi^2$` distribution to test these effects, since we're not testing on the boundary and fixed effects do not have limited ranges
  - Some research suggests the p-values are too low but approximations are generally pretty good 
  - Can also calculate the p-values using parametric bootstrap approach

---

.question[
Go to `lecture-20.Rmd` to test whether `schPctFree` should be included in the current model using likelihood ratio test with the `$\chi^2$` distribution and the parametric bootstrap.
]

---

## Methods to conduct inference for individual coefficients

- Use the t-value `$\big(\frac{estimate}{std.error}\big)$` in the model output 
  - General rule: Coefficients with |t-value| > 2 considered to be statistically significant, i.e., different from 0

- Calculate confidence intervals using **nonparametric bootstrapping**

---

## Nonparametric bootstrapping

1️⃣ &nbsp; Take a sample, with replacement, of size `$n$` (the size of the original data) (called *case resampling*).
    
2️⃣ &nbsp; Fit the model to obtain estimates of the coefficients.
    
3️⃣ &nbsp; Repeat steps 1 - 2 many times (~ 1000) to obtain the bootstrap distribution.
    
4️⃣ &nbsp; Get the coefficients for the 95% confidence interval by taking the middle 95% of the bootstrap distribution.

---

.question[
Go to `lecture-20.Rmd` for an example of nonparametric bootstrapping to estimate the 95% confidence interval for the model coefficients.
]

---

## Bootstrapped CI for coefficients

```r
confint(reduced_model, method = "boot", level = 0.95, oldNames = F) %>%
  kable(digits = 3)
```

<table>
 <thead>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:right;"> 2.5 % </th>
   <th style="text-align:right;"> 97.5 % </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> sd_(Intercept)|schoolid </td>
   <td style="text-align:right;"> 4.263 </td>
   <td style="text-align:right;"> 4.874 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> sigma </td>
   <td style="text-align:right;"> 2.863 </td>
   <td style="text-align:right;"> 3.101 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> (Intercept) </td>
   <td style="text-align:right;"> 659.392 </td>
   <td style="text-align:right;"> 661.446 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> charter </td>
   <td style="text-align:right;"> -4.496 </td>
   <td style="text-align:right;"> -1.561 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> urban </td>
   <td style="text-align:right;"> -2.028 </td>
   <td style="text-align:right;"> -0.194 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> schPctfree </td>
   <td style="text-align:right;"> -19.522 </td>
   <td style="text-align:right;"> -15.966 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> year08 </td>
   <td style="text-align:right;"> 1.220 </td>
   <td style="text-align:right;"> 1.783 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> charter:year08 </td>
   <td style="text-align:right;"> 0.371 </td>
   <td style="text-align:right;"> 1.663 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> urban:year08 </td>
   <td style="text-align:right;"> -0.927 </td>
   <td style="text-align:right;"> -0.157 </td>
  </tr>
</tbody>
</table>
]

---

## Summary: Model comparisons

**Methods to compare models with different fixed effects**

- Likelihood ratio tests based on `$\chi^2$` distribution 
- Likelihood ratio test based on parametric bootstrapped p-values
- AIC or BIC

**Methods to compare models with different variance components**

- Likelihood ratio test based on parametric bootstrapped p-values
- AIC or BIC

---

## Summary: Understanding the model

**Methods to understand individual coefficients**
- Likelihood ratio tests
- Bootstrap confidence intervals
- Pseudo `$R^2$` for variance components (if meaning is unchanged between models)

**Methods to understand data structure**

- Calculate intraclass correlation coefficient using unconditional means model

---

## Acknowledgements

The content in the slides from
 
- [BMLR: Chapter 9 - Two-level Longitudinal Data](https://bookdown.org/roback/bookdown-BeyondMLR/ch-lon.html)
  - Sections 9.1 - 9.6

- [Hierarchical Linear Modeling with Maximum Likelihood, Restricted Maximum Likelihood, and Fully Bayesian Estimation](https://scholarworks.umass.edu/cgi/viewcontent.cgi?article=1353&context=pare) by Peter Boedeker

- *Applied longitudinal data analysis: Modeling change and event occurrence*. by J.D. Singer and J.B. Willett 
  - Online copy available through Duke library

---

## Acknowledgements

- *Extending the linear model with R*. by Julian Faraway
  - Online copy available through Duke library

- More mathematical details in [*Computational methods for mixed models*](https://cran.r-project.org/web/packages/lme4/vignettes/Theory.pdf) by Douglas Bates (a vignette for the **lme4** R package)