Multilevel models

# Multilevel models
### 02.28.22

---

## [Click here for PDF of slides](15-multilevel-models.pdf)

---

## Announcements

- HW 03 **due Tue, Mar 01 at 11:59pm**

- Raphaël's office hours this week will be **Tue, 1:30 - 3:30pm**

---

## Learning goals

- Recognize a potential for correlation in a data set
- Identify observational units at varying levels 
- Understand issues correlated data may cause in modeling
- Understand how random effects models can be used to take correlation into account 
- Use EDA for multilevel data

---

## Correlated observations

---

## Multilevel data

- We can think of correlated data as a multilevel structure

- Population elements are aggregated into groups
  - There are observational units and measurements at each level
  
--

- For now we will focus on data with two levels:

- **Level one**: Most basic level of observation 
  - **Level two**: Groups formed from aggregated level-one observations

---

## Two types of effects

- **Fixed effects**: Effects that are of interest in the study
  - Can think of these as effects whose interpretations would be included in a write up of the study

- **Random effects**: Effects we're not interested in studying but whose variability we want to understand 
  - Can think of these as effects whose interpretations would not necessarily be included in a write up of the study

---

## Practice

.question[
Radon is a carcinogen – a naturally occurring radioactive gas whose decay products are also radioactive – known to cause lung cancer in high concentrations. The EPA sampled more than 80,000 homes across the U.S.  Each house came from a randomly selected county and measurements were made on each level of each home.  Uranium measurements at the county level were included to improve the radon estimates.

1. What is the most basic level of observation (Level One)? 
2. What are the group units (Level Two, Level Three, etc...)
2. What is the response variable?
3. Describe the within-group variation. 
4. What are the fixed effects? What are the random effects?
]

---

## Teratogen and rat pups

---

## Data: Teratogen and rat pups

Today's data are simulated results of an experiment with 24 dams (mother rats) randomly divided into four groups that received different doses of teratogen, a substance that could potentially cause harm to developing fetuses. The four groups are

- High dose (3 mg)
  - Medium dose (2 mg)
  - Low dose (1 mg)
  - No dose (Control)

Each dam produced 10 rat pups and the presence of a deformity was noted.

**Goal**: Understand the association between teratogen exposure and the probability a pup is born with a deformity.

---

## Scenario 1: No dose effect

Assume dose has <u>no</u> effect on, `$p$`, the probability of a pup born with a deformity.

- **Scenario 1a.**: `$p = 0.5$` for each dam

- **Scenario 1b.**: `$p \sim Beta(0.5, 0.5)$` (expected value = 0.5)

---

---

## Scenario 2: Dose effect

---

## Scenario 2: Dose effect

Now we will consider the effect of the dose of teratogen on the probability of a pup born with a deformity. The 24 pups have been randomly divided into four groups:

- High dose (`dose = 3`)
  - Medium dose (`dose = 2`)
  - Low dose (`dose = 1`)
  - No dose (`dose = 0`)
  
--

We will assume the true relationship between `$p$` and dose is the following:

`$$\log\Big(\frac{p}{1-p}\Big) = -2 + 1.33 ~ dose$$`

---

## Scenario 2

**Scenario 2a.**

`$$p = \frac{e^{-2 + 1.33 ~ dose}}{1 + e^{-2 + 1.33 ~ dose}}$$`

**Scenario 2b.**

`$$p \sim Beta\Big(\frac{2p}{(1-p)}, 2\Big)$$`

On average, dams who receive dose `$x$` have the same probability of deformed pup as dams with dose `$x$` under Scenario 2a.

---

## Distributions under Scenario 2

---

## Summary statistics under Scenario 2

<table class="table" style="margin-left: auto; margin-right: auto;">
<caption>Summary statistics of Scenario 2 by dose.</caption>
 <thead>
<tr>
<th style="empty-cells: hide;border-bottom:hidden;" colspan="1"></th>
<th style="border-bottom:hidden;padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="4"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px; ">Scenario 2a</div></th>
<th style="border-bottom:hidden;padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="4"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px; ">Scenario 2b</div></th>
</tr>
  <tr>
   <th style="text-align:right;"> Dosage </th>
   <th style="text-align:right;"> Mean p </th>
   <th style="text-align:right;"> SD p </th>
   <th style="text-align:right;"> Mean Count </th>
   <th style="text-align:right;"> SD Count </th>
   <th style="text-align:right;"> Mean p </th>
   <th style="text-align:right;"> SD p </th>
   <th style="text-align:right;"> Mean Count </th>
   <th style="text-align:right;"> SD Count </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 0.119 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;width: 1cm; "> 1.333 </td>
   <td style="text-align:right;width: 1cm; "> 1.366 </td>
   <td style="text-align:right;"> 0.061 </td>
   <td style="text-align:right;"> 0.069 </td>
   <td style="text-align:right;width: 1cm; "> 0.500 </td>
   <td style="text-align:right;width: 1cm; "> 0.837 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0.339 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;width: 1cm; "> 3.167 </td>
   <td style="text-align:right;width: 1cm; "> 1.835 </td>
   <td style="text-align:right;"> 0.239 </td>
   <td style="text-align:right;"> 0.208 </td>
   <td style="text-align:right;width: 1cm; "> 3.500 </td>
   <td style="text-align:right;width: 1cm; "> 2.881 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;"> 0.661 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;width: 1cm; "> 5.833 </td>
   <td style="text-align:right;width: 1cm; "> 1.472 </td>
   <td style="text-align:right;"> 0.615 </td>
   <td style="text-align:right;"> 0.195 </td>
   <td style="text-align:right;width: 1cm; "> 5.833 </td>
   <td style="text-align:right;width: 1cm; "> 1.941 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 3 </td>
   <td style="text-align:right;"> 0.881 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;width: 1cm; "> 8.833 </td>
   <td style="text-align:right;width: 1cm; "> 1.169 </td>
   <td style="text-align:right;"> 0.872 </td>
   <td style="text-align:right;"> 0.079 </td>
   <td style="text-align:right;width: 1cm; "> 8.833 </td>
   <td style="text-align:right;width: 1cm; "> 1.169 </td>
  </tr>
</tbody>
</table>

---

.question[
1. In Scenario 2a, dams produced 4.79 deformed pups on average, with standard deviation 3.20. Scenario 2b saw an average of 4.67 with standard deviation 3.58. Why are comparisons by dose more meaningful than these overall comparisons?

2. We will use binomial and quasibinomial regression to model the relationship between dose and probability of pup born with a deformity. What can you say about the center and the width of the confidence intervals under Scenarios 2a and 2b?  
  - Which will be similar and why?  
  - Which will be different and how?
]

---

## Scenario 2: Estimated odds ratio

.midi[The estimated effect of dose and the 95% CI from the binomial and quasibinomial models are below:]

**Scenario 2a**

|               | Odds Ratio | 95% CI         |
|---------------|-----------------|----------------|
| Binomial      | 3.536          | (2.604, 4.958) |
| Quasibinomial | 3.536          | (2.512, 5.186) |

**Scenario 2b**

|               | Odds Ratio | 95% CI         |
|---------------|-----------------|----------------|
| Binomial      | 4.311          | (3.086, 6.271) |
| Quasibinomial | 4.311          | (2.735, 7.352) |

---

.question[
1. Describe how the quasibinomial analysis of Scenario 2b differs from the binomial analysis of the same simulated data. Do confidence intervals contain the true model parameters? Is this what you expected? Why?

2. Why are differences between quasibinomial and binomial models of Scenario 2a less noticeable than the differences in Scenario 2b?
]

---

## Summary

- The structure of the data set may imply correlation between observations.

- Correlated observations provide less information than independent observations; we need to account for this reduction in information.

- Failing to account for this reduction could result in underestimating standard error, thus resulting in overstating significance and the precision of the estimates.

- We showed how we can account for this by incorporating the dispersion parameter or a random effect.

---

## Multilevel models

---

## Data: Music performance anxiety

.midi[The data [`musicdata.csv`](data/musicdata.csv) come from the Sadler and Miller (2010) study of the emotional state of musicians before performances. The dataset contains information collected from 37 undergraduate music majors who completed the Positive Affect Negative Affect Schedule (PANAS), an instrument produces a measure of anxiety (negative affect) and a measure of happiness (positive affect). This analysis will focus on negative affect as a measure of performance anxiety.]

The primary variables we'll use are

- **`na`**: negative affect score on PANAS (the response variable)
- **`perform_type`**: type of performance (Solo, Large Ensemble, Small Ensemble)
- **`instrument`**: type of intstrument (Voice, Orchestral, Piano)

---

## Look at data

.small[
<table>
 <thead>
  <tr>
   <th style="text-align:right;"> id </th>
   <th style="text-align:right;"> diary </th>
   <th style="text-align:left;"> perform_type </th>
   <th style="text-align:right;"> na </th>
   <th style="text-align:left;"> gender </th>
   <th style="text-align:left;"> instrument </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:left;"> Solo </td>
   <td style="text-align:right;"> 11 </td>
   <td style="text-align:left;"> Female </td>
   <td style="text-align:left;"> voice </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:left;"> Large Ensemble </td>
   <td style="text-align:right;"> 19 </td>
   <td style="text-align:left;"> Female </td>
   <td style="text-align:left;"> voice </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 3 </td>
   <td style="text-align:left;"> Large Ensemble </td>
   <td style="text-align:right;"> 14 </td>
   <td style="text-align:left;"> Female </td>
   <td style="text-align:left;"> voice </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 43 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:left;"> Solo </td>
   <td style="text-align:right;"> 19 </td>
   <td style="text-align:left;"> Female </td>
   <td style="text-align:left;"> voice </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 43 </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:left;"> Solo </td>
   <td style="text-align:right;"> 13 </td>
   <td style="text-align:left;"> Female </td>
   <td style="text-align:left;"> voice </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 43 </td>
   <td style="text-align:right;"> 3 </td>
   <td style="text-align:left;"> Small Ensemble </td>
   <td style="text-align:right;"> 19 </td>
   <td style="text-align:left;"> Female </td>
   <td style="text-align:left;"> voice </td>
  </tr>
</tbody>
</table>
]

- What are the Level One observations? Level Two observations?

- What are the Level One variables? Level Two variables?

---

## Univariate exploratory data analysis

**Level One variables**

Two ways to approach univariate EDA (visualizations and summary statistics) for Level One variables:

- Use individual observations (i.e., treat observations as independent)

- Use aggregated values for each Level Two observation

**Level Two variables**

- Use a data set that contains one row per Level Two observation

---

## Bivariate exploratory data analysis

**Goals**

- Explore general association between the predictor and response variable 
- Explore whether subjects at a given level of the predictor tend to have similar mean responses 
- Explore whether variation in response differs at different levels of a predictor

There are two ways to visualize these associations:

- One plot of individual observations (i.e., treat observations as independent)

- Separate plots of responses vs. predictor for each Level Two observation (lattice plots)

---

## Fitting the model

---

## Questions we want to answer

The goal is to understand variability in performance anxiety (`na`) based on performance-level and musician-level characteristics. Specifically:

- What is the association between performance type (large ensemble or not) and performance anxiety? Does the association differ based on instrument type (orchestral or not)?

<table>
 <thead>
  <tr>
   <th style="text-align:left;"> term </th>
   <th style="text-align:right;"> estimate </th>
   <th style="text-align:right;"> std.error </th>
   <th style="text-align:right;"> statistic </th>
   <th style="text-align:right;"> p.value </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> (Intercept) </td>
   <td style="text-align:right;"> 15.721 </td>
   <td style="text-align:right;"> 0.359 </td>
   <td style="text-align:right;"> 43.778 </td>
   <td style="text-align:right;"> 0.000 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> orchestra </td>
   <td style="text-align:right;"> 1.789 </td>
   <td style="text-align:right;"> 0.552 </td>
   <td style="text-align:right;"> 3.243 </td>
   <td style="text-align:right;"> 0.001 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> large_ensemble </td>
   <td style="text-align:right;"> -0.277 </td>
   <td style="text-align:right;"> 0.791 </td>
   <td style="text-align:right;"> -0.350 </td>
   <td style="text-align:right;"> 0.727 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> orchestra:large_ensemble </td>
   <td style="text-align:right;"> -1.709 </td>
   <td style="text-align:right;"> 1.062 </td>
   <td style="text-align:right;"> -1.609 </td>
   <td style="text-align:right;"> 0.108 </td>
  </tr>
</tbody>
</table>

---

## Other modeling approaches

1️⃣  Condense each musician's set of responses into a single outcome (e.g., mean max, last observation, etc.) and fit a linear model on these condensed observations

- Leaves few observations (37) to fit the model
- Ignoring a lot of information in the multiple observations for each musician

2️⃣ Fit a separate model for each musician understand the association between performance type (Level One models). Then fit a system of Level Two models to predict the fitted coefficients in the Level One model for each subject based on instrument type (Level Two model).

**Let's look at approach #2**

---

## Two-level model

We'll start with the Level One model to understand the association between performance type and performance anxiety for the `$i^{th}$` musician.

.eq[
`$$na_{ij} = a_i + b_i ~ LargeEnsemble_{ij} + \epsilon_i, \hspace{5mm} \epsilon_{ij} \sim N(0,\sigma^2)$$`
]

Why is it more meaningful to use performance type for the Level One model than instrument?

For now, estimate `$a_i$` and `$b_i$` using least-squares regression.

---

## Example Level One model

Below is partial data for observation #22

<table>
 <thead>
  <tr>
   <th style="text-align:right;"> id </th>
   <th style="text-align:right;"> diary </th>
   <th style="text-align:left;"> perform_type </th>
   <th style="text-align:left;"> instrument </th>
   <th style="text-align:right;"> na </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 22 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:left;"> Solo </td>
   <td style="text-align:left;"> orchestral instrument </td>
   <td style="text-align:right;"> 24 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 22 </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:left;"> Large Ensemble </td>
   <td style="text-align:left;"> orchestral instrument </td>
   <td style="text-align:right;"> 21 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 22 </td>
   <td style="text-align:right;"> 3 </td>
   <td style="text-align:left;"> Large Ensemble </td>
   <td style="text-align:left;"> orchestral instrument </td>
   <td style="text-align:right;"> 14 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 22 </td>
   <td style="text-align:right;"> 13 </td>
   <td style="text-align:left;"> Large Ensemble </td>
   <td style="text-align:left;"> orchestral instrument </td>
   <td style="text-align:right;"> 12 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 22 </td>
   <td style="text-align:right;"> 14 </td>
   <td style="text-align:left;"> Large Ensemble </td>
   <td style="text-align:left;"> orchestral instrument </td>
   <td style="text-align:right;"> 19 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 22 </td>
   <td style="text-align:right;"> 15 </td>
   <td style="text-align:left;"> Solo </td>
   <td style="text-align:left;"> orchestral instrument </td>
   <td style="text-align:right;"> 25 </td>
  </tr>
</tbody>
</table>

---

## Level One model

```r
music %>%
  filter(id == 22) %>%
  lm(na ~ large_ensemble, data = .) %>%
  tidy() %>%
  kable(digits = 3)
```

**Repeat for all 37 musicians. See Part 3: Level One Models in `lecture-15.Rmd`.**

---

<div class="figure" style="text-align: center">
<img src="data:image/png;base64,#15-multilevel-models_files/figure-html/unnamed-chunk-14-1.png" alt="Recreated from BMLR Figure 8.9" width="70%" />
<p class="caption">Recreated from BMLR Figure 8.9</p>
</div>

**Now let's consider if there is an association between the estimated slopes, estimated intercepts, and the type of instrument**

---

## Level Two Model

The slope and intercept for the `$i^{th}$` musician can be modeled as

.eq[
`$$\begin{aligned}&a_i = \alpha_0 + \alpha_1 ~ Orchestra_i + u_i \\
&b_i = \beta_0 + \beta_1 ~ Orchestra_i + v_i\end{aligned}$$`
]

Note the response variable in the Level Two models are not observed outcomes but the (fitted) slope and intercept from each musician

**See Part 4: Level Two Models in `lecture-15.Rmd`.**

---

## Estimated coefficients by instrument

---

## Level Two model

**Model for intercepts**

**Model for slopes**

---

## Writing out the models

**Level One**

`$$\hat{na}_{ij}  = \hat{a}_i + \hat{b}_i ~ LargeEnsemble_{ij}$$`

for each musician.

**Level Two**

`$$\begin{aligned}&\hat{a}_i = 16.283 + 1.441 ~ Orchestra_i \\
&\hat{b}_i = -0.771 - 1.406 ~ Orchestra_i\end{aligned}$$`

---

## Composite model

.eq[
`$$\begin{aligned}\hat{na}_i &= 16.283 + 1.441 ~ Orchestra_i - 0.771 ~ LargeEnsemble_{ij} \\
&- 1.406 ~ Orchestra:LargeEnsemble_{ij}\end{aligned}$$`
]

(Note that we also have the error terms `$\epsilon_{ij}, u_i, v_i$` that we will discuss next class.)

- What is the predicted average performance anxiety before solos and small ensemble performances for vocalists and keyboardists? For those who place orchestral instruments?

- What is the predicted average performance anxiety before large ensemble performances for those who play orchestral instruments?

---

## Disadvantages to this approach

⚠️ Weighs each musician the same regardless of number of diary entries

⚠️  Drops subjects who have missing values for slope (7 individuals who didn't play a large ensemble performance)

⚠️  Does not share strength effectively across individuals

**We will use a unified approach that utilizes likelihood-based methods to address some of these drawbacks.**

---

## Acknowledgements

The content in the slides is from 
  - [BMLR: Chapter 7 - Correlated data](https://bookdown.org/roback/bookdown-BeyondMLR/ch-corrdata.html)
  - [BMLR: Chapter 8 - Introduction to Multilevel Models](https://bookdown.org/roback/bookdown-BeyondMLR/ch-multilevelintro.html)

- Sadler, Michael E., and Christopher J. Miller. 2010. “Performance Anxiety: A Longitudinal Study of the Roles of Personality and Experience in Musicians.” Social Psychological and Personality Science 1 (3): 280–87. http://dx.doi.org/10.1177/1948550610370492.