class: center, middle, inverse, title-slide

# Using likelihoods to compare models

### Prof. Maria Tackett

### 01.24.22

---

class: middle, center

## [Click for PDF of slides](05-compare-models.pdf)

---

## Announcements

- Week 03 reading:
  - [BMLR: Chapter 3 - Distribution Theory](https://bookdown.org/roback/bookdown-BeyondMLR/ch-distthry.html) (for reference)
  - [BMLR: Chapter 4 - Poisson Regression](https://bookdown.org/roback/bookdown-BeyondMLR/ch-poissonreg.html)

- Quiz 01: Tue, Jan 25 at 9am - Thu, Jan 27 at 3:30pm (start of lab)

---

## Quiz 01

- Opens Tue, Jan 25 at 9am and must be completed by Thu, Jan 27 at 3:30pm

- The quiz is not timed and will be administered in Gradescope.

- Covers
  - Syllabus
  - BMLR Chapters 1 - 2
  - Jan 05 - Jan 24 lectures

- Fill-in-the-blank, multiple choice, and short answer questions

- Open book, open note, open internet (not crowd-sourcing sites). You <u>cannot</u> discuss the quiz with anyone else. Please email me if you have questions.

---

## Learning goals

- Use likelihoods to compare models

- Activity: Exploring the response variable for mini-project 01

---

class: middle, inverse

## Recap

---

## Fouls in college basketball games

The data set [`04-refs.csv`](data/04-refs.csv) includes 30 randomly selected NCAA men's basketball games played in the 2009 - 2010 season.

We will focus on the variables `foul1`, `foul2`, and `foul3`, which indicate which team had a foul called on them for the 1st, 2nd, and 3rd fouls, respectively.

- `H`: Foul was called on the home team
- `V`: Foul was called on the visiting team

--

We are focusing on the first three fouls for this analysis, but this could easily be extended to include all fouls in a game.

.footnote[.small[The dataset was derived from `basketball0910.csv` used in [BMLR Section 11.2](https://bookdown.org/roback/bookdown-BeyondMLR/ch-GLMM.html#cs:refs)]]

---

## Fouls in college basketball games

```r
refs <- read_csv("data/04-refs.csv")

refs %>%
  slice(1:5) %>%
  kable()
```

| game|     date|visitor |hometeam |foul1 |foul2 |foul3 |
|----:|--------:|:-------|:--------|:-----|:-----|:-----|
|  166| 20100126|CLEM    |BC       |V     |V     |V     |
|  224| 20100224|DEPAUL  |CIN      |H     |H     |V     |
|  317| 20100109|MARQET  |NOVA     |H     |H     |H     |
|  214| 20100228|MARQET  |SETON    |V     |V     |H     |
|  278| 20100128|SETON   |SFL      |H     |V     |V     |

We will treat the games as independent in this analysis.

---

## Likelihoods

A **likelihood** is a function that tells us how likely we are to observe our data for a given parameter value (or values).

--

**Model 1 (Unconditional Model)**

- `\(p_H\)`: probability of a foul being called on the home team

--

**Model 2 (Conditional Model)**

- `\(p_{H|N}\)`: Probability referees call a foul on the home team given there are equal numbers of prior fouls on the home and visiting teams

- `\(p_{H|H Bias}\)`: Probability referees call a foul on the home team given there are more prior fouls on the home team

- `\(p_{H|V Bias}\)`: Probability referees call a foul on the home team given there are more prior fouls on the visiting team
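---

## Tallying the foul outcomes

The exponents in the likelihood functions on the next slide are counts of foul outcomes in `refs`. As a minimal sketch (assuming the **tidyverse** is loaded), one way we might tabulate the overall home vs. visitor counts used in Model 1:

```r
refs %>%
  pivot_longer(cols = c(foul1, foul2, foul3),
               names_to = "foul_number",
               values_to = "team") %>%
  count(team)
```

Based on the likelihoods that follow, this should show 46 fouls called on the home team (`H`) and 44 on the visiting team (`V`). Model 2's counts additionally require classifying each foul by which team had more prior fouls in that game.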
---

## Likelihoods

A **likelihood** is a function that tells us how likely we are to observe our data for a given parameter value (or values).

**Model 1 (Unconditional Model)**

`$$Lik(p_H) = p_H^{46}(1 - p_H)^{44}$$`

--

**Model 2 (Conditional Model)**

`$$\begin{aligned}Lik(p_{H| N}, p_{H|H Bias}, p_{H |V Bias}) &= [(p_{H| N})^{25}(1 - p_{H|N})^{23}(p_{H| H Bias})^8 \\ &(1 - p_{H| H Bias})^{12}(p_{H| V Bias})^{13}(1-p_{H|V Bias})^9]\end{aligned}$$`

---

## Maximum likelihood estimates

The **maximum likelihood estimate (MLE)** is the parameter value (here, a value between 0 and 1) under which we are most likely to see the observed data.

--

.pull-left[

**Model 1 (Unconditional Model)**

- `\(\hat{p}_H = 46/90 = 0.511\)`

**Model 2 (Conditional Model)**

- `\(\hat{p}_{H|N} = 25 / 48 = 0.521\)`

- `\(\hat{p}_{H|H Bias} = 8 /20 = 0.4\)`

- `\(\hat{p}_{H|V Bias} = 13/ 22 = 0.591\)`
]

.pull-right[

- What is the probability the referees call a foul on the home team, assuming foul calls within a game are independent?

- Is there a tendency for the referees to call more fouls on the visiting team or home team?

- Is there a tendency for referees to call a foul on the team that already has more fouls?
]

---

### MLEs for Model 2

[Click here](05-model2-mle.pdf) for details on finding the MLEs for Model 2.

---

class: middle, inverse

## Model comparison

---

## Model comparisons

- Nested models

- Non-nested models

---

class: middle, center

## Comparing nested models

---

## Nested models

**Nested models**: models in which the parameters of the reduced model are a subset of the parameters of the larger model

Example:

`$$\begin{aligned}&\text{Model A: }y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon\\ &\text{Model B: }y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \epsilon\end{aligned}$$`

--

Model A is nested in Model B. We could use likelihoods to test whether it is useful to add `\(x_3\)` and `\(x_4\)` to the model.

--

`$$\begin{aligned}&H_0: \beta_3 = \beta_4 = 0 \\ &H_a: \text{ at least one }\beta_j \text{ is not equal to 0}\end{aligned}$$`

---

## Nested models

**Another way to think about nested models**: the simpler model can be obtained from the larger model by equating some parameters or by setting some parameters to constants.

Example:

`$$\begin{aligned}&\text{Model 1: }p_H \\ &\text{Model 2: }p_{H| N}, p_{H| H Bias}, p_{H| V Bias}\end{aligned}$$`

--

Model 1 is nested in Model 2. The parameters `\(p_{H| N}\)`, `\(p_{H|H Bias}\)`, and `\(p_{H |V Bias}\)` can be set equal to `\(p_H\)` to get Model 1.

--

`$$\begin{aligned}&H_0: p_{H| N} = p_{H| H Bias} = p_{H| V Bias} = p_H \\ &H_a: \text{At least one of }p_{H| N}, p_{H| H Bias}, p_{H| V Bias} \text{ differs from the others}\end{aligned}$$`

---

## Steps to compare models

1️⃣ Find the MLEs for each model.

2️⃣ Plug the MLEs into the log-likelihood function for each model to get the maximum value of the log-likelihood for each model.

3️⃣ Find the difference in the maximum log-likelihoods.

4️⃣ Use the Likelihood Ratio Test to determine if the difference is statistically significant.
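---

## Finding the MLE numerically

In this example the MLEs have closed forms, but step 1️⃣ could also be done numerically. A minimal sketch that maximizes the Model 1 log-likelihood (the same `loglik1()` used on the next slide) over the interval (0, 1); the maximum should be near `\(46/90 = 0.511\)`:

```r
# log-likelihood for Model 1
loglik1 <- function(ph){
  log(ph^46 * (1 - ph)^44)
}

# numerical maximization over (0, 1)
optimize(loglik1, interval = c(0, 1), maximum = TRUE)
```

The same idea, with `optim()` when there are several parameters, is what software relies on when no closed-form MLE exists.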
---

## Steps 1 - 2

Find the MLEs for each model and plug them into the log-likelihood functions.

.pull-left[

**Model 1:**

- `\(\hat{p}_H = 46/90 = 0.511\)`

.small[

```r
loglik1 <- function(ph){
  log(ph^46 * (1 - ph)^44)
}

loglik1(46/90)
```

```
## [1] -62.36102
```
]]

.pull-right[

**Model 2:**

- `\(\hat{p}_{H|N} = 25 / 48 = 0.521\)`

- `\(\hat{p}_{H|H Bias} = 8 /20 = 0.4\)`

- `\(\hat{p}_{H|V Bias} = 13/ 22 = 0.591\)`

.small[

```r
loglik2 <- function(phn, phh, phv) {
  log(phn^25 * (1 - phn)^23 * phh^8 * (1 - phh)^12 *
        phv^13 * (1 - phv)^9)
}

loglik2(25/48, 8/20, 13/22)
```

```
## [1] -61.57319
```
]
]

---

## Step 3

Find the difference in the maximum log-likelihoods.

```r
(diff <- loglik2(25/48, 8/20, 13/22) - loglik1(46/90))
```

```
## [1] 0.7878318
```

<br>

--

.center[

**Is the difference in the maximum log-likelihoods statistically significant?**

]

---

## Likelihood Ratio Test

**Test statistic**

`$$\begin{aligned} LRT &= 2[\max\{\log(Lik(\text{larger model}))\} - \max\{\log(Lik(\text{reduced model}))\}]\\[10pt] &= 2\log\Bigg(\frac{\max\{Lik(\text{larger model})\}}{\max\{Lik(\text{reduced model})\}}\Bigg)\end{aligned}$$`

<br>

--

Under `\(H_0\)`, the LRT statistic approximately follows a `\(\chi^2\)` distribution with degrees of freedom equal to the difference in the number of parameters between the two models.

---

## Step 4

```r
(LRT <- 2 * (loglik2(25/48, 8/20, 13/22) - loglik1(46/90)))
```

```
## [1] 1.575664
```

--

The test statistic follows a `\(\chi^2\)` distribution with 2 degrees of freedom. Therefore, the p-value is `\(P(\chi^2 > LRT)\)`.

```r
pchisq(LRT, 2, lower.tail = FALSE)
```

```
## [1] 0.4548299
```

--

The p-value is very large, so we fail to reject `\(H_0\)`. We do not have convincing evidence that the conditional model is an improvement over the unconditional model. Therefore, we can stick with the unconditional model.

---

class: middle, inverse

## Comparing non-nested models

---

## Comparing non-nested models

.pull-left[

.midi[**AIC** = -2(max log-likelihood) + 2p]

```r
(Model1_AIC <- -2 * loglik1(46/90) + 2 * 1)
```

```
## [1] 126.722
```

```r
(Model2_AIC <- -2 * loglik2(25/48, 8/20, 13/22) + 2 * 3)
```

```
## [1] 129.1464
```
]

--

.pull-right[

.midi[**BIC** = -2(max log-likelihood) + p log(n)]

```r
(Model1_BIC <- -2 * loglik1(46/90) + 1 * log(30))
```

```
## [1] 128.1232
```

```r
(Model2_BIC <- -2 * loglik2(25/48, 8/20, 13/22) + 3 * log(30))
```

```
## [1] 133.35
```
]

<br>

--

**Choose Model 1, the unconditional model, because it has the smaller AIC and BIC.**

---

## Looking ahead

- Likelihoods help us answer the question of how likely we are to observe the data given different parameter values.

--

- In this example, we did not consider covariates, so in practice the parameters we want to estimate will look more like

`$$p_H = \frac{e^{\beta_0 + \beta_1x_1 + \dots + \beta_px_p}}{1 + e^{\beta_0 + \beta_1x_1 + \dots + \beta_px_p}}$$`

--

- Finding the MLE becomes much more complex, and numerical methods may be required.

- We will primarily rely on software to find the MLE, but the conceptual ideas will be the same (see the sketch on the next slide).
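---

## Relying on software: a sketch

A minimal sketch of fitting a model of this form with `glm()`, which finds the MLEs of the `\(\beta\)` coefficients numerically. The data frame `d` below is simulated as a placeholder; it is not part of the basketball case study.

```r
# simulated placeholder data, just to make the sketch self-contained
set.seed(1)
d <- tibble(x1 = rnorm(100),
            x2 = rnorm(100),
            y  = rbinom(100, size = 1, prob = plogis(-0.5 + 0.8 * x1)))

# glm() maximizes the likelihood numerically
fit <- glm(y ~ x1 + x2, data = d, family = binomial)

logLik(fit)  # maximum value of the log-likelihood
AIC(fit)     # -2(max log-likelihood) + 2p, as on the previous slide
```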
---

class: middle, inverse

## Response variable in mini-project 01

---

## Activity Instructions

.midi[The goal of this activity is for your team to start exploring the response variable for your mini-project 01 analysis. The properties explored in this activity are ones you will consider throughout the semester as you decide which GLM is most appropriate for a given data set. Use [Table 3.1 in BMLR](https://bookdown.org/roback/bookdown-BeyondMLR/ch-distthry.html#additional-resources) for reference.]

--

.midi[Write the following for the primary response variable in your analysis:]

- .small[What is the response variable? What is its definition?]

- .small[Is the response variable discrete or continuous?]

- .small[What possible values can it take? (not necessarily just the values in the data set)]

- .small[What is the name of the distribution the variable follows?]

- .small[What is/are the parameter(s) for this distribution? Estimate the parameters from the data.]

- .small[Visualize the distribution of the response variable. Is this what you expected? Why or why not?]

---

## Activity Instructions

.midi[[Click here](https://docs.google.com/presentation/d/1PlhwggKe4VclDQq8c6ZhYNKMWkFP2DOpV6fPoahpC44/edit?usp=sharing) to put the answers on your team's slide.]

You can add any analysis to the bottom of the `proposal.Rmd` document in your team's project repo.

---

## Looking ahead

- Review [Chapter 3 - Distribution Theory](https://bookdown.org/roback/bookdown-BeyondMLR/ch-distthry.html)
  - Use this chapter as a reference throughout the semester

- For next time
  - [Chapter 4 - Poisson Regression](https://bookdown.org/roback/bookdown-BeyondMLR/ch-poissonreg.html)

---

## Acknowledgements

These slides are based on content in [BMLR Chapter 2 - Beyond Least Squares: Using Likelihoods](https://bookdown.org/roback/bookdown-BeyondMLR/ch-beyondmost.html).