Homework 04

due Fri, April 01 at 11:59pm

Instructions

Exercises

  1. Brown and Uyar (2004) describe “A Hierarchical Linear Model Approach for Assessing the Effects of House and Neighborhood Characteristics on Housing Prices.” (From BMLR Sec 8.13.1, Ex 1 - 3.)

    a. Give the observational units at Level One and Level Two based on the title of the paper.

    b. Why can’t we assume all houses in the data set are independent? What would be the potential implications to our analysis of assuming independence among houses?

    c. Suppose we have the following set of predictors: square footage, age of house, rating of neighborhood schools, and median neighborhood housing price.

    • Write the two-level model for predicting housing prices. Write the full model such that (1) the Level Two predictors are used to estimate the intercept and slope for each Level One predictor, and (2) there are random effects for the slopes and intercepts.
    • Write the corresponding composite model.
    • Write the model parameters (fixed effects and variance components) that must be estimated.
  2. Why is Model A in Section 8.6.2 sometimes called the “unconditional means model”? Why is it also sometimes called the “random intercepts model”? Are these two labels consistent with each other? Briefly explain. (From BMLR Sec 8.13.1, Ex 8.)

  3. In Table 8.3, the standard errors associated with estimated coefficients under independence are lower than standard errors under alternative analysis methods. Briefly explain why that is often the case. (From BMLR Sec 8.13.1, Ex 7.)


Use the following prompt for Exercises 4 - 6.

One response to emergency department overcrowding is “ambulance diversion,” in which a hospital closes its doors and forces ambulances to bring patients to alternative hospitals. The California Office of Statewide Health Planning and Development collected data on how often hospitals enacted “diversion status”, enabling researchers to investigate factors associated with increasing amounts of ambulance diversion. An award-winning student project (Fisher, Murney, and Radtke 2019) examined a data set (ambulance3.csv) which contains information from 184 California hospitals over a 3-year period (2013-2015). The codebook for key variables is available in the data folder of your GitHub repo. (From BMLR Sec 9.9.2, Ex 2.)

  4. a. Create two spaghetti plots that illustrate diversion hours over time: one faceted by EMS level, and another faceted by number of stations (divided into “high” (stations > 23) and “low”). Describe terms that might be worth testing in a model based on these plots.

    b. Fit and display the unconditional growth model.

    c.  Interpret \(\hat{\alpha}_0\) in the context of the data.

    d.  Interpret \(\hat{\sigma}_v\) in the context of the data.
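
As a notation reminder, and assuming the exercise follows the Chapter 9 conventions of Beyond Multiple Linear Regression (an assumption; the symbols below are not defined in this assignment), the unconditional growth model for hospital \(i\) in year \(j\) might be written as follows, with year2013 taken to be the time variable used in the model code later in this assignment:

\[
\begin{aligned}
\text{Level One: } & Y_{ij} = a_i + b_i \, \text{year2013}_{ij} + \epsilon_{ij} \\
\text{Level Two: } & a_i = \alpha_0 + u_i \\
                   & b_i = \beta_0 + v_i
\end{aligned}
\]

where \(\epsilon_{ij} \sim N(0, \sigma^2)\) and \((u_i, v_i)\) follow a bivariate normal distribution with mean zero, variances \(\sigma_u^2\) and \(\sigma_v^2\), and covariance \(\sigma_{uv}\). Under this sketch, \(\hat{\alpha}_0\) and \(\hat{\sigma}_v\) in parts (c) and (d) are the estimates of \(\alpha_0\) and \(\sigma_v\).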

  5. We wish to compare the models \(D\) and \(D0\) below.

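library(lme4)  # provides lmer()

# Model D: random intercept and random slope for year2013, grouped by id
modelD <- lmer(diverthours ~ year2013 + ems_basic +
  (year2013 | id), data = ambulance3)

# Model D0: random intercept only, grouped by id
modelD0 <- lmer(diverthours ~ year2013 + ems_basic +
  (1 | id), data = ambulance3)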

a. Write out null and alternative hypotheses in terms of model parameters.

b. State a conclusion based on a likelihood ratio test using the \(\chi^2\) distribution.

c.  State a conclusion based on a likelihood ratio test using a parametric bootstrap.

d.  Why might we consider using a parametric bootstrap p-value rather than a likelihood ratio test p-value calculated from the \(\chi^2\) distribution?
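
For reference, a generic form of the likelihood ratio statistic used in parts (b) and (c) is sketched below. The symbols \(\ell_{\text{full}}\), \(\ell_{\text{reduced}}\), and \(k\) are introduced here for illustration only and do not appear elsewhere in the assignment:

\[
\text{LRT} = 2\left[\ell_{\text{full}} - \ell_{\text{reduced}}\right]
\]

Here \(\ell_{\text{full}}\) and \(\ell_{\text{reduced}}\) are the maximized log-likelihoods of the larger model and the model nested within it. The \(\chi^2\) approach compares this statistic to a \(\chi^2_k\) distribution, where \(k\) is the difference in the number of estimated parameters, as an approximation that holds under the null hypothesis in large samples. The parametric bootstrap approximates the null distribution of the same statistic by simulating responses from the fitted reduced model and refitting both models to each simulated data set.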

  6. Consider the code for Model E below:
modelE <- lmer(diverthours ~ year2013 + ems_basic +
  ems_basic:year2013 + (year2013 | id), data = ambulance3)

a. Write the composite model in mathematical notation.

b. Interpret the coefficient of ems_basic:year2013 in the context of the data.

c.  Is the interaction term ems_basic:year2013 statistically significant? Briefly explain your response showing any relevant code, output, or calculations to support your response.

Submission

Before you wrap up the assignment, make sure all documents are updated in your GitHub repo.

To submit your assignment:

The PDF must be submitted to Gradescope by the deadline to be considered on time.

Grading

Component                Points
Ex 1                     8
Ex 2                     3
Ex 3                     3
Ex 4                     12
Ex 5                     10
Ex 6                     12
Workflow & formatting    2
Total                    50

The “Workflow & formatting” grade is based on the organization of the assignment write-up and on the reproducible workflow. This includes an organized write-up with neat and readable headers, code, and narrative, including properly rendered mathematical notation. It also includes a reproducible R Markdown document that can be knitted to reproduce the submitted PDF, and version control with multiple commits that have informative commit messages.

Acknowledgements

All exercises are pulled or adapted from Beyond Multiple Linear Regression.

References

Brown, Kenneth, and Bulent Uyar. 2004. “A Hierarchical Linear Model Approach for Assessing the Effects of House and Neighborhood Characteristics on Housing Prices.” Journal of Real Estate Practice and Education 7 (1): 15–24. http://aresjournals.org/doi/abs/10.5555/repe.7.1.f687057161743261.
Fisher, Lisa, Katie Murney, and Tyler Radtke. 2019. “Emergency Department Overcrowding and Factors That Contribute to Ambulance Diversion.”