tidy and kable functions can help!)
Brown and Uyar (2004) describe “A Hierarchical Linear Model Approach for Assessing the Effects of House and Neighborhood Characteristics on Housing Prices”. (From BMLR Sec 8.13.1, Ex 1-3.)
a. Give the observational units at Level One and Level Two based on the title of the paper.
b. Why can’t we assume all houses in the data set are independent? What would be the potential implications to our analysis of assuming independence among houses?
c. Suppose we have the following set of predictors: Square footage, age of house, rating of neighborhood schools, median neighborhood housing price
Why is Model A in Section 8.6.2 sometimes called the “unconditional means model”? Why is it also sometimes called the “random intercepts model”? Are these two labels consistent with each other? Briefly explain. (From BMLR Sec 8.13.1, Ex 8.)
In Table 8.3, the standard errors associated with estimated coefficients under independence are lower than the standard errors under alternative analysis methods. Briefly explain why that is often the case. (From BMLR Sec 8.13.1, Ex 7.)
Use the following prompt for Exercises 4 - 6.
One response to emergency department overcrowding is “ambulance
diversion”, in which a hospital closes its doors and forces ambulances to
bring patients to alternative hospitals. The California Office of
Statewide Health Planning and Development collected data on how often
hospitals enacted “diversion status”, enabling researchers to investigate
factors associated with increasing amounts of ambulance diversions. An
award-winning student project (Fisher, Murney, and Radtke 2019) examined
a data set (ambulance3.csv), which contains information from 184
California hospitals over a 3-year period (2013-2015). The codebook for
key variables is available in the data folder of your GitHub repo.
(From BMLR Sec 9.9.2, Ex 2.)
a. Create two spaghetti plots that illustrate diversion hours over time: one faceted by EMS level and another faceted by number of stations (divided into “high” (stations > 23) and “low”). Based on these plots, describe terms that might be worth testing in a model.
b. Fit and display the unconditional growth model.
c. Interpret \(\hat{\alpha}_0\) in the context of the data.
d. Interpret \(\hat{\sigma}_v\) in the context of the data.
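As a rough illustration of the mechanics behind the spaghetti plot and unconditional growth model asked for above, here is a minimal sketch on simulated data. It does not use ambulance3.csv: the data frame `sim`, its generating values, and the reading of `year2013` as years since 2013 are all assumptions made for this example; only the column names `diverthours`, `year2013`, `ems_basic`, and `id` are borrowed from the assignment. It assumes the lme4 and ggplot2 packages are installed.

```r
library(lme4)
library(ggplot2)

set.seed(310)

# Simulated stand-in for ambulance3 (NOT the real data): 60 hospitals x 3 years.
n_hosp <- 60
sim <- expand.grid(year2013 = 0:2, id = factor(seq_len(n_hosp)))
sim$ems_basic <- rep(rbinom(n_hosp, 1, 0.5), each = 3)  # hypothetical 0/1 indicator

h <- as.integer(sim$id)       # row-to-hospital index
u <- rnorm(n_hosp, 0, 200)    # hospital-specific intercept deviations
v <- rnorm(n_hosp, 0, 50)     # hospital-specific slope deviations
sim$diverthours <- 500 + u[h] + (40 + v[h]) * sim$year2013 +
  rnorm(nrow(sim), 0, 100)

# Spaghetti plot: one line per hospital, faceted by a Level Two covariate.
spaghetti <- ggplot(sim, aes(x = year2013, y = diverthours, group = id)) +
  geom_line(alpha = 0.4) +
  facet_wrap(~ ems_basic, labeller = label_both) +
  labs(x = "Years since 2013 (assumed coding)", y = "Diversion hours")

# Unconditional growth model: time is the only predictor, with a random
# intercept and a random slope for each hospital.
growth <- lmer(diverthours ~ year2013 + (year2013 | id), data = sim)
summary(growth)
```

Printing `spaghetti` draws the plot. The same pattern should carry over to the real data once `sim` is swapped for the data frame read from ambulance3.csv, though the actual variable coding should be checked against the codebook.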
We wish to compare the models \(D\) and \(D0\) below.
    modelD <- lmer(diverthours ~ year2013 + ems_basic +
                     (year2013 | id), data = ambulance3)

    modelD0 <- lmer(diverthours ~ year2013 + ems_basic +
                      (1 | id), data = ambulance3)
a. Write out null and alternative hypotheses in terms of model parameters.
b. State a conclusion based on a likelihood ratio test using the \(\chi^2\) distribution.
c. State a conclusion based on a likelihood ratio test using a parametric bootstrap.
d. Why might we consider using a parametric bootstrap p-value rather than a likelihood ratio test p-value calculated from the \(\chi^2\) distribution?
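The two testing approaches named above can be sketched as follows. This is only one possible way to carry them out, shown on simulated data rather than ambulance3.csv; the data frame `sim`, the helper `lr_stat`, and the choice of `n_boot = 100` are hypothetical names and values introduced for illustration. It assumes the lme4 package is installed and relies on its `anova()`, `simulate()`, and `refit()` methods for fitted models.

```r
library(lme4)

set.seed(310)

# Simulated stand-in for ambulance3 (NOT the real data): 60 hospitals x 3 years.
n_hosp <- 60
sim <- expand.grid(year2013 = 0:2, id = factor(seq_len(n_hosp)))
sim$ems_basic <- rep(rbinom(n_hosp, 1, 0.5), each = 3)
h <- as.integer(sim$id)
u <- rnorm(n_hosp, 0, 200)    # hospital-specific intercept deviations
v <- rnorm(n_hosp, 0, 50)     # hospital-specific slope deviations
sim$diverthours <- 500 + 100 * sim$ems_basic + u[h] +
  (40 + v[h]) * sim$year2013 + rnorm(nrow(sim), 0, 100)

# Full and reduced models share fixed effects and differ only in random effects.
full    <- lmer(diverthours ~ year2013 + ems_basic + (year2013 | id), data = sim)
reduced <- lmer(diverthours ~ year2013 + ems_basic + (1 | id), data = sim)

# Likelihood ratio test against a chi-square reference distribution.
# refit = FALSE keeps the REML fits, since the fixed effects are identical.
lrt <- anova(reduced, full, refit = FALSE)
print(lrt)

# Parametric bootstrap: simulate responses from the reduced (null) model,
# refit both models to each simulated response, and recompute the statistic.
lr_stat <- function(m0, m1) as.numeric(2 * (logLik(m1) - logLik(m0)))
observed <- lr_stat(reduced, full)

n_boot <- 100                 # kept small for speed; more draws give a steadier p-value
y_star <- simulate(reduced, nsim = n_boot)
boot_stats <- vapply(y_star, function(y) {
  suppressMessages(suppressWarnings(
    lr_stat(refit(reduced, y), refit(full, y))
  ))
}, numeric(1))

# Bootstrap p-value: share of simulated statistics at least as large as observed.
p_boot <- mean(boot_stats >= observed)
p_boot
```

Comparing `p_boot` with the p-value in the `lrt` table shows how the two reference distributions can differ for the same observed statistic.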
    modelE <- lmer(diverthours ~ year2013 + ems_basic +
                     ems_basic:year2013 + (year2013 | id), data = ambulance3)
a. Write the composite model in mathematical notation.
b. Interpret the coefficient of ems_basic:year2013 in the context of the data.
c. Is the interaction term ems_basic:year2013 statistically significant? Briefly explain your response, showing any relevant code, output, or calculations that support it.
Before you wrap up the assignment, make sure all documents are updated in your GitHub repo.
To submit your assignment:
Go to http://www.gradescope.com and click Log in in the top right corner.
Click School Credentials ➡️ Duke NetID and log in using your NetID credentials.
Click on your STA 310 course.
Click on the assignment, and you’ll be prompted to submit it.
Mark the pages associated with each exercise. All of the pages of your assignment should be associated with at least one question (i.e., should be “checked”).
Select the first page of your .PDF submission to be associated with the “Workflow & formatting” section.
The PDF must be submitted to Gradescope by the deadline to be considered on time.
Component | Points |
---|---|
Ex 1 | 8 |
Ex 2 | 3 |
Ex 3 | 3 |
Ex 4 | 12 |
Ex 5 | 10 |
Ex 6 | 12 |
Workflow & formatting | 2 |
Total | 50 |
The “Workflow & formatting” grade is based on the organization of the assignment write-up and on the reproducible workflow. This includes an organized write-up with neat and readable headers, code, and narrative, including properly rendered mathematical notation. It also includes a reproducible R Markdown document that can be knitted to reproduce the submitted PDF, and version control with multiple commits that have informative commit messages.
All exercises are pulled or adapted from Beyond Multiple Linear Regression.