For this project you and your team will be reading and evaluating an
article that uses multilevel or longitudinal modeling in its
analysis.
This project will focus on reading and evaluating published research
(rather than reproducing results as in mini-project 01). The goals of
the project are to (1) explore how multilevel models are used in
research and (2) understand how statistical results are presented in
academic research.
Team assignments
You will work in teams of 3 - 4 students for this project. Click
here for your team assignment.
Before getting started on the proposal:
- Come up with a fun team name.
- Come up with a plan to communicate and work together outside of
lab.
- Come up with a plan for remote work if some team members are unable
to attend lab or other team meetings in person.
- Click here to
submit your new team name.
Workflow
Project Week 01 (week of Mon, Feb 21): Select
article and submit proposal.
Project Week 02 (week of Mon, Feb 28): Read
article and complete article evaluation.
Project Week 03 (week of Mon, Mar 14): Work on
draft reports and presentations
Project Week 04 (week of Mon, Mar 21): Peer
review. Finalize reports and presentations.
Project Week 05 (week of Mon, Mar 28):
Presentations and submit report.
(Note: Spring Break week of Mar 07)
Due dates
All work for the project will be submitted on GitHub.
Proposal: due Fri, Feb 25 at 11:59pm
Article evaluation: due Fri, Mar 04 at
11:59pm
Draft for peer review: due Thu, Mar 24 at 12pm
(noon)
Peer reviews: due Thu, Mar 24 at
11:59pm
Presentation: due Mon, Mar 28 at 3:30pm
Written report: due Mon, Mar 28 at
11:59pm
Article
The article for this project must be
published in an academic journal. Please ask a member of the teaching
team if you are unsure whether the article is published in an academic
journal. The article must incorporate the use of one or more models for
analyzing correlated data (i.e., data with a multilevel structure) in
the analysis. The model used in the paper does not have to be one we
discuss in class. I’d encourage you to explore articles that use
modeling beyond the scope of the class!
There does not have to be an accompanying data set, as
there is no required data analysis component to this project.
Use the terms “multilevel model” or “longitudinal model” along with
any other topic key words to help find articles with relevant models.
Below are a few useful places to search for articles:
See the Tips on finding
articles section for tips on searching databases and Example articles section for articles.
Proposal
The proposal should include the following:
The citation for the article. If you’re using a
.bib
file you can use the default citation format in R
Markdown (Chicago author-date format). Otherwise, use MLA
format.
Brief summary about why you chose this article.
Brief summary of the article’s primary research
objective.
A description of the data analyzed in the article. Include:
- A description of the observational units at each level. (Note:
Most articles will have level-one and level-two observational units, but
some may have more levels to the data structure.)
- A description of the response variable.
- A description of within-group variability.
- A description of the fixed and random effects.
You are only required to write the proposal for one article. Write
the proposal in the file proposal.Rmd
, then push the .Rmd
and knitted PDF to the GitHub repo by the due date for submission.
Grading criteria
The proposal will be graded based on the following:
Article evaluation
The purpose of the article evaluation is for you to begin describing
and evaluating the statistical analysis and argument in the article.
Write your responses to the following questions in
article-evaluation.Rmd
. The anticipated length is about 1 -
2 pages and should be no more than 4 pages. There is no minimum page
requirement, as long as each section is comprehensively addressed.
It is due on GitHub by Friday, March 04 at
11:59pm.
- Audience and purpose
- Who is the primary audience for this article, i.e., for what type of
readers are the authors writing?
- What is the general purpose of the article, e.g., to persuade the
reader to do something, to prove something, to inform the reader,
etc.?
- Exploratory data analysis
- Describe the type of visualizations, tables, and descriptive
statistics used to explore and summarize the data. At this point, you
don’t need to interpret the EDA, but instead describe what is
done for the EDA.
- What new visualizations, tables, or statistics might you add to the
article? Briefly explain.
- Multilevel Model (See BMLR
Sections 8.4 - 8.5 for more details on multilevel models.)
- What is the response variable, and what is its distribution?
- Write the Level One, Level Two, and Level Three (if applicable)
statistical models in mathematical notation. Note: The statistical
model is the model with the population parameters the authors want to
estimate, not the equation with the estimated
coefficients.
- Write the composite model in mathematical notation.
- What are the coefficients for the fixed effects? What does each
coefficient represent?
- What are the error terms? What does each error term represent?
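As a rough illustration of the notation asked for above, here is a generic two-level model with a random intercept and a random slope. It is only a sketch: the Level One covariate \(X_{ij}\), the Level Two covariate \(Z_i\), and the normality assumptions on the error terms are hypothetical placeholders, not taken from any particular article, and the model in your article may have a different structure or response distribution.

```latex
% Level One (observation j within group i)
Y_{ij} = a_i + b_i X_{ij} + \epsilon_{ij},
\qquad \epsilon_{ij} \sim N(0, \sigma^2)

% Level Two (group i)
a_i = \alpha_0 + \alpha_1 Z_i + u_i
b_i = \beta_0 + \beta_1 Z_i + v_i

\begin{pmatrix} u_i \\ v_i \end{pmatrix}
\sim N\left(
  \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} \sigma_u^2 & \sigma_{uv} \\ \sigma_{uv} & \sigma_v^2 \end{pmatrix}
\right)

% Composite model (substitute Level Two into Level One)
Y_{ij} = \alpha_0 + \alpha_1 Z_i + \beta_0 X_{ij} + \beta_1 Z_i X_{ij}
         + u_i + v_i X_{ij} + \epsilon_{ij}
```

In this sketch the fixed effects are \(\alpha_0, \alpha_1, \beta_0, \beta_1\), and the error terms are \(\epsilon_{ij}\) (within-group variation) and \(u_i, v_i\) (group-to-group variation in intercepts and slopes). Note that these are population parameters, not estimated coefficients.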
Grading criteria
Points for the article evaluation are as follows:
| Section | Points |
|---|---|
| Audience and purpose | 3 |
| Exploratory data analysis | 3 |
| Multilevel model | 4 |
| Total | 10 |
Each section will be graded for completeness and accuracy. The
accuracy for the Multilevel model section will be graded taking
into account what we have covered in class thus far.
Draft + peer review
Draft
The draft of the final report is due on Thu, Mar 24 at 12pm
(noon). You should write the draft in the
writeup.Rmd
document.
At a minimum, the draft should incorporate the feedback from the
article evaluation.
Peer Review
Each team will review the drafts of two other teams. You will work on
the peer review during lab on Thu, Mar 24 and it is due
no later than 11:59pm that day. When you log into GitHub, you will have
read access to the two repos you’re reviewing.
Click
here for the peer review assignments.
You should discuss the peer review as a team, but only one team
member needs to submit the review on GitHub. Every team member should
contribute to the discussion and the team’s responses to the peer review
questions.
You will submit the peer review as an Issue in each team’s repo. To
do so:
- Go to the team’s repo and click Issues.
- Click New issue.
- You will see a template that says “Peer review”. Click Get
started, and it will open a new issue.
- Add your team name and the names of the team members who worked on
that review. Then, type your responses under each question header.
Grading criteria (10 pts)
The draft and peer review will be graded based on the following:
Draft is comprehensive and at a minimum incorporates the feedback
from the article evaluation. (5 pts)
Peer review thoroughly addresses the questions in the template.
The feedback is comprehensive and accurate. (5 pts)
- Note that writing a comprehensive review does not necessarily mean
the review needs to be lengthy. The objective is to provide feedback
that is helpful to the team as they finalize their analysis.
- Points for the peer review will be assigned individually based on
the team members who contributed to the respective review.
Write up
The final write up is due on Mon, Mar 28 at 11:59pm
on GitHub. The anticipated length is about 5 pages.
Introduction
Briefly summarize the article, the research objective and purpose,
and key conclusions. Also include a description of the data used for the
analysis.
Methods
Describe the multilevel model used for the analysis. Describe the
response variable and its distribution. Describe the fixed and
random effects. Write the Level One, Level Two, Level Three (if
applicable), and composite statistical models using mathematical
notation.
Results
Interpret the results from the model. Write the interpretations /
conclusions from the fixed effects. Describe the meaning of the error
terms and variance components. Use the estimated coefficients for each
component where available. If the article does not include estimates for some or
all of the estimated fixed effects, variance components, or error terms,
you can write general interpretations using the appropriate mathematical
symbol (e.g., \(\hat{\beta}_1\)) in
place of the estimated value.
Communication
The objective of the final section of the written report is to assess
the authors’ argument and communication. Reading and identifying how
others communicate statistical results is a key way to develop your
statistical writing skills. This section will include an assessment of
the following:
Audience: Describe the primary audience for the
article.
Methods: Consider the detail in the data and
methods sections. What aspects of the analysis are mentioned in detail?
What aspects are mentioned without detail? How does the level of detail
correlate to the statistical background of the primary
audience?
Graphs and figures: How are the graphs, figures,
and tables used to support the findings? How are they used in the
exploratory data analysis? How are they used to assess or support
modeling results? Would additional graphs, figures, or tables be
helpful? If so, what kind?
- Identify one key graph. Where is it located in the article? What
message does it convey with respect to the objective and conclusion of
the study? If there are no graphs in the article, describe one key graph
you would include and how it would be used in the article (e.g., support
conclusions, provide clarity, etc.).
Limitations: Are there limitations or
difficulties in generalizing beyond the data? How are these limitations
noted, if at all? Do you have any other concerns about the
study?
Impact: According to the authors, how does the
study advance knowledge in the field? Taking into account the year the
article was published, do the authors’ claims seem adequately justified,
overblown, or unduly cautious?
You can use these questions as a guide to shape the narrative. This
section should still be written in narrative form, not as a list of
questions and answers.
The questions in this section are adapted from Communicating with
Data: The Art of Writing for Data Science by Deborah Nolan and Sara
Stoudt. Click
here for more details about the questions and an example on Sakai.
You can borrow a copy of the book from Duke Libraries.
Grading criteria
Each section will be assessed on whether the components of the section
are clearly, comprehensively, and accurately discussed in the report.
The point allocation is as follows:
- Introduction: 5 pts
- Methods: 10 pts
- Results: 8 pts
- Communication: 8 pts
The report will also be assessed based on the following:
- Formatting: 2 pts
- This is an assessment of the overall presentation and formatting of
the written report. This includes neatly formatted text and tables,
appropriate labels on figures, suppressing all code and extraneous
output, properly rendered LaTeX, etc.
- Reproducibility: 2 pts
- This is an assessment of the reproducibility of the report. Is the
PDF produced by knitting the .Rmd document?
Presentation
You will present on Mon, Mar 28 during lecture. Each
team will have 6 minutes for the presentation along with a few minutes
for questions, and every team member should speak about an equal amount
of time during the presentation.
You can make the presentation slides using the software of your
choice. You can use as many slides as you wish, but be mindful of what
can reasonably be presented in 6 minutes. A suggested outline is:
- 1 slide to introduce article
- 1 - 2 slides to describe the model
- 1 - 2 slides for key interpretations and results
- 1 slide for key highlights about the communication and writing
(e.g., what the authors did particularly well or areas of
improvement)
You will be assigned two presentations to peer review. You must
submit the peer review scores for both presentations to have the “Peers”
scores for your team’s presentation included in your presentation
grade.
The presentation order and peer review assignments will be given
closer to the presentation date.
The presentation order is as follows:
- PAMN
- Degenerate Distributions
- JARK
- MLT
- Integrals
- ggplot3
- JAVA
Grading criteria - Teaching Team (16 pts)
This portion of the grade will be the average of the scores from the
members of the teaching team.
- Time management (2 pts)
- Was the time reasonably divided among team members? Was the
presentation within the time limit?
- Professionalism (2 pts)
- Was the team prepared for the presentation? Did each team member
have a meaningful contribution to the presentation?
- Teamwork (2 pts)
- Did the team present a unified story?
- Slides (4 pts)
- Are the slides well organized, readable, not full of text, featuring
figures with legible labels, legends, etc.?
- Content (6 pts): The content is presented in a
clear and accurate way. This includes clearly and accurately describing
- the primary research objective and intended audience for the
article.
- the data used in the analysis.
- the observations and units at every level.
- the model and primary conclusions
- the effectiveness of tables, figures, and graphs in the article and
argument of the contribution of the work (optional).
Grading criteria - Peers (4 pts)
This portion of the grade will be the average of the scores from the
peer reviewers. Click
here for peer review assignments.
- Introduction (1 pt)
- Did the team clearly describe the primary research objective and
intended audience for the article?
- Data (1 pt)
- Did the team clearly describe the data used in the analysis?
- Model (1 pt)
- Did the team clearly describe the observational units and variables
at each level? Did they clearly describe the model?
- Slides (1 pt)
- Are the slides well organized, readable, not full of text, featuring
figures with legible labels, legends, etc.?
GitHub repo organization
You should have the following files and folders in the project repo.
The repo and brief summary in the README should be updated by
Mon, Mar 28 at 11:59pm.
README.md
: 3 - 5 sentence summary of the
project
/proposal
: Folder for project proposal
/proposal/proposal.Rmd
: R Markdown file for
proposal
/proposal/proposal.pdf
: Knitted PDF of proposal
/article-evaluation/
: Folder for article
evaluation
/article-evaluation/article-evaluation.Rmd
: R Markdown
file for article evaluation
/article-evaluation/article-evaluation.pdf
: Knitted PDF
of article evaluation
/writeup/
: Folder for write up
/writeup/writeup.Rmd
: R Markdown file for write up
/writeup/writeup.pdf
: Knitted PDF of write up
/presentation
: Folder for presentation
/presentation/*
: Presentation file (if not linked in
README)
/presentation/README.md
: Link to project (if not in
presentation folder)
Optional:
/data/
: The data set
/data/*
: File containing data set
/data/README.md
: Codebook for data set
Grading (100 points)
| Component | Points |
|---|---|
| Proposal | 10 pts |
| Article evaluation | 10 pts |
| Draft + Peer review | 10 pts |
| Written report | 35 pts |
| Presentation | 20 pts |
| Organization | 5 pts |
| Teamwork evaluation | 10 pts |
Tips on finding articles
Below are tips to help you find articles based on information from Jodi
Psoter, the Librarian for Chemistry and Statistical Science at Duke
Libraries.
PubMed
Articles in health-related fields
The PubMed
heading tree lets you search by topic. The link will direct you to
the results under the category of “Statistics as a Topic”.
- Click on the model / analysis of interest.
- Click “Add to search builder” under the PubMed Search
Builder in the top right corner. You should now see the
model/analysis type you chose in the search box.
- Click “Search PubMed”, and a page of search results will
appear.
- There are options to narrow your results on the left-hand side.
Under Article Attributes, check “Associated Data”, to limit the
results to articles with data sets available.
You can use the other search options to narrow down results based on
your team’s interests.
PsycInfo
Articles in psychology
PsycInfo
will allow users to search by analysis type.
Put the name of the model in the search bar, e.g., “Poisson
Regression”. Then, in the drop down menu next to the search bar, select
“DE Subjects [exact]”. Click Search.
On the left-hand side, under Limit To, check “Open
Access”. This will not guarantee the article has an associated data set,
but a lot of open access articles will make the data available or
utilize publicly accessible data you could pull from another
source.
Web of Science
Articles on all topics
Web
of Science Data Citation Index lets you search for data sets based
on the topic of interest.
1. Use the search bar to search based on a topic of interest. You
can also search for the model / analysis type.
2. On the left-hand side, check “Data Set” under Content
Type and check “Dataset” under Data Types. Click “Refine”
to limit the results.
3. Click on the article of interest.
4. Click the DOI link in the article metadata. The article
should have a data availability statement or something similar with
information on accessing the data.
Example articles
Below are example articles you can use for the project. You are
welcome (but not required) to use one of these articles. Up to two
teams may use a given article for the project.
Acknowledgements
- Grading criteria and the repo organization for this project were
adapted from Project 1 on vizdata.org.
- Some questions for the Article Evaluation were adapted from “How to
Evaluate Journal Articles”.
- Questions in the “Communication” section of the Written
Report are adapted from Communicating with Data: The Art of
Writing for Data Science by Deborah Nolan and Sara Stoudt.