For this project you and your team will be reading and evaluating an article that uses multilevel or longitudinal modeling in their analysis.

This project will focus on reading and evaluating published research (rather than reproducing results as in mini-project 01). The goals of the project are to (1) explore how multilevel models are used in research, (2) understand how statistical results are presented in academic research.

Team assignments

You will work in teams of 3 - 4 students for this project. Click here for your team assignment.

Before getting stared on the proposal:

  • Come up with a fun team name.
  • Come up with a plan to communicate and work together outside of lab.
  • Come up with a plan for remote work if some team members are unable to attend lab or other team meetings in person.
  • Click here to submit your new team name.

Workflow

  • Project Week 01 (week of Mon, Feb 21): Select article and submit proposal.

  • Project Week 02 (week of Mon, Feb 28): Read article and complete article evaluation.

  • Project Week 03 (week of Mon, Mar 14): Work on draft reports and presentations

  • Project Week 04 (week of Mon, Mar 21): Peer review. Finalize reports and presentations.

  • Project Week 05 (week of Mon, Mar 28): Presentations and submit report.

(Note: Spring Break week of Mar 07)

Due dates

All work for the project will be submitted on GitHub.

  • Proposal: due Fri, Feb 25 at 11:59pm

  • Article evaluation: due Fri, Mar 04 at 11:59pm

  • Draft for peer review: due Thu, Mar 24 at 12pm (noon)

  • Peer reviews: due Thu, Mar 24 at 11:59pm

  • Presentation: due Mon, Mar 28 at 3:30pm

  • Written report: due Mon, Mar 28 at 11:59pm

Article

The article for this project must be published in an academic journal. Please ask a member of the teaching team if you are unsure whether the article is published in an academic journal. The article must incorporate the use of one or more models for analyzing correlated data (i.e., data with a multilevel structure) in the analysis. The model used in the paper does not have to be one we discuss in class. I’d encourage you to explore articles that use modeling beyond the scope of the class!

There does not have to be an accompanying data, as there is no required data analysis component to this project.

Use the terms “multilevel model” or “longitudinal model” along with any other topic key words to help find articles with relevant models. Below are a few useful places to search for articles:

See the Tips on finding articles section for tips on searching databases and Example articles section for articles.

Proposal

The proposal should include the following:

  • The citation for the article. If you’re using a .bib file you can use the default citation format in R Markdown (Chicago author-date format). Otherwise, use MLA format.

  • Brief summary about why you chose this article.

  • Brief summary of the article’s primary research objective.

  • A description of the data analyzed in the article. Include

    • A description of the observational units at each level. (Note: Most articles will have level-one and level-two observational units, but some may have more levels to the data structure.)
    • A description of the response variable.
    • A description of within-group variability.
    • A description of the fixed and random effects.

You are only required to write the proposal for one article. Write the proposal in the file proposal.Rmd, then push the .Rmd and knitted PDF to the GitHub repo by the due date for submission.

Grading criteria

The proposal will be graded based on the following:

  • All required components of the proposal are included and accurate (8 pts)

  • All team members have contributed (2 pts)

Article evaluation

The purpose of the article evaluation is for you to begin describing and evaluating the statistical analysis and argument in the article. Write your responses to the following questions in article-evaluation.Rmd. The anticipated length is about 1 - 2 pages and should be no more than 4 pages. There is no minimum page requirement, as long as each section is comprehensively addressed.

It is due on GitHub by Friday, March 04 at 11:59pm.

  • Audience and purpose
    • Who is the primary audience for this article, i.e., for what type of readers are the authors writing?
    • What is the general purpose of the article, e.g., to persuade the reader to do something, to prove something, to inform the reader, etc.?
  • Exploratory data analysis
    • Describe the type of visualizations, tables, and descriptive statistics used to explore and summarize the data. At this point, you don’t need to interpret the EDA, but instead describe what is done for the EDA.
    • What new visualizations, tables, or statistics might you add to the article? Briefly explain.
  • Multilevel Model (See BMLR Sections 8.4 - 8.5 for more details on mutlilevel models.)
    • What is the response variable, and what is its distribution?
    • Write the Level One, Level Two, and Level Three (if applicable) statistical models in mathematical notation. Note: The statistical model is the model with the population parameters the authors want to estimate, not the equation with the estimated coefficients.
    • Write the composite model in mathematical notation.
    • What are the coefficients for the fixed effects? What does each coefficient represent?
    • What are the error terms? What does each error term represent?

Grading criteria

Point for the Article Evaluation are as follows

Section Total points
Audience and purpose 3
Exploratory data analysis 3
Multilevel model 4
Total 10

Each section will be graded for completeness and accuracy. The accuracy for the Mutlilevel model section will be graded taking into account what we have covered in class thus far.

Draft + peer review

Draft

The draft of the final report is due on Thu, Mar 24 at 12pm (noon). You should write the draft in the writeup.Rmd document.

At a minimum, the draft should incorporate the feedback from the article evaluation.

Peer Review

Each team will review the drafts of two other teams. You will work on the peer review during lab on Thu, Mar 24 and it is due no later than 11:59pm that day. When you log into GitHub, you will have read access to the two repos you’re reviewing.

Click here for the peer review assignments.

You should discuss the peer review as a team, but only one team member needs to submit the review on GitHub. Every team member should contribute to the discussion and the team’s responses to the peer review questions.

You will submit the peer review as an Issue in each team’s repo. To do so:

  • Go to the team’s repo and click Issues.
  • Click New issue.
  • You will see a template that says “Peer review”. Click Get started, and it will open a new issue.
  • Add your team name and the names of the team members who worked on that review. Then, type your responses under each question header.

Grading criteria (10 pts)

The draft and peer review will be graded based on the following:

  • Draft is comprehensive and at a minimum incorporates the feedback from the article evaluation. (5 pts)

  • Peer review thoroughly addresses the questions in the template. The feedback is comprehensive and accurate. ( 5 pts)

    • Note that writing a comprehensive review does not necessarily mean the review needs to be lengthy. The objective is to provide feedback that is helpful to the team as they finalize their analysis.
    • Points for the peer review will be assigned individually based on the team members who contributed to the respective review.

Write up

The final write up is due on Mon, Mar 28 at 11:59pm on GitHub. The anticipated length is about 5 pages.

Introduction

Briefly summarize the article, the research objective and purpose, and key conclusions. Also include a description of the data used for the analysis.

Methods

Describe the multilevel model used for the analysis. Describe the response variable and and its distribution. Describe the fixed and random effects. Write the Level One, Level Two, Level Three (if applicable), and composite statistical models using mathematical notation.

Results

Interpret the results form the model. Write the interpretations / conclusions from the fixed effects. Describe the meaning of the error terms and variance components. Use the estimated coefficients for each component as able. If the article does not include estimates for some or all of the estimated fixed effects, variance components, or error terms, you can write general interpretations using the appropriate mathematical symbol (e.g., \(\hat{\beta}_1\)) in place of the estimated value.

Communication

The objective of the final section of the written report is to assess the authors’ argument and communication. Reading and identifying how others communicate statistical results is a key way to develop your statistical writing skills. This section will include an assessment of the following:

  • Audience: Describe the primary audience for the article.

  • Methods: Consider the detail in the data and methods sections. What aspects of the analysis are mentioned in detail? What aspects are mentioned without detail? How does the level of detail correlate to the statistical background of the primary audience?

  • Graphs and figures: How are the graphs, figures, and tables used to support the findings? How are they used in the exploratory data analysis? How how are they used to assess or support modeling results? Would additional graphs, figures, or tables be helpful? If so, what kind?

    • Identify one key graph. Where is it located in the article? What message does it convey with respect to the objective and conclusion of the study? If there are no graphs in the article, describe one key graph you would include and how it would be used in the article (e.g., support conclusions, provide clarity, etc.).
  • Limitations: Are there limitations or difficulties in generalizing beyond the data? How are these limitations noted, if at all? Do you have any other concerns about the study?

  • Impact: According to the author, how does the study advance knowledge in the field? Taking into account the year the article was published, do the author’s claims seem adequately justified, overblown, or unduly cautious?

You can use these questions as a guide to shape the narrative. This section should still be written in narrative form, not as a list of questions and answers.

The questions in this section are adapted from Communicating with Data: The Art of Writing for Data Science by Deborah Nolan and Sara Stoudt. Click here for more details about the questions and an example on Sakai. You can borrow a copy of the book from Duke Libraries.

Grading criteria

Each section will assessed on whether the components of the section are clearly, comprehensively, and accurately discussed in the report. The point allocation is as follows:

  • Introduction: 5 pts
  • Methods: 10 pts
  • Results: 8 pts
  • Communication: 8 pts

The report will also be assessed based on the following:

  • Formatting: 2 pts
    • This is an assessment of the overall presentation and formatting of the written report. This includes neatly formatted text and tables, appropriate labels on figures, suppressing all code and extraneous output, properly rendered LaTex, etc.
  • Reproducibility: 2 pts
    • This is an assessment of the reproducibility of the report. Is the PDF produced by knitting the .Rmd document?

Presentation

You will present on Mon, Mar 28 during lecture. Each team will have 6 minutes for the presentation along with a few minutes for questions, and every team member should speak about an equal amount of time during the presentation.

You can make the presentation slides using the software of your choice. You can use as many slide as you wish, just be mindful of what can reasonably be presentation in 6 minutes. A suggested outline is

  • 1 slide to introduce article
  • 1 - 2 slides to describe the model
  • 1 - 2 slides for key interpretations and results
  • 1 slide for key highlights about the communication and writing (e.g., what the authors did particularly well or areas of improvement)

You will be assigned two presentations to peer review. You must submit the peer review scores for both presentations to have the “Peers” scores for your team’s presentation included in your presentation grade.

The presentation order and peer review assignments will be given closer to the presentation date.

The presentation order is as follows:

  1. PAMN
  2. Degenerate Distributions
  3. JARK
  4. MLT 
  5. Integrals
  6. ggplot3
  7. JAVA

Grading criteria - Teaching Team (16 pts)

This portion of the grade will the average of the scores from the members of the teaching team.

  • Time management (2 pts)
    • Was the time reasonably divided among team members? Was the presentation within the time limit?
  • Professionalism (2 pts)
    • Was the team prepared for the presentation? Did each team member have a meaningful contribution to the presentation?
  • Teamwork (2 pts)
    • Did the team present a unified story?
  • Slides (4 pts)
    • Are the slides well organized, readable, not full of text, featuring figures with legible labels, legends, etc.?
  • Content (6 pts): The content is presented in a clear and accurate way. This includes clearly and accurately describing
    • the primary research objective and intended audience for the article.
    • the data used in the analysis.
    • the observations and units at every level.
    • the model and primary conclusions
    • the effectiveness of tables, figures, and graphs in the article and argument of the contribution of the work (optional).

Grading criteria - Peers (4 pts)

This portion of the grade will the average of the scores from the peer reviewers. Click here for peer review assignments.

  • Introduction (1 pt)
    • Did the team clearly describe the primary research objective and intended audience for the article?
  • Data (1 pt)
    • Did the team clearly describe the data used in the analysis?
  • Model (1 pt)
    • Did the team clearly describe the observational units and variables at each level? Did they clearly describe the model?
  • Slides (1 pt)
    • Are the slides well organized, readable, not full of text, featuring figures with legible labels, legends, etc.?

GitHub repo organization

You should have the following files and folders in the project repo. The repo and brief summary in the README should be updated by Mon, Mar 28 at 11:59pm.

  • README.md: 3 - 5 sentence summary of the project

  • /proposal: Folder for project proposal

    • /proposal/proposal.Rmd: R Markdown file for proposal
    • /proposal/proposal.pdf: Knitted PDF of proposal
  • /article-evaluation/: Folder for article evaluation

    • /article-evaluation/article-evaluation.Rmd: R Markdown file for article evaluation
    • /article-evaluation/article-evaluation.pdf: Knitted PDF of article evaluation
  • /writeup/: Folder for write up

    • /writeup/writeup.Rmd: R Markdown file for write up
    • /writeup/writeup.pdf: Knitted PDF of write up
  • /presentation: Folder for presentation

    • /presentation/*: Presentation file (if not linked in README)
    • /presentation/README.md: Link to project (if not in presentation folder)

Optional - /data/: The data set - /data/*: File containing data set - /data/README.md: Codebook for data set.

Grading (100 points)

Component Points
Proposal 10 pts
Article evaluation 10 pts
Draft + Peer review 10 pts
Written report 35 pts
Presentation 20 pts
Organization 5 pts
Teamwork evaluation 10 pts

Tips on finding articles

Below are tips to help you find articles based on information from Jodi Psoter, the Librarian for Chemistry and Statistical Science at Duke Libraries.

PubMed

Articles in health-related fields

The PubMed heading tree lets you search by topic. The link will direct you to the results under the category of “Statistics as a Topic”.

  1. Click on the model / analysis of interest.
  2. Click “Add to search builder” under the PubMed Search Builder in the top right corner. You should now see the model/analysis type you chose in the search box.
  3. Click “Search PubMed”, and a page of search results will appear.
  4. There are options to narrow your results on the left-hand side. Under Article Attributes, check “Associated Data”, to limit the results to articles with data sets available.

You can use the other search options to narrow down results based on your team’s interests.

PsycInfo

Articles in psychology

PsycInfo will allow users to search by analysis type.

  1. Put the name of the model in the search bar, e.g., “Poisson Regression”. Then, in the drop down menu next to the search bar, select “DE Subjects [exact]”. Click Search.

  2. On the left-hand side, under Limit To, check “Open Access”. This will not guarantee the article has an associated data set, but a lot of open access articles will make the data available or utilize publicly accessible data you could pull from another source.

Web of Science

Articles on all topics

Web of Science Data Citation Index lets you search for data sets based on the topic of interest.

  1. Use the search bar to search based on a topic of interest. You can also search for the model / analysis type.

  2. On the left-hand side, check “Data Set” under Content Type and check “Dataset” under Data Types. Click “Refine” to limit the results.

3.Click on the article of interest.

  1. Click the DOI link in the article metadata. The article should have a data availability statement or something similar with information on accessing the data.

Example articles

Below are example articles you can use for the project. You are welcome to (but not required to) to use one of these articles. Up to two teams may use a given article for the project.

Acknowledgements

  • Grading criteria and the repo organization for this project were adapted from Project 1 on vizdata.org.
  • Some questions for the Article Evaluation adapted from “How to Evaluate Journal Articles”.
  • Questions in the “Communication” section of the Written Report are adapted from Communicating with Data: The Art of Writing for Data Science by Deborah Nolan and Sara Stoudt.