Research Projects



The Cognitive Diagnostic – Computerized Adaptive Testing for AP Statistics (AP-CAT) project is a five-year project funded by the National Science Foundation. The primary goal of this project is to create a CD-CAT system designed specifically to help students taking AP Statistics learn and master the course material. Beyond preparing students for their AP exam at the end of the school year, this project will assess the utility of the CD-CAT to determine whether it is effective in increasing student engagement in statistics.

For more information, you can visit the project’s page.


Intelligent Diagnostic Assessment Platform (i-DAP) for High School Statistics Education

The i-DAP includes a series of assignments developed by content experts, each of which measures a number of key learning attributes that are mapped to the “statistics and probability” standards in the Common Core State Standards (CCSS). Students will receive formative feedback on their mastery of these attributes after completing an assignment. The i-DAP will also provide personalized learning materials that directly address their deficiencies. Teachers will have access to all individual diagnostic reports and group-level reports. Through these features, the i-DAP will help improve student engagement with statistics and in turn improve student learning of statistics.

For more information, you can visit the project's page.

Computational Social Science Research Experience for Undergraduates (REU) 

The Computational Social Science REU program is a program in which students work collaboratively with expert mentors and select from a wide variety of computational social science projects at the University of Notre Dame. Computational social science as an approach to analyzing the social world has been growing rapidly. An increasing number of social interactions are taking place in the virtual world, using social media, mobile phones, and other electronic means. The digital traces of such interactions and the greater availability and detail of CSS data sets (e.g., surveys, census data, historical records) yield an exponential growth in data available for analysis. New cyberinfrastructure tools and methodologies for data analytics are needed to capitalize on this resource and enhance American economic competitiveness. This REU training environment will develop multidisciplinary social scientists with the appropriate expertise to answer the computational social science data growth challenges and opportunities.

For more information, you can look at previous projects here.

Statistical Quality Control of Low-Stakes Assessment Data

This research project will develop statistical methods to monitor and control the quality of low-stakes assessment and questionnaire data. Many assessment tests and surveys given by researchers in the social and behavioral sciences are perceived as low stakes by participants, yet researchers rely heavily on such data to address their research questions. Data from low-stakes assessments may contain a substantial portion of inattentive responses, driven by carelessness or fatigue from participants, or sometimes from malicious attempts by survey-bots. The increasing popularity of online platforms for participant recruitment and data collection further exacerbates the problem. Through this project we will develop statistical methods to detect inattentive responses, benchmark the performance of these methods, and identify the best method for different types of inattentiveness. 
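As a rough illustration of what a screening index for inattentive responding can look like, the sketch below computes a "longstring" index, the longest run of identical consecutive answers, which is one commonly used index in the careless-responding literature. It is not taken from the project; the function names, the example data, and the threshold are all hypothetical.

```python
from itertools import groupby


def longstring(responses):
    """Length of the longest run of identical consecutive responses."""
    return max((len(list(run)) for _, run in groupby(responses)), default=0)


def flag_inattentive(response_matrix, threshold):
    """Indices of respondents whose longest run meets or exceeds the threshold."""
    return [i for i, row in enumerate(response_matrix)
            if longstring(row) >= threshold]


# Hypothetical 5-point Likert responses from three respondents (illustration only)
data = [
    [1, 3, 2, 4, 5, 2, 3, 1],
    [3, 3, 3, 3, 3, 3, 3, 3],  # the same answer to every item
    [2, 2, 4, 4, 4, 1, 5, 5],
]

# The threshold is an arbitrary choice for this example, not a recommended cutoff
flagged = flag_inattentive(data, threshold=6)
```

An index like this only targets one type of inattentiveness (straight-lining), which is one reason comparing several methods across types of inattentive behavior matters.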

For more information about this project, please see here.

Testing Experiences of Students and Instructors during COVID-19

Amidst the COVID-19 pandemic, students and educators struggled to adapt to online and remote instruction, with assessment being critical yet overlooked. While some high-stakes tests were canceled, others (e.g., AP exams) were administered remotely. It is now important to identify the major challenges that undergraduate students and college instructors in the U.S. faced in the administration of assessments during this time. In particular, this project aims to support an understanding of whether experiences with testing differed based on student background factors (i.e., gender, racial/ethnic minority status, SES). Findings from the project may ultimately inform best practices to prepare for a possible future involving online and remote assessment.

A Systematic Review and Meta-analysis Examining Changes in Student Engagement during COVID-19

Test results from a nationally representative sample suggest a steep decline in student learning during and after the COVID-19 pandemic (NAEP, 2022). Though many factors likely contributed to the decline in students’ academic performance, decreased student engagement appears to be one significant contributing factor affecting college students (NSSE, 2021). We wanted to understand how student engagement changed during the pandemic by synthesizing relevant research from around the world. We also intended to identify student factors (e.g., K-12 vs. post-secondary) and context factors (e.g., country/region) contributing to changes in engagement.

We conducted a systematic review and meta-analysis, searching multiple databases (ProQuest, EBSCO, ERIC, PsycINFO, Google Scholar) for relevant publications that documented students’ retrospective reports of their changes in engagement. We have so far screened 1,552 articles, leaving a handful of studies (~20) with effect sizes (~100) eligible for inclusion. Effect sizes have been scaled and normalized with mean values on a 0 to 1 scale (0 = no engagement, 1 = full engagement). In addition to effect sizes, we have also extracted information for moderation analyses, such as the age of participants, country, and measures and types of engagement. Effect sizes reflect affective, behavioral, cognitive, social, and general engagement.
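One simple way to place means from different response scales on a common 0 to 1 metric is min-max rescaling by the scale endpoints. The sketch below assumes that approach; the exact procedure used in the project is not specified here, and the function names and example values are hypothetical.

```python
def rescale_to_unit(value, scale_min, scale_max):
    """Map a mean on [scale_min, scale_max] onto [0, 1] by min-max rescaling."""
    if scale_max <= scale_min:
        raise ValueError("scale_max must be greater than scale_min")
    return (value - scale_min) / (scale_max - scale_min)


def rescale_spread(sd, scale_min, scale_max):
    """Rescale a standard deviation by the width of the response scale."""
    if scale_max <= scale_min:
        raise ValueError("scale_max must be greater than scale_min")
    return sd / (scale_max - scale_min)


# Hypothetical study: engagement mean 3.4 (SD 0.8) on a 1-5 response scale
unit_mean = rescale_to_unit(3.4, 1, 5)
unit_sd = rescale_spread(0.8, 1, 5)
```

Under this assumption, 0 corresponds to the lowest possible scale score and 1 to the highest, so means from a 1-5 scale and a 1-7 scale become directly comparable.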

Preliminary meta-analyses using robust variance estimation to account for the clustering of effect sizes within studies revealed a significant decrease in engagement following the onset of the pandemic, suggesting a large effect. This effect did not appear to differ significantly based on the age of the samples, country, or type of engagement.
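The idea behind robust variance estimation can be sketched for the simplest case, a weighted pooled mean with effect sizes clustered in studies. The code below uses the basic cluster-robust "sandwich" formula without the small-sample adjustments that are often recommended when the number of studies is small. It is an illustration under those assumptions, not the analysis code used in the project; the weights and effect sizes are made up.

```python
from collections import defaultdict
from math import sqrt


def rve_pooled_mean(effects):
    """Weighted pooled mean with a cluster-robust standard error.

    effects: list of (study_id, effect_size, weight) tuples.
    """
    total_weight = sum(w for _, _, w in effects)
    pooled = sum(w * y for _, y, w in effects) / total_weight

    # Sum the weighted residuals within each study (cluster), so that
    # dependence among effect sizes from the same study is respected
    cluster_sums = defaultdict(float)
    for study, y, w in effects:
        cluster_sums[study] += w * (y - pooled)

    robust_var = sum(s ** 2 for s in cluster_sums.values()) / total_weight ** 2
    return pooled, sqrt(robust_var)


# Hypothetical changes in engagement on the 0-1 scale, two per study
example = [
    ("A", -0.2, 1.0), ("A", -0.4, 1.0),
    ("B", -0.1, 1.0), ("B", -0.3, 1.0),
]
pooled, se = rve_pooled_mean(example)
```

Because residuals are summed within a study before being squared, effect sizes from the same study are not treated as independent pieces of evidence.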

Understanding trends in student engagement, a malleable factor, has implications for supporting learning. Our findings suggest a significant decrease in students’ engagement due to the pandemic, which did not vary based on age, country, or type of engagement. Such findings could be used to inform interventions to provide better instructional support in online and/or remote environments.

Project members: Emily Kane, Chessley B. Blacklock, Qizhou Duan, Teresa M. Ober, Ying Cheng

BiblioViz: A Platform to Examine Disparities in Scientific Research

Bibliometric analysis has been applied in various disciplines. It is particularly useful for recognizing scientific progress in an emerging area where publications are voluminous and fragmented. Although relevant research has been conducted, previous studies leave two limitations unaddressed. The first relates to accessibility and flexibility. Current platforms such as CiteSpace (Chen, 2006) have managed to incorporate various visualization and analytic techniques in one platform for detecting emerging scientific trends. However, CiteSpace is not open source, which limits both its accessibility and its potential to be scaled up into a more versatile platform. The second limitation relates to fairness and equity. Most current research (Aria & Cuccurullo, 2017; Rose & Kitchin, 2019; Donthu et al., 2021) focuses merely on automatically retrieving articles from certain databases and presenting the research trends in a specific field. Few studies evaluate disparities in research, such as imbalances across regions, institutions, and researcher backgrounds (e.g., gender), in order to provide actionable insights and policy recommendations for closing research gaps. Situated in the field of learning analytics, this project therefore aims to achieve the following goals:

  1. Implementing an open-source, interactive visual bibliometric analysis platform that can be used across research areas.
  2. Exploring statistical approaches for evaluating scientific research progress in various aspects to identify the research gaps.
  3. Investigating actionable policy recommendations for the research in the field.

Project Members: Bo Pei, Ying Cheng, Thomas Joyce

Gap Analysis: Investigating Student Performance Gaps in Fundamental Courses to Promote Academic Thriving

Learning performance in fundamental courses has been recognized as among the most significant determinants of students’ future academic thriving. In particular, the COVID-19 pandemic has changed students’ learning formats, patterns, and preferences. Three questions remain underexplored: (1) whether the traditional instructional pedagogies in fundamental courses can meet the current adaptive requirements of learning, (2) which factors most influence changes in students’ learning performance, and (3) how to provide instructional interventions in a more targeted, timely, and trustworthy manner.

In our project, focusing on student learning data from fundamental courses offered in multiple departments (i.e., Economics, Mathematics, Chemistry and Biochemistry) at the University of Notre Dame, we will build and adopt machine learning approaches to identify the key factors correlated with student learning performance. Drawing on visualization techniques, we will also explore approaches that support instructors in providing contextualized and trustworthy interventions for students. Ultimately, we will build a holistic and robust analytical framework for modeling, predicting, identifying, and supporting student learning at the fine-grained topic level.

Project Members: Bo Pei, Ying Cheng, Thomas Joyce, G. Alex Ambrose, John Behrens, Eva Dziadula, Brian Mulholland, and Shawn Miller

Sequential Item Response Models for Multiple-Choice, Multiple-Attempt Test Items (SIRT-MM)

SIRT-MM models are statistical models for multiple-attempt, or answer-until-correct (AUC), procedures, which allow test takers to keep submitting answers to a multiple-choice question until they give the correct response. This approach offers two benefits: 1) it gathers more information from test takers, which improves the accuracy of the results, and 2) it provides feedback to test takers, supporting their learning process. SIRT-MM models can be applied to various testing scenarios, including cognitive diagnostic assessments and computerized adaptive testing.
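To make the sequential idea concrete, the sketch below uses a generic sequential (continuation-ratio) logistic formulation: at each attempt the test taker succeeds with a probability that depends on ability and a step difficulty, given that all earlier attempts failed. This is an assumption for illustration and may differ from the actual SIRT-MM parameterization; the function names, ability value, and step difficulties are hypothetical.

```python
from math import exp


def step_success(theta, difficulty):
    """Logistic probability of answering correctly at one attempt."""
    return 1.0 / (1.0 + exp(-(theta - difficulty)))


def attempt_distribution(theta, step_difficulties):
    """Probability that the first correct response occurs on each attempt.

    With K step difficulties the item allows K + 1 attempts: under
    answer-until-correct, the final attempt is certain to be correct
    because only one option remains.
    """
    probs = []
    still_wrong = 1.0  # probability that every earlier attempt was wrong
    for difficulty in step_difficulties:
        p = step_success(theta, difficulty)
        probs.append(still_wrong * p)
        still_wrong *= 1.0 - p
    probs.append(still_wrong)
    return probs


# Hypothetical four-option item: three uncertain attempts, then a forced correct one
dist = attempt_distribution(theta=0.0, step_difficulties=[0.0, 0.0, 0.0])
```

The number of attempts a test taker needs therefore carries graded information about ability, rather than the single right-or-wrong outcome of a one-attempt item.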

Project Members: Yikai Lu, Alison Cheng