Chair: Timo von Oertzen
As modern technology and new methods have enabled researchers to collect large, dense longitudinal and other multilevel data sets, the need for advanced methods to analyze such rich data has grown in parallel. Today, many modern techniques exist, but their dissemination lags remarkably behind; frequentist, discrete-time, or even time-regression methods still prevail, which developed as extensions of methods originally designed for small samples and one or two variables. This symposium aims to introduce some highlights of those modern methods and to show where quantitative psychology has new methods to offer for complex structured data sets, in particular for change over time.
Tree-Based Methods for Detecting Violations of Measurement Invariance in Psychological Assessments
Rudolf Debelak & Carolin Strobl
A common question in the application of psychological assessments is whether the psychometric characteristics of the items, such as their difficulty or their discrimination, remain invariant over all subgroups of the population.
Technically, this pertains to the question of whether any test items show differential item functioning (DIF). This talk presents an overview of a flexible toolbox of methods for detecting DIF in an Item Response Theory (IRT) framework by means of recursive partitioning. The proposed method tests the stability of the item parameters with regard to person covariates of any type (such as age, gender, or educational level). If an instability is detected, the method proposes subsamples for which the item parameters are stable. The method is freely available in R and can be easily applied to empirical data. Its application is presented for dichotomous and polytomous Rasch models, and current work in progress on generalizations to other IRT models, in particular the 2PL model and the Generalized Partial Credit Model, is discussed.
Variable Selection with Regularized Structural Equation Models
Andreas M. Brandmaier | Max Planck Institute for Human Development & Max Planck UCL Centre for Computational Psychiatry and Ageing Research
Psychological research has seen a rapid increase in the amount of collected data. With large numbers of variables in a study, researchers often want to complement hypothesis-driven approaches by exploring which variables are most informative in explaining observed variability in the outcomes of interest. A typical question is: “What is the set of most important variables for predicting the outcome of interest?” Regularization offers a principled approach to research questions involving variable selection by optimizing predictive accuracy while favoring parsimonious models. Regularized Structural Equation Models (SEM) take this approach to the latent level by incorporating the strengths of regularization into the multivariate SEM framework through a penalized likelihood function. This extension allows researchers to estimate sparse models and to implicitly solve large-scale variable selection problems in latent variable models.
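As a sketch of what a penalized likelihood function can look like, the block below writes the usual maximum likelihood fit function for SEM plus a lasso-type penalty. The abstract does not state which penalty or notation the talk uses, so the symbols here are assumptions: S is the sample covariance matrix of the p observed variables, Σ(θ) the model-implied covariance matrix, θ_pen the subset of parameters chosen for penalization, and λ ≥ 0 the tuning parameter.

```latex
\begin{align}
F_{\mathrm{ML}}(\theta) &= \log\lvert\Sigma(\theta)\rvert
  + \operatorname{tr}\!\bigl(S\,\Sigma(\theta)^{-1}\bigr)
  - \log\lvert S\rvert - p, \\
F_{\mathrm{reg}}(\theta) &= F_{\mathrm{ML}}(\theta)
  + \lambda \sum_{j \in \mathrm{pen}} \lvert\theta_j\rvert .
\end{align}
```

With λ = 0 the regularized fit function reduces to the ordinary ML fit function; larger values of λ shrink more of the penalized parameters to exactly zero, which is what performs the variable selection.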
How the Flat Prior is a Simple Bridge from Frequentist to Bayesian Methods
Timo von Oertzen
Bayesian techniques are great tools not only because everyone thinks they are fancy, but because their results are easier to understand than the double negations of frequentist results. However, since we all learned frequentist techniques and use them in a ritualized way, it’s difficult to replace them with Bayesian techniques. Choosing the best prior is especially difficult. In this presentation, we’ll show that a flat prior can help one move from being a frequentist to a Bayesian method user without creating substantial new interpretation difficulties, and how this can be done with very classical frequentist techniques.
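A minimal illustration of this bridge, not taken from the talk itself: for the mean of normally distributed data with known standard deviation, the posterior under a flat prior is proportional to the likelihood, so the 95% credible interval coincides numerically with the classical 95% confidence interval. The function name, the grid approximation, and the example data are all hypothetical choices made for this sketch.

```python
import math

def flat_prior_vs_confidence_interval(data, sigma, z=1.959964, grid_size=20001):
    """Compare a 95% confidence interval for a normal mean (known sigma)
    with the 95% credible interval obtained from a flat prior.

    Hypothetical helper for illustration; the posterior is approximated
    on a grid so that no closed-form Bayesian result is presupposed.
    """
    n = len(data)
    mean = sum(data) / n
    se = sigma / math.sqrt(n)

    # Classical frequentist interval: mean +/- z * standard error.
    confidence = (mean - z * se, mean + z * se)

    # Flat prior: posterior density is proportional to the likelihood.
    # As a function of mu, the likelihood is proportional to
    # exp(-n * (mu - mean)^2 / (2 * sigma^2)).
    lo, hi = mean - 8 * se, mean + 8 * se
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    weights = [math.exp(-n * (mu - mean) ** 2 / (2 * sigma ** 2)) for mu in grid]
    total = sum(weights)

    # Read the 2.5% and 97.5% posterior quantiles off the cumulative sum.
    cumulative = 0.0
    lower = upper = None
    for mu, w in zip(grid, weights):
        cumulative += w / total
        if lower is None and cumulative >= 0.025:
            lower = mu
        if upper is None and cumulative >= 0.975:
            upper = mu
            break
    return confidence, (lower, upper)

if __name__ == "__main__":
    sample = [4.1, 5.3, 6.0, 4.8, 5.5, 5.9, 4.4, 5.2]  # made-up data
    ci, cred = flat_prior_vs_confidence_interval(sample, sigma=1.0)
    print("confidence interval:", ci)
    print("credible interval:  ", cred)
```

The two intervals agree up to the grid resolution, but only the credible interval licenses the statement that the parameter lies in the interval with 95% probability given the data.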
Assessment of Differential Item Functioning and its Consequences in Large Scale Assessment
Carsten Szardenings, Anna Doebler, Philipp Doebler
Differential item functioning (DIF) means that an item’s response function depends on the population. Since observations alone do not identify the difficulty, the diagnosis of DIF requires knowledge about the differential levels of the latent variable between populations, which is usually obtained through anchor items. Recent approaches test for or aim to quantify DIF, or rather an analogous concept, at the test level, which eliminates the above requirement.
We elaborate on these approaches and investigate the consequences of varying levels and forms of DIF for analyses common in educational research. We analyze competence measures in the National Educational Panel Study (NEPS) regarding DIF at the test level and its consequences.
Learning from misclassified categorical data with nonresponse under prior ignorance: How can Imprecise Probability help?
Nonresponse is an unavoidable problem in survey research, no matter which survey tools are implemented or how efficiently the survey is administered. Remedies for nonresponse often depend on implicit assumptions regarding the response process. It is common practice to assume that data are missing at random or completely at random, although this assumption is not testable. In situations where the randomness of the response process is highly doubtful, there is a need to model how response is correlated with the phenomena under investigation. Typically, this modelling step imposes certain assumptions on the response process. A relatively new area of research known as “Imprecise Statistics” provides an assumption-free approach to handling nonresponse by moving from a point-identifiable solution to an interval-identifiable solution. This solution is further enhanced by utilizing available contextual knowledge. A similar approach can be followed to additionally tackle the problem of misclassification, which arises, for example, when using a single categorical item as a manifest indicator of a corresponding categorical latent construct. Both nonresponse and misclassification are sources of uncertainty about the underlying studied variable. Yet another uncertainty arises when no prior knowledge of the studied variable exists. Such prior ignorance is usually addressed by employing vague priors within the Bayesian framework. Here as well, Imprecise Statistics suggests using the Imprecise Dirichlet model as a replacement for the usual vague priors to overcome the deficiencies associated with them. In this talk, the benefits of these relatively new approaches are illustrated and discussed.
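The interval-valued idea can be sketched in a few lines. Under the Imprecise Dirichlet model with hyperparameter s, the posterior expected probability of category j after N observations with n_j in that category lies between n_j / (N + s) and (n_j + s) / (N + s). The second function widens these bounds for nonresponse by letting the missing cases fall entirely outside or entirely inside the category; this worst-case treatment is an assumption of this sketch and is not necessarily the method presented in the talk. Function names and example counts are hypothetical.

```python
def idm_bounds(counts, s=1.0):
    """Lower and upper posterior expected probabilities per category
    under the Imprecise Dirichlet model with hyperparameter s."""
    total = sum(counts.values())
    return {
        category: (n / (total + s), (n + s) / (total + s))
        for category, n in counts.items()
    }

def idm_bounds_with_nonresponse(counts, n_missing, s=1.0):
    """Widen the IDM bounds without assuming anything about why
    responses are missing (illustrative worst-case treatment).

    Lower bound: none of the missing cases belong to the category.
    Upper bound: all of the missing cases belong to the category.
    """
    total = sum(counts.values()) + n_missing
    return {
        category: (n / (total + s), (n + n_missing + s) / (total + s))
        for category, n in counts.items()
    }

if __name__ == "__main__":
    observed = {"yes": 30, "no": 50}  # made-up survey counts
    print(idm_bounds(observed, s=2.0))
    print(idm_bounds_with_nonresponse(observed, n_missing=20, s=2.0))
```

The width of each interval reflects both sources of uncertainty: it shrinks as the sample grows relative to s, and it grows with the share of nonrespondents.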
Bayesian Multilevel Analysis Using Flat Priors
With the increasing accessibility of suitable and user-friendly software, multilevel analyses have become more common in recent years. Given that nested data structures are present in a wide range of situations, this is a desirable development, as it depicts dependencies within a dataset more accurately. What is still lacking in this field is an easy way to compute Bayesian statistics within a multilevel model. One such option is to use a flat prior. The present work describes how a flat prior can be used within a multilevel model in two ways. First, flat priors are applied to multilevel analyses using a visual model representation based on the program Onyx. Second, flat priors are applied to multilevel analyses using an equation-based representation based on the lme4 package for R. This second option requires a sufficient number of observations to yield reliable results. Concrete guidelines, based on multiple simulations, are given for the use of Bayesian multilevel analysis with smaller datasets. The PISA dataset is used as a sample application of these two methods for using flat priors as a means to conduct Bayesian multilevel analysis.