Psychometrics is concerned with theory and techniques of psychological measurement. Psychometricians have also worked collaboratively with those in the field of statistics and quantitative methods to develop improved ways to organize, analyze, and scale corresponding data. Since much functionality is already contained in base R and there is considerable overlap between tools for psychometry and tools described in other views, particularly in SocialSciences, we only give a brief overview of packages that are closely related to psychometric methodology.
Please let me know if I have omitted something of importance, or if a new package or function should be mentioned here.
Item Response Theory (IRT):
- The eRm package fits extended Rasch models, i.e. the ordinary Rasch model for dichotomous data (RM), the linear logistic test model (LLTM), the rating scale model (RSM) and its linear extension (LRSM), the partial credit model (PCM) and its linear extension (LPCM) using conditional ML estimation. Missing values are allowed.
- The package ltm also fits the simple RM. Additionally, functions for estimating Birnbaum's 2- and 3-parameter models based on a marginal ML approach are implemented as well as the graded response model for polytomous data, and the linear multidimensional logistic model.
- The mirt package fits unidimensional and multidimensional latent trait models to dichotomous and polytomous response data under the IRT paradigm. Exploratory and confirmatory models can be estimated with quadrature (EM) or stochastic (MHRM) methods. Confirmatory bi-factor and two-tier analyses are available for modeling item testlets. Multiple group analysis and mixed effects designs are also available for detecting differential item functioning and modeling item and person covariates.
- TAM fits unidimensional and multidimensional item response models and also includes multifaceted models, latent regression models and options for drawing plausible values.
- PLmixed fits (generalized) linear mixed models (GLMM) with factor structures.
- MLCIRTwithin provides a flexible framework for the estimation of discrete two-tier IRT models for the analysis of dichotomous and ordinal polytomous item responses.
- IRTShiny provides an interactive shiny application for IRT analysis.
- Some additional uni- and multidimensional item response models (especially for locally dependent item responses) and some exploratory methods (DETECT, LSDM, model-based reliability) are included in sirt.
- The pcIRT package estimates the multidimensional polytomous Rasch model and Mueller's continuous rating scale model.
- Thurstonian IRT models can be fitted with the kcirt package.
- MultiLCIRT estimates IRT models under (1) multidimensionality assumption, (2) discreteness of latent traits, (3) binary and ordinal polytomous items.
- Conditional maximum likelihood estimation via the EM algorithm and information-criterion-based model selection in binary mixed Rasch models are implemented in the mRm package and the psychomix package. The mixRasch package estimates mixture Rasch models, including the dichotomous Rasch model, the rating scale model, and the partial credit model.
- The PP package estimates person parameters (MLE, WLE, MAP, EAP, ROBUST) for the 1-, 2-, 3-, and 4-PL models and the GPCM (generalized partial credit model). The parameters are estimated under the assumption that the item parameters are known and fixed. The package is useful, e.g., when items from an item pool/item bank with known item parameters are administered to a new population of test-takers and an ability estimate is needed for every test-taker.
- The equateIRT package computes direct, chain and average (bisector) equating coefficients with standard errors using Item Response Theory (IRT) methods for dichotomous items.
- kequate implements the kernel method of test equating using the CB, EG, SG, NEAT CE/PSE and NEC designs, supporting gaussian, logistic and uniform kernels and unsmoothed and pre-smoothed input data.
- SNSequate provides several methods for test equating. Besides traditional approaches (mean-mean, mean-sigma, Haebara and Stocking-Lord IRT, etc.), it supports methods such as local equating, kernel equating (using Gaussian, logistic and uniform kernels), and IRT parameter linking methods based on asymmetric item characteristic functions, including functions for obtaining standard errors.
- The EstCRM package calibrates the parameters for Samejima's Continuous IRT Model via EM algorithm and Maximum Likelihood. It can compute item fit residual statistics, draw empirical and theoretical 3D item category response curves, and generate data under the CRM for simulation studies.
- The difR package contains several traditional methods to detect DIF in dichotomously scored items. Both uniform and non-uniform DIF effects can be detected, with methods relying upon item response models or not. Some methods deal with more than one focal group.
- The package lordif provides a logistic regression framework for detecting various types of differential item functioning (DIF).
- DIFlasso implements a penalty approach to differential item functioning in Rasch models. It can handle settings with multiple (metric) covariates.
- A set of functions to perform Raju, van der Linden and Fleer's (1995) differential item and test functioning analyses is implemented in the DFIT package. It includes functions that use the Monte Carlo item parameter replication (IPR) approach to obtain the cut-off points for the associated statistical significance tests.
- The difNLR package uses nonlinear regression to estimate DIF.
- The catR package allows for computerized adaptive testing using IRT methods.
- The mirtCAT package provides tools to generate an HTML interface for creating adaptive and non-adaptive educational and psychological tests using the shiny package. Suitable for applying unidimensional and multidimensional computerized adaptive tests using IRT methodology and for creating simple questionnaires forms to collect response data directly in R.
- xxIRT implements methods related to IRT and computer-based testing.
- The package plRasch computes maximum likelihood estimates and pseudo-likelihood estimates of parameters of Rasch models for polytomous (or dichotomous) items and multiple (or single) latent traits. Robust standard errors for the pseudo-likelihood estimates are also computed.
- Explicit calculation (not estimation) of Rasch item parameters (dichotomous and polytomous) by means of a pairwise comparison approach can be done using the pairwise package.
- A multilevel Rasch model can be estimated using the packages lme4, nlme, and MCMCglmm with functions for mixed-effects models with crossed or partially crossed random effects. The ordinal package implements this approach for polytomous models. An infrastructure for estimating tree-structured item response models of the GLMM family using lme4 is provided in irtrees.
- Nonparametric IRT analysis can be computed by means of the mokken package. It includes an automated item selection algorithm and various checks of model assumptions. In relation to that, fwdmsa performs the Forward Search for Mokken scale analysis. It detects outliers and produces several types of diagnostic plots.
- The KernSmoothIRT package fits nonparametric item and option characteristic curves using kernel smoothing. It allows for optimal selection of the smoothing bandwidth using cross-validation and provides a variety of exploratory plotting tools.
- The RaschSampler package allows the construction of exact Rasch model tests by generating random zero-one matrices with given marginals.
- Statistical power simulation for testing the Rasch model based on a three-way ANOVA design with mixed classification can be carried out using pwrRasch.
- The irtProb package is designed to estimate multidimensional subject parameters (MLE and MAP) such as personal pseudo-guessing, personal fluctuation, personal inattention. These supplemental parameters can be used to assess person fit, to identify misfit type, to generate misfitting response patterns, or to make correction while estimating the proficiency level considering potential misfit at the same time.
- cacIRT computes classification accuracy and consistency under Item Response Theory. Implements total score and latent trait IRT methods as well as total score kernel-smoothed methods.
- The package irtoys provides a simple common interface to the estimation of item parameters in IRT models for binary responses with three different programs (ICL, BILOG-MG, and ltm), and a variety of functions useful with IRT models.
- The CDM package estimates several cognitive diagnosis models (DINA, DINO, GDINA, RRUM, LCDM, pGDINA, mcDINA), the general diagnostic model (GDM) and structured latent class analysis (SLCA).
- Gaussian ordination, related to logistic IRT and also approximated as maximum likelihood estimation through canonical correspondence analysis is implemented in various forms in the package VGAM.
- LNIRT can be used for log-normal response time IRT models.
- emIRT provides EM algorithms for various IRT models (binary and ordinal responses, along with dynamic and hierarchical models).
- immer implements some item response models for multiple ratings, including the hierarchical rater model and a wrapper function to the commercial FACETS program.
- An Rcpp based implementation of a variety of IRT models is provided by IRTpp.
- The latdiag package produces commands to drive the dot program from graphviz to produce a graph useful in deciding whether a set of binary items might have a latent scale with non-crossing ICCs.
- The purpose of the rpf package is to factor out logic and math common to IRT fitting, diagnostics, and analysis. It is envisioned as core support code suitable for more specialized IRT packages to build upon.
- The classify package can be used to examine classification accuracy and consistency under IRT models.
- WrightMap provides graphical tools for plotting item-person maps.
- irtDemo includes a collection of shiny applications to demonstrate or to explore fundamental IRT concepts.
- IRT utility functions described in the Baker/Kim book are included in birtr.
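To make the dichotomous models named in this section concrete, the following base R sketch evaluates the item characteristic curve of the 3-parameter logistic model. Fixing the guessing parameter at 0 gives the 2PL, and additionally fixing the discrimination at 1 gives the Rasch model. The function name icc_3pl and its argument names are hypothetical and are not taken from any of the packages listed above.

```r
# Hypothetical helper: probability of a correct response under the 3PL model.
# theta = person ability, a = discrimination, b = difficulty, g = guessing.
icc_3pl <- function(theta, a = 1, b = 0, g = 0) {
  g + (1 - g) * plogis(a * (theta - b))
}

theta <- seq(-3, 3, by = 1)

# Rasch item (a = 1, g = 0) and a 3PL item with the same difficulty.
p_rasch <- icc_3pl(theta, b = 0.5)
p_3pl   <- icc_3pl(theta, a = 1.7, b = 0.5, g = 0.2)

print(round(cbind(theta, p_rasch, p_3pl), 3))
```

At theta equal to the difficulty b, the Rasch curve passes through 0.5, while the 3PL curve passes through g + (1 - g) / 2. The packages above estimate such parameters from data; this sketch only evaluates the curve for given values.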
Correspondence Analysis (CA), Optimal Scaling:
- The package ca comprises two parts, one for simple correspondence analysis and one for multiple and joint correspondence analysis.
- Simple and canonical CA are provided by the package anacor, including confidence ellipsoids. It allows for different scaling methods such as standard scaling, Benzecri scaling, centroid scaling, and Goodman scaling.
- A GUI (Windows only) that allows the user to construct interactive Biplots is offered by the package BiplotGUI.
- Homogeneity analysis aka multiple CA and various Gifi extensions can be computed by means of the homals package. Hull plots, span plots, Voronoi plots, star plots, projection plots and many others can be produced.
- Simple and multiple correspondence analysis can be performed using corresp() and mca() in package MASS.
- The package ade4 contains an extensive set of functions covering, e.g., principal components, simple and multiple, fuzzy, non symmetric, and decentered correspondence analysis. Additional functionality is provided at Bioconductor in the package made4.
- The package cocorresp fits predictive and symmetric co-correspondence analysis (CoCA) models to relate one data matrix to another data matrix.
- Apart from several factor analytic methods FactoMineR performs CA including supplementary row and/or column points and multiple correspondence analysis (MCA) with supplementary individuals, supplementary quantitative variables and supplementary qualitative variables.
- Package vegan supports all basic ordination methods, including non-metric multidimensional scaling. The constrained ordination methods include constrained analysis of proximities, redundancy analysis, and constrained (canonical) and partially constrained correspondence analysis.
- cabootcrs computes bootstrap confidence regions for CA.
- cncaGUI implements a GUI with which users can construct and interact with canonical (non-symmetrical) CA.
- SVD based multivariate exploratory methods such as PCA, CA, MCA (as well as a Hellinger form of CA), generalized PCA are implemented in ExPosition. The package also allows for supplementary data projection.
- cds can be used for constrained dual scaling for detecting response styles.
- CAvariants provides six variants of two-way CA: simple, singly ordered, doubly ordered, non-symmetrical, singly ordered non-symmetrical, and doubly ordered non-symmetrical.
- MCAvariants provides MCA and ordered MCA via orthogonal polynomials.
- Specific and class specific MCA on survey-like data can be fitted using soc.ca.
- optiscale provides tools for performing an optimal scaling transformation on a data vector.
- A general framework of optimal scaling methods is implemented in the aspect package.
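As a rough illustration of what the simple CA routines in this section compute, the sketch below carries out simple correspondence analysis by hand with a singular value decomposition in base R. The choice of the built-in HairEyeColor data (collapsed over sex) is an assumption made for illustration; the dedicated packages add scaling options, plots, and supplementary points on top of this core step.

```r
# Two-way contingency table: hair colour by eye colour (datasets package).
N <- unclass(margin.table(HairEyeColor, c(1, 2)))

P  <- N / sum(N)          # correspondence matrix
r  <- rowSums(P)          # row masses
cm <- colSums(P)          # column masses

# Standardized residuals from the independence model.
S <- diag(1 / sqrt(r)) %*% (P - outer(r, cm)) %*% diag(1 / sqrt(cm))

sv <- svd(S)
inertia <- sum(sv$d^2)    # total inertia

# Row principal coordinates on the first two dimensions.
row_coords <- (diag(1 / sqrt(r)) %*% sv$u %*% diag(sv$d))[, 1:2]
rownames(row_coords) <- rownames(N)

print(round(row_coords, 3))
print(round(sv$d^2 / inertia, 3))   # share of inertia per dimension
```

The total inertia equals the Pearson chi-squared statistic of the table divided by the sample size, which gives a quick check of the decomposition.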
Factor Analysis (FA), Principal Component Analysis (PCA):
- Exploratory FA is available in the package stats as function factanal() and in psych as fa() and fa.poly() (ordinal data).
- esaBcv estimates the number of latent factors and factor matrix.
- SparseFactorAnalysis scales count and binary data with sparse FA.
- EFAutilities computes robust standard errors and factor correlations under a variety of conditions.
- faoutlier implements influential case detection methods for FA and SEM.
- The package psych includes functions such as VSS() for estimating the appropriate number of factors/components as well as ICLUST() for item clustering.
- PCA can be fitted with prcomp() (based on svd(), preferred) as well as princomp() (based on eigen() for compatibility with S-PLUS). Additional rotation methods for FA based on gradient projection algorithms can be found in the package GPArotation. The package nFactors produces a non-graphical solution to the Cattell scree test. Some graphical PCA representations can be found in the psy package. paran implements Horn's test of principal components/factors.
- FA and PCA with supplementary individuals and supplementary quantitative/qualitative variables can be performed using the FactoMineR package whereas MCMCpack has some options for sampling from the posterior for ordinal and mixed factor models.
- The homals package provides nonlinear PCA (aka categorical PCA) and, by defining sets, nonlinear canonical correlation analysis (models of the Gifi-family).
- nsprcomp and elasticnet fit sparse PCA.
- Threeway PCA models (Tucker, Parafac/Candecomp) can be fitted using PTAk, ThreeWay, and multiway.
- Independent component analysis (ICA) can be computed using fastICA, ica, eegkit (designed for EEG data), and AnalyzeFMRI (designed for fMRI data).
- A desired number of robust principal components can be computed with the pcaPP package.
- bpca implements 2D and 3D biplots of multivariate data based on PCA and diagnostic tools of the quality of the reduction.
- missMDA provides imputation of incomplete continuous or categorical datasets in principal component analysis (PCA), multiple correspondence analysis (MCA) model, or multiple factor analysis (MFA) model.
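As a minimal sketch of the two base R routes to PCA mentioned in this section, the following compares prcomp() and princomp() on the built-in USArrests data. The dataset and the decision to standardize the variables are assumptions made for illustration only.

```r
# USArrests ships with base R (datasets package).
X <- USArrests

# prcomp(): SVD of the centered and scaled data matrix.
p1 <- prcomp(X, center = TRUE, scale. = TRUE)

# princomp(): eigen decomposition of the correlation matrix.
p2 <- princomp(X, cor = TRUE)

# Proportion of variance explained by each component.
pve1 <- p1$sdev^2 / sum(p1$sdev^2)
pve2 <- unname(p2$sdev^2 / sum(p2$sdev^2))

print(round(pve1, 3))
print(all.equal(pve1, pve2))
```

When both functions work from standardized variables, the proportions of explained variance should agree up to numerical tolerance; the signs of the loadings may differ between the two, which is not a substantive difference.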
Structural Equation Models (SEM):
- The package lavaan can be used to estimate a large variety of multivariate statistical models, including path analysis, confirmatory factor analysis, structural equation modeling and growth curve models. It includes the lavaan model syntax, which allows users to express their models in a compact way, and allows for ML, GLS, WLS, robust ML using Satorra-Bentler corrections, and FIML for data with missing values. It fully supports mean structures and multiple groups and reports standardized solutions, fit measures, modification indices and more as output.
- The OpenMx package allows estimation of a wide variety of advanced multivariate statistical models. It consists of a library of functions and optimizers that allow you to quickly and flexibly define an SEM model and estimate parameters given observed data.
- The sem package fits general (i.e., latent-variable) SEMs by FIML, and structural equations in observed-variable models by 2SLS. Categorical variables in SEMs can be accommodated via the polycor package.
- The lavaan.survey package allows for complex survey structural equation modeling (SEM). It fits structural equation models (SEM) including factor analysis, multivariate regression models with latent variables and many other latent variable models while correcting estimates, standard errors, and chi-square-derived fit measures for a complex sampling design. It incorporates clustering, stratification, sampling weights, and finite population corrections into a SEM analysis.
- The nlsem package fits nonlinear structural equation mixture models using the EM algorithm. Three different approaches are implemented: LMS (Latent Moderated Structural Equations), SEMM (Structural Equation Mixture Models), and NSEMM (Nonlinear Structural Equations Mixture Models).
- A collection of functions for conducting meta-analysis using a structural equation modeling (SEM) approach via OpenMx is provided by the metaSEM package.
- A general implementation of a computational framework for latent variable models (including structural equation models) is given in lava. The lava.tobit package generalizes the framework to censored and dichotomous variables via a probit link formulation.
- The pls package can be used for partial least-squares estimation. The package semPLS fits structural equation models using partial least squares (PLS). The PLS approach is referred to as a soft-modeling technique requiring no distributional assumptions on the observed data. PLS methods with emphasis on structural equation models with latent variables are given in plspm, which also includes pathmox as a companion package with approaches of segmentation trees in PLS path modeling.
- simsem is a package designed to aid in Monte Carlo simulations using SEM (for methodological investigations, power analyses and much more).
- Sim.DiffProc provides a framework for parallelized Monte Carlo simulation-estimation in multidimensional continuous-time models, which have been implemented as SEM.
- semTools is a package of add on functions that can aid in fitting SEMs in R (for example one function automates imputing missing data, running imputed datasets and combining the results from these datasets).
- semPlot produces path diagrams and visual analysis for outputs of various SEM packages.
- plotSEMM graphs nonlinear relations among latent variables from structural equation mixture models.
- SEMModComp conducts tests of difference in fit for mean and covariance structure models as in SEM.
- semdiag and influence.SEM implement outlier, leverage, and case influence diagnostics for SEM, whereas semGOF provides SEM goodness-of-fit indexes.
- ctsem fits continuous time SEM using linear stochastic differential equations.
- gSEM conducts semi-supervised generalized SEM and piecewiseSEM fits piecewise SEM.
- rsem implements robust SEM with missing data and auxiliary variables.
- regsem performs regularization on SEM and sparseSEM implements sparse-aware ML for SEM.
- Recursive partitioning (SEM trees, SEM forests) is implemented in semtree.
- BigSEM constructs large systems of structural equations using a two-stage penalized least squares approach.
- Identifiability of linear SEM can be checked using SEMID.
- lsl conducts SEM via penalized likelihood (latent structure learning).
- MIIVsem contains functions for estimating structural equation models using instrumental variables.
- The systemfit package implements a wider variety of estimators for observed-variables models, including nonlinear simultaneous-equations models.
- An interface between the EQS software for SEM and R is provided by the REQS package.
- The MplusAutomation package automates latent variable model estimation and interpretation using Mplus.
Multidimensional Scaling (MDS):
- The smacof package provides many approaches to metric and nonmetric MDS, including extensions for MDS with external constraints, spherical MDS, asymmetric MDS, three-way MDS (INDSCAL/IDIOSCAL), Bentler-Weeks model, unidimensional scaling, Procrustes, inverse MDS.
- The stats package provides classical MDS via the cmdscale() function. Sammon mapping (sammon()) and non-metric MDS (isoMDS()) are available in MASS.
- Nonmetric MDS can also be computed with metaMDS() in vegan. Furthermore, labdsv and ecodist provide the function nmds() and some routines can be found in xgobi. Also, the ExPosition package implements a function for metric MDS.
- Principal coordinate analysis can be computed with capscale() in vegan, with pco() in labdsv and ecodist, and with dudi.pco() in ade4.
- INDSCAL is also implemented in the SensoMineR package.
- The package MLDS allows for the computation of maximum likelihood difference scaling (MLDS).
- DistatisR implements the DiSTATIS/CovSTATIS 3-way metric MDS approach.
- Symbolic MDS for interval-valued dissimilarities (hypersphere and hyperbox model) can be fitted with the smds package.
- SOD (Self-Organising-Deltoids) provides MDS by gradually reducing the dimensionality of an initial space.
- Supervised MDS is implemented in superMDS.
- munfold provides functions for metric unfolding.
- The asymmetry package implements the slide-vector model for asymmetric MDS.
- semds fits asymmetric and three-way MDS within an SEM framework.
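For readers who want to see classical MDS in action before turning to the specialized packages above, here is a small sketch using cmdscale() from stats. The eurodist data (road distances between European cities, shipped with base R) and the choice of two dimensions are assumptions made for illustration.

```r
# eurodist: a "dist" object with road distances between 21 European cities.
fit <- cmdscale(eurodist, k = 2, eig = TRUE)

coords <- fit$points      # 21 x 2 configuration

# How well do distances in the 2D configuration track the input distances?
r <- cor(as.vector(eurodist), as.vector(dist(coords)))

print(dim(coords))
print(round(fit$GOF, 3))  # goodness-of-fit measures returned by cmdscale()
print(round(r, 3))
```

The orientation of the resulting configuration is arbitrary, so axes may need to be reflected or rotated before the plot resembles a geographic map.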
Classical Test Theory (CTT):
- The CTT package can be used to perform a variety of tasks and analyses associated with classical test theory: score multiple-choice responses, perform reliability analyses, conduct item analyses, and transform scores onto different scales.
- Functions for correlation theory, meta-analysis (validity generalization), reliability, item analysis, inter-rater reliability, and classical utility are contained in the psychometric package.
- An interactive shiny application for CTT is provided by CTTShiny.
- The cocron package provides functions to statistically compare two or more alpha coefficients based on either dependent or independent groups of individuals.
- The CMC package calculates and plots the step-by-step Cronbach-Mesbah curve, a method based on the Cronbach alpha coefficient of reliability for checking the unidimensionality of a measurement scale.
- Cronbach alpha, kappa coefficients, and intra-class correlation coefficients (ICC) can be found in the psy package. Functions for ICC computation can also be found in the packages psych, psychometric, and ICC.
- A number of routines for scale construction and reliability analysis useful for personality and experimental psychology are contained in the package psych.
- subscore can be used for computing subscores in CTT and IRT.
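Since several packages in this section report Cronbach's alpha, the following base R sketch shows the underlying computation from item and total-score variances. The helper name cronbach_alpha and the simulated data are hypothetical and serve only as an illustration; the packages above offer confidence intervals, item-deleted statistics, and handling of missing values that this sketch omits.

```r
# Hypothetical helper, not taken from any of the packages listed above.
# alpha = k / (k - 1) * (1 - sum of item variances / variance of total score)
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)
  item_var  <- apply(items, 2, var)
  total_var <- var(rowSums(items))
  (k / (k - 1)) * (1 - sum(item_var) / total_var)
}

# Simulated example: five noisy indicators of one common trait.
set.seed(1)
theta <- rnorm(200)
items <- sapply(1:5, function(j) theta + rnorm(200))

alpha <- cronbach_alpha(items)
print(round(alpha, 2))
```

With complete data, the same value can be obtained from the item covariance matrix, because the variance of the total score equals the sum of all entries of that matrix.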
Knowledge Structure Analysis:
- DAKS provides functions and example datasets for the psychometric theory of knowledge spaces. This package implements data analysis methods and procedures for simulating data and transforming different formulations in knowledge space theory.
- The kst package contains basic functionality to generate, handle, and manipulate deterministic knowledge structures based on sets and relations. Functions for fitting probabilistic knowledge structures are included in the pks package.
Latent Class Analysis (LCA):
- LCA with random effects can be performed with the package randomLCA. In addition, the package e1071 provides the function
lca(). Another package is poLCA for polytomous variable latent class analysis. LCA can also be fitted using flexmix which optionally allows for the inclusion of concomitant variables and latent class regression.
- LCAvarsel implements variable selection for LCA.
- covLCA fits latent class models with covariate effects on underlying and measured variables.
- lcda fits latent class discriminant analysis.
- ClustVarLV clusters variables around latent variables.
Paired Comparisons:
- Bradley-Terry models for paired comparisons are implemented in the package BradleyTerry2 and in eba. The latter allows for the computation of elimination-by-aspects models.
- Recursive partitioning trees for Bradley-Terry models are implemented in psychotree.
- BTLLasso allows one to include subject-specific and object-specific covariates in paired comparison models and shrinks the effects using the Lasso.
- prefmod fits loglinear Bradley-Terry models (LLBT) and pattern models for paired comparisons, rankings, and ratings.
Bayesian Psychometrics:
- blavaan fits a variety of Bayesian latent variable models, including confirmatory factor analysis, structural equation models, and latent growth curve models.
- BayesFM computes Bayesian exploratory factor analysis. The number of factors is determined during MCMC sampling.
- Bayesian approaches for estimating item and person parameters by means of Gibbs-Sampling are included in MCMCpack. In addition, the pscl package allows for Bayesian IRT and roll call analysis.
- cIRT stands for choice IRT and jointly models the accuracy of cognitive responses and item choices within a Bayesian hierarchical framework.
- edstan provides convenience functions and preprogrammed Stan models related to IRT.
- fourPNO can be used for Bayesian 4-PL IRT estimation.
- Simulation-based Bayesian inference for IRT latent traits can be performed using ltbayes.
- BayesLCA implements Bayesian LCA.
Other Related Packages:
- The psychotools package provides an infrastructure for psychometric modeling such as data classes (e.g., for paired comparisons) and basic model fitting functions (e.g., for Rasch and Bradley-Terry models).
- quickpsy is a package developed to quickly fit and plot psychometric functions for multiple conditions.
- A system for the management, assessment, and psychometric analysis of data from educational and psychological tests is implemented in dexter.
- Psychometric mixture models based on flexmix infrastructure are provided by means of the psychomix package (at the moment Rasch mixture models and Bradley-Terry mixture models).
- The equate package contains functions for non-IRT equating under both random groups and nonequivalent groups with anchor test designs. Mean, linear, equipercentile and circle-arc equating are supported, as are methods for univariate and bivariate presmoothing of score distributions. Specific equating methods currently supported include Tucker, Levine observed score, Levine true score, Braun/Holland, frequency estimation, and chained equating.
- The CopyDetect package contains several IRT and non-IRT based statistical indices proposed in the literature for detecting answer copying on multiple-choice examinations.
- An interactive shiny application for the analysis of educational tests and their items is provided by the ShinyItemAnalysis package.
- Coefficients for interrater reliability and agreement can be computed with the irr package.
- Psychophysical data can be analyzed with the psyphy package.
- Functions and example datasets for Fechnerian scaling of discrete object sets are provided by fechner. It computes Fechnerian distances among objects representing subjective dissimilarities, and other related information.
- The modelfree package provides functions for nonparametric estimation of a psychometric function and for estimation of a derived threshold and slope, and their standard deviations and confidence intervals.
- Confidence intervals for standardized effect sizes are provided by the MBESS package.
- The mediation package allows both parametric and nonparametric causal mediation analysis. It also allows researchers to conduct sensitivity analysis for certain parametric models.
- Mediation analysis using natural effect models can be performed using medflex.
- Functions for data screening, testing moderation, mediation, and estimating power are contained in the QuantPsyc package.
- The package multiplex is especially designed for social networks with relations at different levels. In this sense, the program has effective ways to treat multiple networks data sets with routines that combine algebraic structures like the partially ordered semigroup with the existing relational bundles found in multiple networks. An algebraic approach for two-mode networks is made through Galois derivations between families of the pair of subsets.
- The qgraph package can be used to visualize data as networks.
- Social Relations Analyses for round robin designs are implemented in the TripleR package. It implements all functionality of the SOREMO software, and provides new functions like the handling of missing values, significance tests for single groups, or the calculation of the self enhancement index.
- Fitting and testing multinomial processing tree models, a class of statistical models for categorical data with latent parameters, can be performed using the mpt package. These parameters are the link probabilities of a tree-like graph and represent the cognitive processing steps executed to arrive at observable response categories. The MPTinR package provides a user-friendly way for analysis of multinomial processing tree (MPT) models.
- Beta regression for modeling beta-distributed dependent variables, e.g., rates and proportions, is available in betareg.
- The cocor package provides functions to compare two correlations based on either dependent or independent groups.
- The profileR package provides a set of tools that implement profile analysis and cross-validation techniques.
- The TestScorer package provides a GUI for entering test items and obtaining raw and transformed scores. The results are shown on the console and can be saved to a tabular text file for further statistical analysis. The user can define his own tests and scoring procedures through a GUI.
- wCorr calculates Pearson, Spearman, tetrachoric, polychoric, and polyserial correlation coefficients, in weighted or unweighted form.
- The gtheory package fits univariate and multivariate generalizability theory (G-theory) models.
- The GDINA package estimates various cognitive diagnosis models (CDMs) within the generalized deterministic inputs, noisy and gate (G-DINA) model and the sequential G-DINA model framework. It can also be used to conduct Q-matrix validation, item and model fit statistics, model comparison at the test and item level and differential item functioning. A graphical user interface is also provided.
- TestDataImputation imputes missing item responses in test and assessment data.
- lba performs latent budget analysis for compositional data (two-way contingency table with an exploratory variable and a response variable).
- fuzzyreg implements multiple methods to fit fuzzy linear regression. Models using fuzzy set theory are suitable for analysis of trait data in situations when the model is indefinite, relationships between model variables are vague, sample size is low or measurements are hierarchically structured.