Browsing by Author "Pemstein, Daniel"
Now showing 1 - 10 of 10
Item Democracy Promotion and Electoral Quality: A Disaggregated Analysis (2020-09)
Steele, Carie A.; Pemstein, Daniel; Meserve, Stephen A.; V-Dem Institute
The international community spends significant sums of money on democracy promotion and support, focusing especially on producing competitive and transparent electoral environments in the developing world. In theory, this aid empowers a variety of actors, increasing competition and government responsiveness. Most efforts to determine the effect of democracy aid, however, have focused on how aggregate measures of democracy, and broad concepts, such as electoral competition or media freedom, respond to aid efforts. We argue that to fully understand the effect of aid on democratization one must consider how democracy aid affects specific country institutions. Building on theory from the democratization and democracy promotion literature, we specify more precise causal linkages between democracy assistance and electoral quality. Specifically, we hypothesize about the effects of democracy aid on the implementation and quality of elections. We test these hypotheses using V-Dem's detailed elections measures, using Finkel, Perez-Linan & Seligson's (2007) data and modeling strategy, to examine the impact of democracy aid. Intriguingly, we find that there is no consistent relationship between democracy and governance aid and the improvement of specific micro-level indicators of democratic institutions or election quality, but that aggregate measures still capture a relationship between aid and democracy.
We then investigate the possibility that empirical relationships between aid and democracy may not reflect aid-induced democratization, but may instead reflect investments countries make in regimes they observe democratizing.

Item Estimating Latent Traits from Expert Surveys: An Analysis of Sensitivity to Data Generating Process (2018)
Marquardt, Kyle L.; Pemstein, Daniel; V-Dem Institute
Models for converting expert-coded data to point estimates of latent concepts assume different data-generating processes. In this paper, we simulate ecologically-valid data according to different assumptions, and examine the degree to which common methods for aggregating expert-coded data can recover true values and construct appropriate coverage intervals from these data. We find that hierarchical latent variable models and the bootstrapped mean perform similarly when variation in reliability and scale perception is low; latent variable techniques outperform the mean when variation is high. Hierarchical A-M and IRT models generally perform similarly, though IRT models are often more likely to include true values within their coverage intervals. The median and non-hierarchical latent variable modeling techniques perform poorly under most assumed data generating processes.

Item Evaluating and Improving Item Response Theory Models for Cross-National Expert Surveys (University of Gothenburg, 2015)
Pemstein, Daniel; Tzelgov, Eitan; Wang, Yi-ting; V-Dem Institute
The data produced by the Varieties of Democracy (V-Dem) project contain ordinal ratings of a multitude of country-level indicators across space and time, with multiple experts providing judgments for each country-year observation. We use an ordinal item response theory (O-IRT) model to aggregate multiple experts' ratings. The V-Dem data provide a challenging domain for such models because they exhibit little cross-national bridging.
That is, few coders provide ratings for multiple countries, making it difficult to calibrate the scales of estimates cross-nationally. In this paper, we provide a systematic analysis of the issue of bridging. We first use simulations to explore how much bridging one needs to achieve scale identification when coders' thresholds vary across countries and when the latent traits of some countries lack variation. We then examine how posterior predictive checks can be used to detect the extent of scale non-comparability. Finally, we develop and evaluate search algorithms designed to select bridges that are most likely to allow one to correct scale incompatibility problems.

Item Experts, Coders, and Crowds: An analysis of substitutability (2017)
Marquardt, Kyle L.; Pemstein, Daniel; Sanhueza Petrarca, Constanza; Seim, Brigitte; Wilson, Steven Lloyd; Bernhard, Michael; Coppedge, Michael; Lindberg, Staffan I.; V-Dem Institute
Recent work suggests that crowd workers can replace experts and trained coders in common coding tasks. However, while many political science applications require coders to both find relevant information and provide judgment, current studies focus on a limited domain in which experts provide text for crowd workers to code. To address potential over-generalization, we introduce a typology of data producing actors - experts, coders, and crowds - and hypothesize factors which affect crowd-expert substitutability. We use this typology to guide a comparison of data from crowdsourced and expert surveys. Our results provide sharp scope conditions for the substitutability of crowd workers: when coding tasks require contextual and conceptual knowledge, crowds produce substantively different data from coders and experts.
We also find that crowd workers can cost more than experts in the context of cross-national panels, and that one purported advantage of crowdsourcing - replicability - is undercut by an insufficient number of crowd workers.

Item IRT models for expert-coded panel data (2017)
Marquardt, Kyle L.; Pemstein, Daniel; V-Dem Institute
Data sets quantifying phenomena of social-scientific interest often use multiple experts to code latent concepts. While it remains standard practice to report the average score across experts, experts likely vary in both their expertise and their interpretation of question scales. As a result, the mean may be an inaccurate statistic. Item-response theory (IRT) models provide an intuitive method for taking these forms of expert disagreement into account when aggregating ordinal ratings produced by experts, but they have rarely been applied to cross-national expert-coded panel data. In this article, we investigate the utility of IRT models for aggregating expert-coded data by comparing the performance of various IRT models to the standard practice of reporting average expert codes, using both real and simulated data. Specifically, we use expert-coded cross-national panel data from the V–Dem data set to both conduct real-data comparisons and inform ecologically-motivated simulation studies. We find that IRT approaches outperform simple averages when experts vary in reliability and exhibit differential item functioning (DIF). IRT models are also generally robust even in the absence of simulated DIF or varying expert reliability.
Our findings suggest that producers of cross-national data sets should adopt IRT techniques to aggregate expert-coded data of latent concepts.

Item Strategies of Validation: Assessing the Varieties of Democracy Corruption Data (2016)
McMann, Kelly; Pemstein, Daniel; Seim, Brigitte; Teorell, Jan; Lindberg, Staffan I.; V-Dem Institute
Social scientists face the challenge of determining whether their data are valid, yet they lack practical guidance about how to do so. Existing publications on data validation provide mostly abstract information for creating one's own dataset or establishing that an existing one is adequate. Further, they tend to pit validation techniques against each other, rather than explain how to combine multiple approaches. By contrast, this paper provides a practical guide to data validation in which tools are used in a complementary fashion to identify the strengths and weaknesses of a dataset and thus reveal how it can most effectively be used. We advocate for three approaches, each incorporating multiple tools: 1) assessing content validity through an examination of the resonance, domain, differentiation, fecundity, and consistency of the measure; 2) evaluating data generation validity through an investigation of dataset management structure, data sources, coding procedures, aggregation methods, and geographic and temporal coverage; and 3) assessing convergent validity using case studies and empirical comparisons among coders and among measures. We apply our method to corruption measures from a new dataset, Varieties of Democracy. We show that the data are generally valid and we emphasize that a particular strength of the dataset is its capacity for analysis across countries and over time.
These corruption measures represent a significant contribution to the field because, although research questions have focused on geographic differences and temporal trends, other corruption datasets have not been designed for this type of analysis.

Item The V–Dem Measurement Model: Latent Variable Analysis for Cross-National and Cross-Temporal Expert-Coded Data (2022-03)
Pemstein, Daniel; Marquardt, Kyle L.; Tzelgov, Eitan; Wang, Yi-ting; Medzihorsky, Juraj; Krusell, Joshua; Miri, Farhad; Römer, Johannes von; V-Dem Institute
The Varieties of Democracy (V–Dem) project relies on country experts who code a host of ordinal variables, providing subjective ratings of latent—that is, not directly observable—regime characteristics over time. Sets of around five experts rate each case (country-year observation), and each of these raters works independently. Since raters may diverge in their coding because of either differences of opinion or mistakes, we require systematic tools with which to model these patterns of disagreement. These tools allow us to aggregate ratings into point estimates of latent concepts and quantify our uncertainty around these point estimates. In this paper we describe item response theory models that account and adjust for differential item functioning (i.e. differences in how experts apply ordinal scales to cases) and variation in rater reliability (i.e. random error). We also discuss key challenges specific to applying item response theory to expert-coded cross-national panel data, explain the approaches that we use to address these challenges, highlight potential problems with our current framework, and describe long-term plans for improving our models and estimates.
Finally, we provide an overview of the different forms in which we present model output.

Item The V-Dem Measurement Model: Latent Variable Analysis for Cross-National and Cross-Temporal Expert-Coded Data (2015)
Pemstein, Daniel; Marquardt, Kyle L.; Tzelgov, Eitan; Wang, Yi-ting; Miri, Farhad; V-Dem Institute
The Varieties of Democracy (V–Dem) project relies on country experts who code a host of ordinal variables, providing subjective ratings of latent—that is, not directly observable—regime characteristics over time. Sets of around five experts rate each case (country-year observation), and each of these raters works independently. Since raters may diverge in their coding because of either differences of opinion or mistakes, we require systematic tools with which to model these patterns of disagreement. These tools allow us to aggregate ratings into point estimates of latent concepts and quantify our uncertainty around these point estimates. In this paper we describe item response theory models that account and adjust for differential item functioning (i.e. differences in how experts apply ordinal scales to cases) and variation in rater reliability (i.e. random error). We also discuss key challenges specific to applying item response theory to expert-coded cross-national panel data, explain the approaches that we use to address these challenges, highlight potential problems with our current framework, and describe long-term plans for improving our models and estimates.
Finally, we provide an overview of the different forms in which we present model output.

Item The V-Dem Measurement Model: Latent Variable Analysis for Cross-National and Cross-Temporal Expert-Coded Data (2020-03)
Pemstein, Daniel; Marquardt, Kyle L.; Tzelgov, Eitan; Wang, Yi-ting; Medzihorsky, Juraj; Krusell, Joshua; Miri, Farhad; von Römer, Johannes; V-Dem Institute
The Varieties of Democracy (V-Dem) project relies on country experts who code a host of ordinal variables, providing subjective ratings of latent—that is, not directly observable—regime characteristics over time. Sets of around five experts rate each case (country-year observation), and each of these raters works independently. Since raters may diverge in their coding because of either differences of opinion or mistakes, we require systematic tools with which to model these patterns of disagreement. These tools allow us to aggregate ratings into point estimates of latent concepts and quantify our uncertainty around these point estimates. In this paper we describe item response theory models that account and adjust for differential item functioning (i.e. differences in how experts apply ordinal scales to cases) and variation in rater reliability (i.e. random error). We also discuss key challenges specific to applying item response theory to expert-coded cross-national panel data, explain the approaches that we use to address these challenges, highlight potential problems with our current framework, and describe long-term plans for improving our models and estimates. Finally, we provide an overview of the different forms in which we present model output.

Item What Makes Experts Reliable? (2018)
Marquardt, Kyle L.; Pemstein, Daniel; Seim, Brigitte; Wang, Yi-ting; V-Dem Institute
Many datasets use experts to code latent quantities of interest.
However, scholars have not explored either the factors affecting expert reliability or the degree to which these factors influence estimates of latent concepts. Here we systematically analyze potential correlates of expert reliability using six randomly selected variables from a cross-national panel dataset, V-Dem v8. The V-Dem project includes a diverse group of over 3,000 experts and uses an IRT model to incorporate variation in both expert reliability and scale perception into its data aggregation process. In the process, the IRT model produces an estimate of expert reliability, which affects the relative contribution of an expert to the model. We examine a variety of factors that could correlate with reliability, and find little evidence of theoretically-untenable bias due to expert characteristics. On the other hand, there is evidence that attentive and confident experts who have a basic contextual knowledge of the concept of democracy are more reliable.
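Several of the abstracts above contrast the bootstrapped mean of expert codes with latent-variable (IRT) aggregation. As a minimal sketch of the simpler baseline only, the code below computes a point estimate and a percentile interval for one country-year's expert ratings; the function name and the ratings are illustrative assumptions, not drawn from the V-Dem codebase, and a full IRT model would additionally adjust for rater reliability and differential item functioning.

```python
import random
import statistics

def bootstrap_mean(ratings, n_boot=2000, alpha=0.05, seed=0):
    """Bootstrapped mean of expert ratings for a single case (country-year).

    Returns the sample mean and a (1 - alpha) percentile interval over
    n_boot resampled means. Illustrative sketch, not the V-Dem pipeline.
    """
    rng = random.Random(seed)
    point = statistics.mean(ratings)
    # Resample the expert codes with replacement and record each mean.
    boot = sorted(
        statistics.mean(rng.choices(ratings, k=len(ratings)))
        for _ in range(n_boot)
    )
    lo = boot[int((alpha / 2) * n_boot)]
    hi = boot[int((1 - alpha / 2) * n_boot) - 1]
    return point, (lo, hi)

# Hypothetical ordinal codes from five experts for one country-year:
point, (lo, hi) = bootstrap_mean([2, 3, 3, 4, 2])
```

With only five raters the interval is wide, which is one reason the papers above find that latent-variable models outperform the mean when expert reliability and scale perception vary substantially.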