Strategies of Validation: Assessing the Varieties of Democracy Corruption Data
Social scientists face the challenge of determining whether their data are valid, yet they lack practical guidance about how to do so. Existing publications on data validation provide mostly abstract information for creating one’s own dataset or establishing that an existing one is adequate. Further, they tend to pit validation techniques against each other, rather than explain how to combine multiple approaches. By contrast, this paper provides a practical guide to data validation in which tools are used in a complementary fashion to identify the strengths and weaknesses of a dataset and thus reveal how it can most effectively be used. We advocate for three approaches, each incorporating multiple tools: 1) assessing content validity through an examination of the resonance, domain, differentiation, fecundity, and consistency of the measure; 2) evaluating data generation validity through an investigation of dataset management structure, data sources, coding procedures, aggregation methods, and geographic and temporal coverage; and 3) assessing convergent validity using case studies and empirical comparisons among coders and among measures. We apply our method to corruption measures from a new dataset, Varieties of Democracy. We show that the data are generally valid and we emphasize that a particular strength of the dataset is its capacity for analysis across countries and over time. These corruption measures represent a significant contribution to the field because, although research questions have focused on geographic differences and temporal trends, other corruption datasets have not been designed for this type of analysis.