Data coding and harmonization: How DataCoH and Charmstats are transforming social science data
Kristi Winters
Building: Law Building
Room: Breakout 3 - Law Building, Room 104
Date: 2012-07-10 11:00 AM – 12:30 PM
Last modified: 2012-04-03
Abstract
Comparative social researchers are often confronted with the challenge of making key theoretical concepts comparable across nations and/or time. One example is the socio-demographic variable ‘Education’. To operationalize ‘education’, researchers must review multiple educational systems across nations and/or changing educational structures within one nation across time. Further, researchers can recode education into a harmonized variable in several ways, including (inter alia): the Hoffmeyer-Zlotnik/Warner matrix; the CASMIN education scheme; the International Standard Classification of Education; or a harmonized variable provided by the dataset itself.
GESIS is developing two electronic resources to assist social researchers. The website DataCoH (Data Coding and Harmonization) will provide a centralized online library of data coding and harmonization for existing variables, with the aim of increasing transparency and making variables easier to replicate. DataCoH will initially contain socio-demographic variables used across the social sciences and will then expand to discipline-specific variables. The software program Charmstats (Coding and Harmonizing Statistics) will provide a structured approach to data harmonization by allowing researchers to: 1) download harmonization protocols; 2) document variable coding and harmonization processes; 3) access variables from existing datasets for harmonization; and 4) create harmonization protocols for publication and citation. This paper explains DataCoH and Charmstats and demonstrates how they work.