RC33 Eighth International Conference on Social Science Methodology

Reasons for differences in reliability of process-produced data - The case of educational achievement

Thomas Kruppe, Britta Matthes, Stefanie Unger

Date: 2012-07-12
The reliability of many process-produced data has been questioned because they originate from administrative sources, such as the notification process of the social security system or the internal procedures of various agencies. We illustrate the problem with information on educational achievement, using the process-produced research data provided by the Research Data Centre (FDZ) of the German Federal Employment Agency at the Institute for Employment Research. It is well known that the reliability of process-produced data depends on how important the recorded information is for administrative purposes. For instance, information on educational achievement that employers report in the notification process of the German social security system is not verified, and misreporting has no consequences for social security obligations or claims, neither for the employer nor for the employee. Comparing this information at the individual level with the same information originating from administrative procedures at the Federal Employment Agency, e.g. registering as unemployed or as a job seeker, shows that there can be significant differences, while it remains unclear which source is more reliable. Another example is that firms of different sizes notify with different reliability. In very small firms, the notification to the social security system is submitted by the firm's owner personally, who knows the educational achievement of the staff in great detail. We therefore expect the educational information to be more reliable there than in small or medium-sized firms, which often commission accounting firms to submit their notifications to the social security system. Since external tax advisors have little contact with the employees of their client firms, the educational information they report is likely to be less reliable.
In contrast, educational information originating from large firms should be more reliable, because such firms usually have a personnel department that can consult the personnel files when preparing the notification.
Using the new data set ALWA-ADIAB, we are now able to test the reliability of information on educational achievement directly: a survey (ALWA) collected the educational and employment histories of respondents in great detail, and process-produced data (ADIAB) for exactly the same individuals were merged with the survey data using one-to-one identifiers.
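The comparison described above could be sketched roughly as follows. This is a minimal illustration only, not the authors' procedure: the records are invented toy values rather than ALWA-ADIAB data, the education categories and the firm-size cut-offs (10 and 250 employees) are assumptions, and the function names are hypothetical. Agreement between the two sources does not by itself show which source is correct.

```python
from collections import defaultdict

# Toy linked records, invented for illustration (NOT ALWA-ADIAB data).
# Each tuple: (person_id, education in survey, education in admin data,
#              number of employees in the notifying firm)
linked = [
    (1, "vocational", "vocational", 3),
    (2, "university", "university", 4),
    (3, "vocational", "none", 25),
    (4, "university", "vocational", 60),
    (5, "vocational", "vocational", 40),
    (6, "university", "university", 800),
    (7, "vocational", "vocational", 1200),
    (8, "none", "vocational", 15),
]


def size_class(employees):
    """Assign a firm-size class; the cut-offs are assumptions."""
    if employees < 10:
        return "very small"
    if employees < 250:
        return "small/medium"
    return "large"


def agreement_by_size(records):
    """Share of records in which both sources agree, per firm-size class."""
    counts = defaultdict(lambda: [0, 0])  # class -> [matches, total]
    for _pid, survey_edu, admin_edu, employees in records:
        cls = size_class(employees)
        counts[cls][0] += survey_edu == admin_edu
        counts[cls][1] += 1
    return {cls: matches / total for cls, (matches, total) in counts.items()}


rates = agreement_by_size(linked)
for cls, rate in rates.items():
    print(f"{cls}: {rate:.2f}")
```

With real linked data, the same tabulation could be extended to compare the employer notifications with the information recorded in the Federal Employment Agency's own procedures.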