Web-based Documentation of Longitudinal Studies

Marcel Hebing, Jan Goebel, Jürgen Schupp

Date: 2012-07-12 01:30 PM – 03:00 PM
As more and more complex data sets become available for social and economic research, thorough documentation of the data structures takes on ever-increasing importance. In the case of the German Socio-Economic Panel (SOEP) study, the questionnaire itself was the main source of documentation for the cross-sectional files when the panel started in 1984. In the years that followed, data sets from different waves were combined to increase the study's longitudinal power. This was the origin of 'SOEPinfo v.1', a web-based application with two key functions: first, it presents a systematic and searchable overview of all SOEP questionnaires and data sets, and second, it helps users create individual research data sets through the variable basket function and the associated script generator.

The Internet has changed significantly in the 16 years since the first version of SOEPinfo was released, and so have the scientific world and the SOEP study itself. The SOEP has been expanded to include new data structures, new instruments, and, this year, a new 'innovation panel'. The SOEP has also become part of the Cross National Equivalent File (CNEF), which currently contains data from seven national household surveys. 'Web 2.0' facilitates user participation, standardization, and community-building, e.g., in social networks. Today, we find new approaches like the 'Semantic Web', and the IT industry is increasingly dominated by the consumer market. In the world of scientific and statistical data, new metadata standards have been introduced (e.g., SDMX and DDI) and the number of available data sets has grown exponentially, placing our metadata portal in a new and constantly changing context. In the future, it will therefore become increasingly important to provide metadata not only through isolated portals for individual studies, but also through systems that are technically capable of facilitating comparison across studies.

The time has come for us to evaluate our past work on 'SOEPinfo v.1', to explore the changes in the scientific community and trends in the IT world, and to address future requirements for a contemporary metadata portal. Therefore, we will focus on three questions: First, what are the requirements for a contemporary metadata portal in general, and particularly in the context of a longitudinal study? Second, what significance do standards like DDI have for the documentation of longitudinal metadata, especially in the context of cross-national comparison? Third, what trends in the fast-moving IT world should be taken into account when implementing a metadata portal today? In conclusion, we will outline a vision for our new metadata portal, 'SOEPinfo v.2'.