Replication files for "Integrating biodiversity: A longitudinal and cross-sectoral analysis of Swiss politics"

Introduction

The ZIP file contains all data and code to replicate the analyses reported in the following paper.

Reber, U., Fischer, M., Ingold, K., Kienast, F., Hersperger, A. M., Grütter, R., & Benz, R. (2022). Integrating biodiversity: A longitudinal and cross-sectoral analysis of Swiss politics. Policy Sciences. https://doi.org/10.1007/s11077-022-09456-4

If you use any of the material included in this repository, please cite the paper. If you use the text corpus or parts of it, please also cite the sources used for its compilation, which are listed below. The content of the texts may not be changed.

Data folder

The data folder contains the following files.

  • corpus.parquet: Text corpus of Swiss policy documents
  • dict_de.csv: Biodiversity dictionary (German)
  • dict_fr.csv: Biodiversity dictionary (French)
  • dict_it.csv: Biodiversity dictionary (Italian)
  • topic_labels.csv: Labels/codes for policy sectors
  • topics.csv: Labels/codes for policy sectors

The corpus and the dictionary were compiled by the authors specifically for this project. The labels/codes for policy sectors are based on the coding scheme of the Swiss Parliament.

Text corpus

The text corpus consists of 439,984 Swiss policy documents in German, French, and Italian from 1999 to 2018. The corpus was compiled from the following sources between 2020-10-01 and 2021-01-31.

  • Transcripts and parliamentary businesses (e.g. questions, motions, parliamentary initiatives) via the Web Services (WS) provided by the Swiss Parliament
  • The official compilation of federal legislation ("Amtliche Sammlung", AS) via opendata.swiss provided by the Swiss Federal Archives (SFA)
  • The federal gazette ("Bundesblatt") via fedlex.admin.ch
  • Decisions of federal courts via entscheidsuche.ch (ES)

The corpus is stored as a single data frame for use with R, saved as a Parquet file (corpus.parquet). The data frame has the following structure.

  • text_id: Unique identifier for each text (source information as prefix, e.g. "t_")
  • doc_type: Document type (see coding scheme below)
  • branch: Government branch (1 legislative, 2 executive, 3 judicial)
  • stage: Stage of policy process (1 drafting, 2 introduction, 3 interpretation)
  • year: Year of publication
  • topic: Policy sector (coding scheme in separate file in data folder)
  • lang: Language (de, fr, it)
  • text: Text
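
The numeric codes for branch and stage can be turned into readable labels with a small lookup. The following Python sketch illustrates the idea on a single row represented as a dictionary. It assumes the codes are stored as integers; the function name label_record and the example row are hypothetical and not part of the replication code, which is written in R.

```python
# Labels follow the variable descriptions above.
# Assumption: branch and stage are stored as integer codes.
BRANCH_LABELS = {1: "legislative", 2: "executive", 3: "judicial"}
STAGE_LABELS = {1: "drafting", 2: "introduction", 3: "interpretation"}


def label_record(record):
    """Return a copy of one corpus row with readable branch and stage labels.

    Unknown or missing codes map to None instead of raising an error.
    """
    labelled = dict(record)
    labelled["branch_label"] = BRANCH_LABELS.get(record.get("branch"))
    labelled["stage_label"] = STAGE_LABELS.get(record.get("stage"))
    return labelled


# Hypothetical row for illustration only; it is not taken from the corpus.
example = {"text_id": "t_0", "branch": 1, "stage": 1, "lang": "de", "text": "..."}
print(label_record(example)["branch_label"])
```

The original row is left unchanged, so the labelled copy can be used for inspection without altering the data.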

The following list contains the coding scheme for the doc_type variable.

  • 101: Federal gazette // Draft for public consultation ("Vernehmlassungsverfahren")
  • 102: Federal gazette // Explanation of draft for parliament ("Botschaft")
  • 103: Federal gazette // Strategy, action plan
  • 104: Federal gazette // Federal council decree ("Bundesratsbeschluss")
  • 105: Federal gazette // (Simple) Federal decree ("(Einfacher) Bundesbeschluss")
  • 106: Federal gazette // General decree ("Allgemeinverfügung")
  • 107: Federal gazette // Treaty ("Übereinkommen")
  • 108: Federal gazette // Treaty ("Abkommen")
  • 109: Federal gazette // Draft for parliament ("Entwurf")
  • 110: Federal gazette // Report ("Bericht")
  • 111: Federal gazette // Report of parliamentary commission ("Bericht")
  • 112: Federal gazette // Report of federal council ("Bericht")
  • 201: Parl. businesses // Submitted text
  • 202: Parl. businesses // Reason text
  • 203: Parl. businesses // Federal council response
  • 204: Parl. businesses // Initial situation
  • 205: Parl. businesses // Proceedings
  • 301: Parl. transcripts // Speech of MP
  • 302: Parl. transcripts // Speech of federal council
  • 401: Federal legislation // Legal text of the official compilation (laws, ordinances, etc.)
  • 501: Court decisions // Federal Supreme Court
  • 502: Court decisions // Federal Criminal Court
  • 503: Court decisions // Federal Administrative Court
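
In the list above, the first digit of each doc_type code identifies the source group (1 federal gazette, 2 parliamentary businesses, 3 parliamentary transcripts, 4 federal legislation, 5 court decisions). The following Python sketch uses this pattern to recover the source group from a code. The function name doc_type_source is hypothetical and not part of the replication code.

```python
# Source groups as they appear before the "//" in the coding scheme above.
SOURCE_GROUPS = {
    1: "Federal gazette",
    2: "Parl. businesses",
    3: "Parl. transcripts",
    4: "Federal legislation",
    5: "Court decisions",
}


def doc_type_source(doc_type):
    """Return the source group for a three-digit doc_type code, or None.

    Only the leading digit is inspected, so a code that is not in the
    coding scheme (e.g. 199) still maps to its group.
    """
    code = int(doc_type)
    if not 100 <= code <= 999:
        return None
    return SOURCE_GROUPS.get(code // 100)


print(doc_type_source(203))
```

Grouping by the leading digit is convenient for quick summaries by source; for exact document types, a full lookup of the codes listed above would be needed.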

Code folder

The code folder contains all R code for the analyses. The files are numbered in the order in which they are meant to be run.

  • 1_classifier_training.R: Training of classifiers for classification of policy sectors
  • 2_classifier_application.R: Classification of documents in corpus
  • 3_dictionary_application.R: Biodiversity indexing of documents in corpus
  • 4_stm_truncation.R: Truncation of indexed documents to keep only relevant parts
  • 5_stm_translation.R: Translation of FR and IT documents to DE
  • 6_stm_model.R: Preprocessing and structural topic model
  • 7_plots.R: Plots and numbers as included in the paper

The code/functions folder contains custom functions used in the scripts, e.g. to support topic model interpretation.

Package versions and setup details are noted in the code files.

Contact

Please direct any questions to Ueli Reber (ueli.reber@eawag.ch).

Funding Information:

This work was supported by:
  • BGB Initiative (BGB 2020)

Citation:

Reber, Ueli (2022). Replication files for "Integrating biodiversity: A longitudinal and cross-sectoral analysis of Swiss politics". doi:10.16904/envidat.302.

Metadata

  • DOI: 10.16904/envidat.302
  • Publication state: Published
  • Author: Ueli Reber (Eawag), ueli.reber@eawag.ch, ORCID 0000-0001-8036-4493; DataCRediT: Collection, Validation, Curation, Software, Publication
  • Contact person: Ueli Reber (Eawag), ueli.reber@eawag.ch, ORCID 0000-0001-8036-4493
  • Publication year: 2022
  • Dates: Collected, 2020-10-01 to 2021-01-31
  • Version: 1.0
  • Type: dataset
  • General type: Dataset
  • Language: English
  • Location: Switzerland
  • Content license: Creative Commons Attribution Share-Alike (CC-BY-SA) [Open Data]
  • Last updated: April 5, 2022, 14:04 (UTC)
  • Created: February 28, 2022, 10:11 (UTC)